Open yunjinchoidev opened 1 year ago
[yunjinchoidev]
program: train.py
method: bayes
metric:
  goal: minimize
  name: train_loss
parameters:
  learning_rate:
    max: 0.1
    min: 0.0001
    distribution: uniform
  model_name:
    values:
      - jhgan/ko-sbert-sts
      - sentence-transformers/xlm-r-large-en-ko-nli-ststb
    distribution: categorical
  batch_size:
    values: [8, 16, 32, 64]
    distribution: categorical

program: train.py
method: bayes
metric:
  goal: minimize
  name: train_loss
parameters:
  learning_rate:
    max: 0.1
    min: 0.0001
    distribution: uniform
  model_name:
    values:
      - klue/roberta-small
      - klue/roberta-large
      - ys7yoo/sentence-roberta-large-kor-sts
    distribution: categorical
  batch_size:
    values: [8, 16, 32, 64]
    distribution: categorical
[yunjinchoidev]
program: train.py
method: bayes
metric:
  goal: maximize
  name: val_pearson
parameters:
  learning_rate:
    max: 0.1
    min: 0.00001
    distribution: uniform
  model_name:
    values:
      - jhgan/ko-sbert-sts
      - sentence-transformers/xlm-r-large-en-ko-nli-ststb
    distribution: categorical
  batch_size:
    values: [8, 16, 32, 64]
    distribution: categorical

program: train.py
method: bayes
metric:
  goal: maximize
  name: val_pearson
parameters:
  learning_rate:
    max: 0.1
    min: 0.00001
    distribution: uniform
  model_name:
    values:
      - klue/roberta-small
      - klue/roberta-large
      - ys7yoo/sentence-roberta-large-kor-sts
    distribution: categorical
  batch_size:
    values: [8, 16, 32, 64]
    distribution: categorical
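For reference, here is a minimal sketch of how the second val_pearson config above could be written as a Python dict and launched from code instead of from a YAML file. `wandb.sweep` and `wandb.agent` are the W&B entry points for this; the project name `sts-sweep`, the `train` function, and the `count` value are placeholders assumed for illustration, not something taken from this repo.

```python
def build_sweep_config():
    """Python-dict equivalent of the val_pearson sweep config above."""
    return {
        "program": "train.py",
        "method": "bayes",
        "metric": {"goal": "maximize", "name": "val_pearson"},
        "parameters": {
            "learning_rate": {"distribution": "uniform", "min": 0.00001, "max": 0.1},
            "model_name": {
                "distribution": "categorical",
                "values": [
                    "klue/roberta-small",
                    "klue/roberta-large",
                    "ys7yoo/sentence-roberta-large-kor-sts",
                ],
            },
            "batch_size": {"distribution": "categorical", "values": [8, 16, 32, 64]},
        },
    }


def launch(train, project="sts-sweep", count=10):
    """Register the sweep and run `count` trials in this process.

    `train` is a hypothetical zero-argument function that calls wandb.init()
    and reads its hyperparameters from wandb.config. `project` and `count`
    are placeholder values.
    """
    import wandb  # imported lazily so the config can be inspected without wandb

    sweep_id = wandb.sweep(sweep=build_sweep_config(), project=project)
    wandb.agent(sweep_id, function=train, project=project, count=count)
    return sweep_id
```

Keeping the config in a function makes it easy to print or unit-test before any runs are started.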
[lectura7942] Some runs end up with val_pearson going to NaN, so I think we need to apply early stopping. For now, I set training to stop once val_pearson has failed to improve for 3 epochs.
import pytorch_lightning as pl
from pytorch_lightning.callbacks import EarlyStopping

early_stop_callback = EarlyStopping(monitor="val_pearson", min_delta=0.00, patience=3, verbose=False, mode="max")
# min_delta: a change <= min_delta is treated as no improvement
# patience: stop after this many checks without improvement
# mode="max": the monitored metric is expected to increase
# args and wandb_logger are defined elsewhere in the training script
trainer = pl.Trainer(accelerator='gpu', max_epochs=args.max_epoch, log_every_n_steps=1, logger=wandb_logger, callbacks=[early_stop_callback])
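To make the `patience` / `min_delta` / `mode="max"` semantics concrete, here is a small pure-Python sketch of the stopping rule. It only illustrates the idea and is not Lightning's actual implementation; the function name `stop_epoch` and the sample scores are made up. It also stops as soon as the metric becomes NaN, which is what Lightning's `EarlyStopping` does by default through `check_finite=True`.

```python
import math


def stop_epoch(scores, patience=3, min_delta=0.0):
    """Return the index of the epoch at which training would stop, or None.

    mode="max": an epoch counts as an improvement only if its score
    exceeds the best score so far by more than min_delta.
    """
    best = -math.inf
    waited = 0  # consecutive epochs without improvement
    for epoch, score in enumerate(scores):
        if not math.isfinite(score):
            return epoch  # NaN or inf: stop immediately
        if score > best + min_delta:
            best = score
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return None


# Hypothetical val_pearson values per epoch (0-indexed)
print(stop_epoch([0.70, 0.80, 0.79, 0.80, 0.78]))  # → 4
print(stop_epoch([0.70, float("nan"), 0.90]))      # → 1
```

In the first example the best score of 0.80 is reached at epoch 1, and the next three epochs do not beat it, so training stops at epoch 4.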
[yunjinchoidev]
The learning rate range was too wide, which produced too many NaN runs, and the sweep had too many cases to cover, so I changed the min/max range as follows.
learning_rate:
  max: 0.0001
  min: 0.000001
  distribution: uniform
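One thing to keep in mind with `distribution: uniform` over a range that spans two orders of magnitude: most draws land near the top of the range. The sketch below (standard library only; the sample size and seed are arbitrary choices) compares uniform sampling with log-uniform sampling on the new range. If the smaller learning rates should be explored more evenly, W&B's `log_uniform_values` distribution might be worth trying, though that is a suggestion and not something tested in this thread.

```python
import math
import random

LR_MIN, LR_MAX = 1e-6, 1e-4  # range from the updated config


def sample_uniform(rng):
    return rng.uniform(LR_MIN, LR_MAX)


def sample_log_uniform(rng):
    # Uniform in log space, then mapped back to a learning rate
    return math.exp(rng.uniform(math.log(LR_MIN), math.log(LR_MAX)))


def fraction_below(sampler, threshold, n=100_000, seed=0):
    """Estimate the share of samples that fall below `threshold`."""
    rng = random.Random(seed)
    return sum(sampler(rng) < threshold for _ in range(n)) / n


# Share of samples in the lower decade (below 1e-5)
print(f"uniform:     {fraction_below(sample_uniform, 1e-5):.3f}")
print(f"log-uniform: {fraction_below(sample_log_uniform, 1e-5):.3f}")
```

Analytically, the uniform share is 9/99, about 9%, while log-uniform puts 50% of the samples below 1e-5.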
[yunjinchoidev]
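Writing the sweep configuration
Explanation of the code below:
Running the command starts the sweep automatically.
I first tried it with 100 dataset samples. The results come out as shown below.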