TonyNemo / UBAR-MultiWOZ

AAAI 2021: "UBAR: Towards Fully End-to-End Task-Oriented Dialog System with GPT-2"

Parameters for the best tuned model "experiments/all_0729_sd11_lr0.0001_bs2_ga16/epoch43_trloss0.56_gpt2" #5

Open KristenZHANG opened 3 years ago

KristenZHANG commented 3 years ago

Hi,

I notice that, according to the naming code in the project, your best model ("experiments/all_0729_sd11_lr0.0001_bs2_ga16/epoch43_trloss0.56_gpt2") should have been trained with seed=11, lr=1e-4, batch_size=2, and gradient_accumulation_steps=16.
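For reference, here is a minimal sketch of the naming convention I am inferring from that directory name (the abbreviations sd/lr/bs/ga standing for seed, learning rate, batch size, and gradient accumulation steps are my assumption, not taken from the repo):

```python
def exp_dir(exp_no, seed, lr, batch_size, grad_accum):
    # Assumed mapping: sd = seed, lr = learning rate,
    # bs = batch size, ga = gradient accumulation steps.
    return f"experiments/{exp_no}_sd{seed}_lr{lr}_bs{batch_size}_ga{grad_accum}"

print(exp_dir("all_0729", 11, 0.0001, 2, 16))
# experiments/all_0729_sd11_lr0.0001_bs2_ga16
```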

I am trying to train the model using the command provided in the README: `python train.py -mode train -cfg gpt_path=distilgpt2 lr=1e-4 warmup_steps=2000 gradient_accumulation_steps=16 batch_size=2 epoch_num=60 exp_no=bestmodel`

However, I cannot reproduce the best tuned model: my run reaches trloss=0.59 at epoch 43, not 0.56. I am therefore wondering whether some parameters were set differently during training.

Thanks!