goodbai-nlp / AMRBART

Code for our paper "Graph Pre-training for AMR Parsing and Generation" (ACL 2022)
MIT License

Difference in hyper-parameters #4

Closed: PhMeier closed this issue 2 years ago

PhMeier commented 2 years ago

Hello,

thank you very much for your work and for providing the code!

While comparing the fine-tuning scripts with the hyper-parameters reported in your paper, I noticed some differences (e.g., in the learning rate, maximum sequence length, and early-stopping patience):

I assume the parameters in the scripts are the more recent ones?

goodbai-nlp commented 2 years ago

Hi, thanks for pointing this out. The learning rate is a typo, and we will update the paper accordingly. The scripts use a larger maximum sequence length and a larger early-stopping patience because these help the model learn the longer (>512-token) AMR sequences (even though they make up only ~0.03% of the corpus) and help select better checkpoints. We believe these settings make it easier for other researchers to reach comparable or better results than those reported in our paper, especially in a different experimental environment.
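
For readers mapping this discussion onto their own fine-tuning setup, here is a minimal sketch of where such hyper-parameters typically live in a HuggingFace-style configuration. This is not the repository's actual script, and every value below is an illustrative placeholder rather than the paper's or the scripts' setting:

```python
# A minimal sketch of where the discussed hyper-parameters would sit in a
# HuggingFace-style seq2seq fine-tuning setup. This is NOT AMRBART's actual
# script; all values are illustrative placeholders.
from transformers import Seq2SeqTrainingArguments, EarlyStoppingCallback

training_args = Seq2SeqTrainingArguments(
    output_dir="outputs/amrbart-finetune",  # hypothetical path
    learning_rate=1e-5,                     # placeholder; see the repo scripts
    predict_with_generate=True,
    generation_max_length=1024,             # a larger limit covers rare >512-token AMRs
    evaluation_strategy="epoch",            # renamed "eval_strategy" in newer transformers
    save_strategy="epoch",
    load_best_model_at_end=True,            # required for EarlyStoppingCallback
    metric_for_best_model="eval_smatch",    # hypothetical metric name
    greater_is_better=True,
)

# A larger patience keeps training running long enough to select a better model.
callbacks = [EarlyStoppingCallback(early_stopping_patience=10)]  # placeholder value
```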

PhMeier commented 2 years ago

Hi muyeby, thank you very much for explaining!