richarddwang / electra_pytorch

Pretrain and finetune ELECTRA with fastai and huggingface. (Results of the paper replicated!)

Relative importance of different "tricks" in README #35

Closed vmurahari3 closed 2 years ago

vmurahari3 commented 2 years ago

Thank you for the brilliant repository! You list several tricks that are necessary to reproduce performance. In your experience, which of those tricks are critical for matching the released GLUE scores?

At a cursory glance, reordering the sentences to augment the dataset for STS tasks seems to be a critical detail. I was wondering if that also aligns with your experience running these models?

richarddwang commented 2 years ago

I didn't do an ablation study for each point, so I can't tell you anything for sure.

In my experience, the data augmentation used when fine-tuning on STS and MRPC often provides slightly better performance.
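For readers unfamiliar with the trick being discussed: for sentence-pair tasks like STS-B and MRPC, each pair (A, B) can be mirrored as (B, A) with the same label, doubling the fine-tuning data. Below is a minimal sketch of that idea, assuming the GLUE STS-B split from the HuggingFace `datasets` library; the column names follow the GLUE schema, and the snippet is illustrative rather than this repo's actual code.

```python
# Sketch: sentence-pair swapping augmentation for STS-B fine-tuning.
# Assumes the GLUE schema ("sentence1", "sentence2", "label") from HuggingFace datasets.
from datasets import load_dataset, concatenate_datasets

train = load_dataset("glue", "stsb", split="train")

def swap_sentences(example):
    # Mirror each pair: (B, A) keeps the same similarity label as (A, B).
    return {
        "sentence1": example["sentence2"],
        "sentence2": example["sentence1"],
        "label": example["label"],
    }

swapped = train.map(swap_sentences)

# Fine-tune on the original examples plus their swapped copies (2x the data).
augmented_train = concatenate_datasets([train, swapped]).shuffle(seed=42)
print(len(train), len(augmented_train))  # e.g. 5749 -> 11498
```

The same mirroring applies to MRPC, since paraphrase labels are also symmetric in the sentence order.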