microsoft / CodeBERT

CodeBERT
MIT License

Hyper Parameter tuning? #78

Closed mosh98 closed 3 years ago

mosh98 commented 3 years ago

Hi, regarding fine-tuning CodeBERT:

What hyperparameters should I choose to experiment with?

Should I follow the usual hyperparameters suggested in the original BERT and RoBERTa papers?

Asking because the CodeBERT paper doesn't specify learning rate ranges.

Warm regards.

guoday commented 3 years ago

You can follow the hyperparameters in this repo for a fair comparison. CodeBERT usually uses a learning rate of 2e-5 or 5e-5 for fine-tuning.
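For illustration, a small grid over those two learning rates could be sketched like this. Only the learning rates come from this thread; the batch sizes and epoch counts are assumptions (typical RoBERTa-style fine-tuning values), so adjust them to your task:

```python
from itertools import product

# Learning rates suggested in this thread for fine-tuning CodeBERT.
learning_rates = [2e-5, 5e-5]
# Assumed values -- common choices for RoBERTa-style fine-tuning,
# not taken from the CodeBERT paper or this repo.
batch_sizes = [16, 32]
num_epochs = [3, 5]

# Enumerate candidate configurations for a small hyperparameter search.
configs = [
    {"learning_rate": lr, "batch_size": bs, "num_epochs": ep}
    for lr, bs, ep in product(learning_rates, batch_sizes, num_epochs)
]

for cfg in configs:
    print(cfg)  # each dict is one fine-tuning run to try
```

Each config would then be passed to your training loop (or a `Trainer`-style wrapper) and scored on a validation set.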

mosh98 commented 3 years ago

Alright, cheers.