yxgnahz opened this issue 2 years ago
Same issue, did you find a way out?
I think that Xinyun has found the right configuration to reproduce the results. To debug it, please try the following
@airsplay Thanks for your reply. I decreased the learning rate from 1e-4 to 5e-5, and the results on MNLI are now correct.
@TobiasLee Thanks for checking it! Just for clarification, do you mean increasing 1e-4 to 5e-4 or decreasing 1e-4 to 5e-5?
Oops, I made a typo. The original LR used in the paper is 1e-4, and I actually decreased it to 5e-5 for stable results on MNLI.
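For anyone hitting the same problem, here is a minimal sketch of where the lowered learning rate (1e-4 -> 5e-5) plugs into a GLUE/MNLI fine-tuning run. This is not the repo's actual script: it assumes a HuggingFace Transformers + Datasets setup, and the model name, batch size, and sequence length below are placeholders you would swap for the released checkpoint and the paper's hyperparameters.

```python
# Minimal MNLI fine-tuning sketch illustrating the lowered learning rate.
# Assumes HuggingFace Transformers/Datasets; not the repo's own fine-tuning script.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # placeholder checkpoint
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=3)

raw = load_dataset("glue", "mnli")

def tokenize(batch):
    return tokenizer(batch["premise"], batch["hypothesis"],
                     truncation=True, max_length=128)  # placeholder max length

encoded = raw.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="mnli-finetune",
    num_train_epochs=3,              # same 3-epoch schedule discussed in this thread
    learning_rate=5e-5,              # lowered from the paper's 1e-4 for stability on MNLI
    per_device_train_batch_size=32,  # placeholder; match the repo's script if reproducing
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation_matched"],
    tokenizer=tokenizer,             # enables dynamic padding via the default collator
)
trainer.train()
```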
Hi, thanks for your interesting work. I ran into a problem when trying to fine-tune the model. I loaded the released pretrained BERT_base model and fine-tuned it on GLUE using the given fine-tuning scripts, but I got only 69.08 on QQP and 31.82 on MNLI. Therefore, I am wondering: (1) Is the GLUE performance reported in the paper exactly the performance after three epochs of fine-tuning, or did you pick the highest score during fine-tuning? (2) For the pretrained model, did you use the checkpoint from the last iteration, or did you pick one from during the pretraining process? Thanks in advance.