RyanHuangNLP closed 4 years ago
Could you release the fine-tuning performance on GLUE with fewer epochs, such as 3 epochs, the same as BERT or RoBERTa?
Fine-tuning the final checkpoint with the RoBERTa hyperparameters achieves: