microsoft / MPNet

MPNet: Masked and Permuted Pre-training for Language Understanding https://arxiv.org/pdf/2004.09297.pdf
MIT License

Could you release the fine-tuning performance with fewer epochs on GLUE? #1

Closed: RyanHuangNLP closed this issue 4 years ago

RyanHuangNLP commented 4 years ago

Could you release the fine-tuning performance on GLUE with fewer epochs, e.g. 3 epochs, the same as BERT or RoBERTa?

StillKeepTry commented 4 years ago

Fine-tuning the final checkpoint with the RoBERTa hyper-parameters achieves:

| MNLI | QNLI | QQP  | RTE  | SST-2 | MRPC | CoLA | STS-B |
|:----:|:----:|:----:|:----:|:-----:|:----:|:----:|:-----:|
| 88.5 | 93.4 | 91.9 | 85.8 | 95.5  | 91.5 | 65.0 | 91.1  |
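
For reference, here is a minimal fine-tuning sketch along those lines. It uses the Hugging Face Transformers port of the checkpoint (`microsoft/mpnet-base`) rather than this repo's fairseq code, and the task (SST-2), learning rate, batch size, and epoch count are assumptions patterned on the RoBERTa GLUE search space (lr in 1e-5 to 3e-5, batch size 16/32, ~10 epochs), not the authors' exact recipe:

```python
# Assumption: illustrative GLUE fine-tuning of MPNet via the Hugging Face
# Transformers port, not the original fairseq recipe from this repository.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("microsoft/mpnet-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/mpnet-base", num_labels=2  # SST-2 is binary classification
)

# SST-2 chosen as the example task; swap the GLUE subset and text columns
# for other tasks (e.g. "mnli" uses premise/hypothesis pairs).
dataset = load_dataset("glue", "sst2")

def tokenize(batch):
    return tokenizer(batch["sentence"], truncation=True, max_length=128)

encoded = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="mpnet-sst2",
    learning_rate=1e-5,              # assumed, RoBERTa-style GLUE range
    per_device_train_batch_size=32,  # assumed, RoBERTa-style GLUE range
    num_train_epochs=10,             # assumed; RoBERTa uses ~10, BERT uses 3
    weight_decay=0.1,
    warmup_ratio=0.06,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
    tokenizer=tokenizer,  # enables dynamic padding per batch
)
trainer.train()
```

Passing the tokenizer to `Trainer` lets it pad each batch dynamically instead of padding every example to `max_length`, which is the usual choice for GLUE-scale fine-tuning.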