microsoft / MPNet

MPNet: Masked and Permuted Pre-training for Language Understanding https://arxiv.org/pdf/2004.09297.pdf
MIT License

Could you release the fine-tuning performance with fewer epochs on GLUE? #1

Closed: RyanHuangNLP closed this issue 4 years ago

RyanHuangNLP commented 4 years ago

Could you release the fine-tuning performance on GLUE with fewer epochs, e.g. 3 epochs, the same as BERT or RoBERTa?

StillKeepTry commented 4 years ago

Fine-tuning the final checkpoint with the RoBERTa hyper-parameters achieves:

| MNLI | QNLI | QQP  | RTE  | SST-2 | MRPC | CoLA | STS-B |
|:----:|:----:|:----:|:----:|:-----:|:----:|:----:|:-----:|
| 88.5 | 93.4 | 91.9 | 85.8 | 95.5  | 91.5 | 65.0 | 91.1  |
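
For reference, here is a minimal fine-tuning sketch along those lines. It uses the Hugging Face Transformers port of the checkpoint (`microsoft/mpnet-base`) rather than this repo's fairseq code, and the task (SST-2), learning rate, batch size, and epoch count are assumptions patterned on the RoBERTa GLUE search space (lr in 1e-5 to 3e-5, batch size 16/32, ~10 epochs), not the authors' exact recipe:

```python
# Assumption: illustrative GLUE fine-tuning of MPNet via the Hugging Face
# Transformers port, not the original fairseq recipe from this repository.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("microsoft/mpnet-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/mpnet-base", num_labels=2  # SST-2 is binary classification
)

# SST-2 chosen as the example task; swap the GLUE subset and text columns
# for other tasks (e.g. "mnli" uses premise/hypothesis pairs).
dataset = load_dataset("glue", "sst2")

def tokenize(batch):
    return tokenizer(batch["sentence"], truncation=True, max_length=128)

encoded = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="mpnet-sst2",
    learning_rate=1e-5,              # assumed, RoBERTa-style GLUE range
    per_device_train_batch_size=32,  # assumed, RoBERTa-style GLUE range
    num_train_epochs=10,             # assumed; RoBERTa uses ~10, BERT uses 3
    weight_decay=0.1,
    warmup_ratio=0.06,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
    tokenizer=tokenizer,  # enables dynamic padding per batch
)
trainer.train()
```

Passing the tokenizer to `Trainer` lets it pad each batch dynamically instead of padding every example to `max_length`, which is the usual choice for GLUE-scale fine-tuning.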