ymcui / Chinese-BERT-wwm

Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm series of models)
https://ieeexplore.ieee.org/document/9599397
Apache License 2.0

About the learning rate of RoBERTa in the fine-tuning stage? #75

Closed · zhengwsh closed this issue 4 years ago

zhengwsh commented 4 years ago

Can you provide the best learning rates for the different tasks with RoBERTa? I cannot find them in the technical report.

ymcui commented 4 years ago

Please check the table here: https://github.com/ymcui/Chinese-BERT-wwm/blob/master/README_EN.md#baselines
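For readers landing here, a minimal sketch of where that per-task learning rate gets plugged in when fine-tuning with Hugging Face Transformers. This is not code from the repo: the checkpoint id `hfl/chinese-roberta-wwm-ext` and the `2e-5` value are illustrative assumptions; substitute the rate listed for your task in the linked baselines table. Note that the Chinese RoBERTa-wwm checkpoints ship BERT-style weights, so they are loaded through the `Bert*` classes.

```python
# Minimal fine-tuning sketch (assumptions: checkpoint id, lr value, toy batch).
import torch
from transformers import BertTokenizer, BertForSequenceClassification

# RoBERTa-wwm-ext is loaded via the Bert* classes, not Roberta*.
MODEL = "hfl/chinese-roberta-wwm-ext"  # assumed checkpoint id
tokenizer = BertTokenizer.from_pretrained(MODEL)
model = BertForSequenceClassification.from_pretrained(MODEL, num_labels=2)

# Plug in the task-specific learning rate from the baselines table;
# 2e-5 here is only a placeholder.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# One toy training step to show the wiring end to end.
batch = tokenizer(["这是一个例子"], return_tensors="pt")  # "This is an example"
labels = torch.tensor([0])

model.train()
loss = model(**batch, labels=labels).loss
loss.backward()
optimizer.step()
```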