brightmart / roberta_zh

Pre-trained RoBERTa model for Chinese (RoBERTa for Chinese)
2.63k stars 409 forks

GPT vs. BERT: given the same computation and data resources, which one is better for downstream tasks like GLUE? #78

Open guotong1988 opened 4 years ago

guotong1988 commented 4 years ago

Thank you very much.