ConnorJL / GPT2

An implementation of training for GPT2 that supports TPUs
MIT License
1.42k stars 338 forks

GPT vs BERT: given the same compute and data resources, which one is better for downstream tasks like GLUE? #30

Open guotong1988 opened 3 years ago

guotong1988 commented 3 years ago

Thank you very much.