Closed by guotong1988 4 years ago
Hi Tong,
Many people use Wikipedia as unlabeled pretraining data; a popular version is WikiText-103 from Salesforce. Alternatively, you can pretrain on a large supervised dataset such as MultiNLI or SocialIQA and then transfer to smaller datasets. Also, we've largely migrated to jiant (https://github.com/nyu-mll/jiant), and you can find more support there.
Best, Alex
On Wed, May 13, 2020, 04:15 Tong Guo notifications@github.com wrote:
Thank you very much.
Thank you very much.