Closed szha closed 5 years ago
Here's a cleaned-up corpus of stories from Trinh et al.: https://console.cloud.google.com/storage/browser/commonsense-reasoning/reproduce/stories_corpus?pli=1 (paper: https://arxiv.org/abs/1806.02847)
Given how effective large, clean corpora have proven for deep transformer models such as BERT and GPT-2, this might be useful to others. Should we offer automatic download of this corpus in GluonNLP?
+1
Tracked in the above issue.