openai / finetune-transformer-lm

Code and model for the paper "Improving Language Understanding by Generative Pre-Training"
https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
MIT License

Have you ever tried a bigger corpus? #38

Open ZizhenWang opened 5 years ago

ZizhenWang commented 5 years ago

I see that BERT uses both BookCorpus (800M words) and Wikipedia (2,500M words), while GPT only uses BookCorpus. Even though BERT has a more complex model structure, which may affect its representation ability, the difference in evaluation results may also come from the training corpus. Have you ever tried a bigger corpus, such as Wikipedia?

The comparison results could also reveal the influence of BERT's pre-training Task #2 (next sentence prediction).
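For anyone who wants to test this themselves, here is a minimal sketch of how one might concatenate BookCorpus-style text files with an already-extracted Wikipedia dump into a single pretraining text file, and report a rough word count for comparison with the numbers above. The paths, filenames, and helper functions are hypothetical placeholders, and this is not the repo's actual data pipeline.

```python
import os
import re

# Hypothetical local paths; adjust to wherever the corpora are stored.
BOOKCORPUS_DIR = "data/bookcorpus"          # plain-text files, one book per file
WIKI_TXT_PATH = "data/wiki_extracted.txt"   # output of a Wikipedia extractor, one article per line
OUTPUT_PATH = "data/combined_corpus.txt"


def iter_bookcorpus(directory):
    """Yield the raw text of each BookCorpus file."""
    for name in sorted(os.listdir(directory)):
        with open(os.path.join(directory, name), encoding="utf-8") as f:
            yield f.read()


def iter_wikipedia(path):
    """Yield one article per non-empty line from a pre-extracted Wikipedia text file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield line


def count_words(text):
    """Rough whitespace-based word count, used only for corpus-size bookkeeping."""
    return len(re.findall(r"\S+", text))


def build_combined_corpus():
    total_words = 0
    with open(OUTPUT_PATH, "w", encoding="utf-8") as out:
        for source in (iter_bookcorpus(BOOKCORPUS_DIR), iter_wikipedia(WIKI_TXT_PATH)):
            for doc in source:
                # One document per line in the combined file.
                out.write(doc.replace("\n", " ").strip() + "\n")
                total_words += count_words(doc)
    print("combined corpus size: ~%dM words" % (total_words // 1_000_000))


if __name__ == "__main__":
    build_combined_corpus()
```

The combined file could then be fed through whatever tokenization and encoding step the training code expects; the point is only to make the corpus-size comparison concrete.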

ZizhenWang commented 5 years ago

@xuwenshen sorry, I don't have this dataset.