Open Phil1108 opened 4 years ago
Hi @Phil1108 ,
I didn't use a specific sampling method, so all parts were sampled equally. But I think this could be interesting for future work, e.g. to see the effects on downstream tasks :)
@stefan-it Okay thanks. Then I'll give it a try and see how it performs in comparison to your models
Hi, did you sample each dataset (Wikipedia, Common Crawl, Subtitles etc.) equally during German BERT training? OpenAI uses unequal sampling, which may lead to better results, as stated in the GPT-3 paper.
If yes, which parameters did you use?
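
To make the question concrete, here is a minimal sketch of what I mean by equal vs. unequal sampling across corpora. The function name `sample_corpus_names` and all weight values are made up for illustration; they are not the weights from the GPT-3 paper and not anything used for the German BERT models.

```python
import random
from collections import Counter


def sample_corpus_names(weights, n_draws, seed=0):
    """Draw corpus names with probability proportional to their weight.

    `weights` maps a corpus name to a non-negative weight (hypothetical values).
    Each draw decides which corpus the next training document comes from.
    """
    rng = random.Random(seed)
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=n_draws)


# Equal sampling: every corpus is equally likely to be picked.
equal = {"wikipedia": 1.0, "common_crawl": 1.0, "subtitles": 1.0}

# Unequal sampling: illustrative weights only (assumption, not from any paper).
unequal = {"wikipedia": 3.0, "common_crawl": 1.0, "subtitles": 0.5}

equal_counts = Counter(sample_corpus_names(equal, 10_000))
unequal_counts = Counter(sample_corpus_names(unequal, 10_000))

print("equal:  ", dict(equal_counts))
print("unequal:", dict(unequal_counts))
```

With the unequal weights, Wikipedia documents would be seen far more often per pass than subtitles, regardless of how large each corpus is on disk.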