The ratio between train set (trainpref) and valid set (validpref) in pretraining

facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

MIT License

The ratio between train set (trainpref) and valid set (validpref) in pretraining #3936

Closed WangJiexin closed 2 years ago

WangJiexin commented 2 years ago

Hello, I am very interested in pre-training a language model from scratch, but I am not sure what ratio between validpref and trainpref is used in actual pretraining. For example, for BERT the corpus is very large (wiki+bookcorpus); what is a reasonable ratio between the train set and the valid set? Or does BERT use cross-validation?
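To make the question concrete, here is a minimal sketch of the kind of held-out split being asked about. It assumes a corpus with one document per line; the helper name `split_corpus` and the `valid_fraction` default are hypothetical and chosen only for illustration, not a ratio recommended by fairseq or used by BERT.

```python
import random


def split_corpus(lines, valid_fraction=0.01, seed=0):
    """Shuffle lines and hold out a fraction of them as a validation set.

    valid_fraction is an illustrative assumption, not a recommended value.
    Returns (train_lines, valid_lines).
    """
    if not 0.0 < valid_fraction < 1.0:
        raise ValueError("valid_fraction must be strictly between 0 and 1")
    shuffled = list(lines)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    # Hold out at least one line so the validation set is never empty.
    n_valid = max(1, round(len(shuffled) * valid_fraction))
    return shuffled[n_valid:], shuffled[:n_valid]


if __name__ == "__main__":
    corpus = [f"document {i}" for i in range(1000)]
    train, valid = split_corpus(corpus, valid_fraction=0.01)
    print(len(train), len(valid))
```

The two lists could then be written to separate text files whose prefixes are passed to `fairseq-preprocess` as `--trainpref` and `--validpref`. This is a single fixed held-out split rather than cross-validation.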

stale[bot] commented 2 years ago

This issue has been automatically marked as stale. If this issue is still affecting you, please leave any comment (for example, "bump"), and we'll keep it open. We are sorry that we haven't been able to prioritize it yet. If you have any new additional information, please include it with your comment!

stale[bot] commented 2 years ago

Closing this issue after a prolonged period of inactivity. If this issue is still present in the latest release, please create a new issue with up-to-date information. Thank you!