WangJiexin closed 2 years ago
Hello, I am very interested in pre-training a language model from scratch, but I am not sure what ratio to use between the validpref and trainpref data in actual pre-training. For example, BERT's corpus is very large (Wikipedia + BookCorpus); what is a reasonable ratio between the training set and the validation set? Or does BERT use cross-validation?
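To make the question concrete, here is a minimal sketch of the kind of held-out split I have in mind. The function name `split_documents`, the 1% default for `valid_fraction`, and the fixed seed are my own assumptions for illustration; they are not values taken from BERT or from any fairseq default.

```python
import random


def split_documents(documents, valid_fraction=0.01, seed=0):
    """Shuffle documents deterministically and hold out a fraction for validation.

    valid_fraction is an assumed, illustrative value, not a recommended ratio.
    Returns a (train, valid) pair of lists.
    """
    if not 0.0 < valid_fraction < 1.0:
        raise ValueError("valid_fraction must be strictly between 0 and 1")

    shuffled = list(documents)
    # A private Random instance keeps the split reproducible without
    # touching the global random state.
    random.Random(seed).shuffle(shuffled)

    # Hold out at least one document so the validation set is never empty.
    n_valid = max(1, round(len(shuffled) * valid_fraction))
    return shuffled[n_valid:], shuffled[:n_valid]


if __name__ == "__main__":
    docs = [f"document {i}" for i in range(1000)]
    train, valid = split_documents(docs, valid_fraction=0.01)
    print(len(train), len(valid))
```

The two lists would then be written to separate text files and used as the trainpref and validpref inputs. The split is done per document rather than per line so that text from one document does not end up in both sets.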