google-research / bert

TensorFlow code and pre-trained models for BERT
https://arxiv.org/abs/1810.04805
Apache License 2.0

How many epochs do we need when pretraining BERT? #1058

Open melody2cmy opened 4 years ago

melody2cmy commented 4 years ago

I want to know how many epochs we need when pretraining BERT, but most articles about BERT only say how many steps are needed for pretraining. The paper 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding' gives an approximate number, 40 epochs, but that count is over words, not sentences, whereas when we pretrain, each sample is a sentence. Is the epoch count not important in NLP? In CV, the epoch count is important.
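For reference, the paper's "approximately 40 epochs" follows directly from the pretraining hyperparameters it reports: 1,000,000 steps at a batch size of 256 sequences of 512 tokens each, over a corpus of roughly 3.3 billion words. A minimal back-of-the-envelope sketch of that conversion, assuming those published figures (this is not code from this repository):

```python
# Convert pretraining steps into approximate epochs, using the numbers
# reported in the BERT paper (assumed here, not taken from this repo).
steps = 1_000_000          # total pretraining steps
batch_size = 256           # sequences per batch
seq_len = 512              # max tokens per sequence
corpus_tokens = 3.3e9      # ~3.3B words in BooksCorpus + English Wikipedia

tokens_per_step = batch_size * seq_len        # 128,000 tokens per step
tokens_seen = steps * tokens_per_step         # total tokens processed
epochs = tokens_seen / corpus_tokens          # passes over the corpus

print(f"tokens per step: {tokens_per_step:,}")
print(f"approximate epochs: {epochs:.1f}")    # ~38.8, i.e. roughly 40
```

So an "epoch" here is just steps times tokens per batch divided by corpus size; the step count and the epoch count describe the same training budget, and whether each sample is a sentence pair or a word sequence only changes how you count one pass over the data.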

maxjust commented 1 year ago

Did you find the answer?