stefan-it / turkish-bert

Turkish BERT/DistilBERT, ELECTRA and ConvBERT models

Training Duration #4

Closed sahinbatmaz closed 4 years ago

sahinbatmaz commented 4 years ago

Hi, how long does it take to train a BERT-base model with the following configuration: Cloud TPU v3-8, 4.4B words, a 32k vocabulary, and a sequence length of 512?

stefan-it commented 4 years ago

Hi @sahinbatmaz ,

For 2M steps (which are sufficient) it takes 7 days (this is consistent with other BERT models I've trained) :)
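
As a back-of-envelope check of those numbers, here is a minimal sketch that converts "2M steps in 7 days" into steps/s and examples/s. The batch size of 128 is an assumption (a common choice for BERT-base pre-training), not a value stated in this thread:

```python
# Back-of-envelope throughput check for the reported numbers:
# 2M pre-training steps completing in ~7 days on a TPU v3-8.
TRAIN_STEPS = 2_000_000
WALL_CLOCK_DAYS = 7
BATCH_SIZE = 128  # assumed, not stated in the thread

seconds = WALL_CLOCK_DAYS * 24 * 3600
steps_per_second = TRAIN_STEPS / seconds
examples_per_second = steps_per_second * BATCH_SIZE

print(f"{steps_per_second:.2f} steps/s")        # ~3.31 steps/s
print(f"{examples_per_second:.0f} examples/s")  # ~423 examples/s
```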

Preprocessing took 1-2 days (depending on your RAM and CPU core configuration).
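
The RAM/CPU dependence comes from the fact that TFRecord generation can be parallelised across corpus shards. Below is a minimal sketch of one common way to do this, shelling out to `create_pretraining_data.py` from google-research/bert; the shard paths, worker count, and flag values (`max_predictions_per_seq`, `dupe_factor`) are illustrative assumptions, not the exact settings used for this model:

```python
# Sketch: run one preprocessing process per corpus shard. More CPU cores
# allow more parallel workers, but each worker holds its shard in RAM,
# which is why both resources affect the 1-2 day figure.
import glob
import subprocess
from concurrent.futures import ProcessPoolExecutor

def preprocess_shard(shard_path: str) -> None:
    subprocess.run([
        "python3", "create_pretraining_data.py",
        "--input_file", shard_path,
        "--output_file", shard_path + ".tfrecord",
        "--vocab_file", "vocab.txt",         # the 32k vocabulary
        "--max_seq_length", "512",
        "--max_predictions_per_seq", "76",   # assumed (~512 * 0.15)
        "--masked_lm_prob", "0.15",
        "--dupe_factor", "5",                # assumed value
    ], check=True)

if __name__ == "__main__":
    shards = glob.glob("corpus/shard-*.txt")  # hypothetical shard layout
    with ProcessPoolExecutor(max_workers=8) as pool:
        list(pool.map(preprocess_shard, shards))
```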

sahinbatmaz commented 4 years ago

Thanks for the answer and the repository :)