Closed sahinbatmaz closed 4 years ago
Hi, how long does it take to train a BERT base model with this configuration: Cloud TPU v3-8, 4.4B words, 32K vocabulary size, and 512 sequence length?
Hi @sahinbatmaz,
Training for 2M steps (which is sufficient) takes 7 days; this is consistent with other BERT models I've trained. :)
Preprocessing took 1-2 days, depending on your RAM and CPU core configuration.
Thanks for the answer and the repository :)
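For anyone budgeting a similar run, the figures in this thread can be turned into a quick back-of-the-envelope sketch (assumptions: step throughput on a Cloud TPU v3-8 is roughly constant, and the `estimated_days` helper below is hypothetical, not part of any repository here):

```python
def estimated_days(total_steps: int, steps_per_day: float) -> float:
    """Estimate wall-clock days to run `total_steps` at a fixed throughput."""
    return total_steps / steps_per_day

# 2M steps in ~7 days implies roughly 286k steps/day on a v3-8.
steps_per_day = 2_000_000 / 7

# Halving the step count should roughly halve the wall-clock time (~3.5 days).
print(round(estimated_days(1_000_000, steps_per_day), 1))
```

This ignores preprocessing (an extra 1-2 days per the thread) and any throughput variation from sequence length or batch size, so treat it as a rough lower bound.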