neulab / awesome-align

A neural word aligner based on multilingual BERT
https://arxiv.org/abs/2101.08231
BSD 3-Clause "New" or "Revised" License

Relationship between num_train_epochs and max_steps #18

Closed. dmitrytoda closed this issue 3 years ago

dmitrytoda commented 3 years ago

I trained with num_train_epochs=1 and max_steps=20000. It ran 1 epoch of 20k steps, all good. Then I trained with num_train_epochs=2 and max_steps=20000. I expected 2 epochs of 20k steps each, but it only ran 20k steps in total.

So if I want to train longer, should I just change max_steps to, say, 40000 and leave num_train_epochs=1? And what does num_train_epochs do in that case?

zdou0830 commented 3 years ago

Hi, if num_train_epochs=n and it takes m steps to go through one epoch, the total number of training steps would be min(m*n, max_steps).

For example, if you set num_train_epochs=1 and max_steps=20000, then if m>20k the program would train the model for 20k steps; otherwise it would train the model for m steps.
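As a rough illustration of the min(m*n, max_steps) rule, here is a minimal Python sketch. The helper name total_training_steps and the example values of m are hypothetical and chosen only for illustration; this is not the training script's actual code, just the arithmetic described in this comment.

```python
def total_training_steps(steps_per_epoch: int, num_train_epochs: int, max_steps: int) -> int:
    """Hypothetical helper: total steps under the rule min(m * n, max_steps).

    steps_per_epoch  -- m, the number of steps needed to go through one epoch
    num_train_epochs -- n, the requested number of epochs
    max_steps        -- the upper bound on the total number of steps
    """
    return min(steps_per_epoch * num_train_epochs, max_steps)


# The values of m below are made up for illustration.
# m > 20k, one epoch: capped by max_steps.
print(total_training_steps(30000, 1, 20000))  # 20000
# m > 20k, two epochs, same cap: still 20k steps in total.
print(total_training_steps(30000, 2, 20000))  # 20000
# Raising max_steps lets the second epoch contribute.
print(total_training_steps(30000, 2, 40000))  # 40000
# m < 20k, one epoch: training stops after m steps.
print(total_training_steps(15000, 1, 20000))  # 15000
```

Under this rule, raising only num_train_epochs has no effect once m*n already exceeds max_steps, and raising only max_steps has no effect once it exceeds m*n, so training longer may require increasing both.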

dmitrytoda commented 3 years ago

Thank you, that explains it!