[Closed] echan00 closed this issue 4 years ago
Training continues until the learning rate becomes too small :)
I think you can look at the validation loss. Stop the training manually when the validation loss stops decreasing.
You can also use `--max-epoch` or `--max-update` to force training to stop at a specified epoch/update: https://fairseq.readthedocs.io/en/latest/command_line_tools.html#fairseq-train
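As a minimal sketch of the suggestion above, a `fairseq-train` invocation with an explicit stopping point might look like the following (the data directory, architecture, and learning-rate values here are placeholders, not taken from the original question):

```shell
# Hypothetical example: stop training after 100k updates or 40 epochs,
# whichever comes first. Replace data-bin/example and the hyperparameters
# with your own setup.
fairseq-train data-bin/example \
    --arch transformer \
    --optimizer adam --lr 0.0005 \
    --lr-scheduler inverse_sqrt --warmup-updates 4000 \
    --max-update 100000 \
    --max-epoch 40 \
    --save-dir checkpoints/example
```

Without `--max-update`/`--max-epoch`, training keeps going as long as the learning-rate schedule allows, which is why it can appear to never end.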
Just to be sure, which of these values is the validation loss?
```
| epoch 031: 53%|▌| 28980/55116 [2:01:41<1:45:30, 4.13it/s, loss=3.662, nll_loss=1.945, ppl=3.85, wps=13520, ups=4, wpb=3406.657, bsz=107.062, num_updates=1.68238e+06, lr=2.43802e-05, gnorm=3.772, clip=0.000, oom=0.000, loss_scale=2.000, wall=441405, train_wall=416601]
| epoch 031: 53%|▌| 28981/55116 [2:01:42<1:44:39, 4.16it/s, loss=3.662, nll_loss=1.945, ppl=3.85, wps=13520, ups=4, wpb=3406.643, bsz=107.062, num_updates=1.68238e+06, lr=2.43802e-05, gnorm=3.772, clip=0.000, oom=0.000, loss_scale=2.000, wall=441405, train_wall=416601]
| epoch 031: 53%|▌| 29398/55116 [2:03:27<1:55:04, 3.73it/s, loss=3.663, nll_loss=1.946, ppl=3.85, wps=13520, ups=4, wpb=3406.636, bsz=107.059, num_updates=1.6828e+06, lr=2.43772e-05, gnorm=3.772, clip=0.000, oom=0.000, loss_scale=2.000, wall=441510, train_wall=416704]
```
Sorry for the basic question, but when does translation training end?
These are my parameters: