Open mullachv opened 7 years ago
| epoch 14 | 2000/ 2323 batches | lr 1.00 | ms/batch 132.84 | loss 3.93 | ppl 51.05
| epoch 14 | 2200/ 2323 batches | lr 1.00 | ms/batch 130.91 | loss 3.87 | ppl 47.75
-----------------------------------------------------------------------------------------
| end of epoch 14 | time: 326.76s | valid loss 4.84 | valid ppl 126.41
-----------------------------------------------------------------------------------------
| epoch 15 | 200/ 2323 batches | lr 0.00 | ms/batch 133.49 | loss 4.12 | ppl 61.81
| epoch 15 | 400/ 2323 batches | lr 0.00 | ms/batch 131.63 | loss 4.27 | ppl 71.50
-----------------------------------------------------------------------------------------
| end of epoch 15 | time: 315.69s | valid loss 4.84 | valid ppl 126.41
-----------------------------------------------------------------------------------------
| epoch 16 | 200/ 2323 batches | lr 0.00 | ms/batch 131.21 | loss 4.12 | ppl 61.81
| epoch 16 | 400/ 2323 batches | lr 0.00 | ms/batch 134.99 | loss 4.27 | ppl 71.50
Good catch, can you submit a PR to the master branch?
Jake, line 161 should divide by 4.0, not 4 (float vs. int): https://github.com/jakezhaojb/DSGA-1008-Spring2017-A2/blob/master/main.py#L161
Otherwise integer division takes the learning rate down to zero, and it stays there. The log above shows this: lr drops from 1.00 to 0.00 at epoch 15, and the validation loss stays at 4.84 from then on.
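
For anyone hitting the same thing, here is a minimal sketch of the effect. It is not the code from main.py: `anneal` and its arguments are made-up names for illustration, and it assumes the learning rate is held as an int and the script runs under Python 2 semantics, where `/` between two ints floors. The `//` operator is used so that the sketch behaves the same way on Python 3.

```python
def anneal(lr, divisor, floor_division):
    """Divide the learning rate once (hypothetical helper, not from main.py).

    floor_division=True emulates Python 2's `int / int`, which floors.
    """
    if floor_division:
        return lr // divisor
    return lr / divisor


# Assumed starting point: an integer learning rate of 1, as in the log (lr 1.00).
lr_int = anneal(1, 4, floor_division=True)       # integer division
lr_float = anneal(1, 4.0, floor_division=False)  # float division

# Once the rate is 0, every further annealing step leaves it at 0.
lr_int_again = anneal(lr_int, 4, floor_division=True)

print("int divisor:  ", lr_int, "->", lr_int_again)
print("float divisor:", lr_float)
```

With the float divisor the rate keeps shrinking but stays positive, so the model still gets updated after annealing.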