aykutfirat opened this issue 6 years ago
Hey @aykutfirat,
We've replicated the issue you're seeing with the initial training performance for the ASGD-based WT2 model, in our case using the QRNN as it's faster to test. This happened because I patched the changes for the Adam-based model we used for WT-103, PTBC, and enwik8 over the top of AWD-LSTM-LM but failed to do full regression testing.
We're hunting down the issue now, initially to fix the standard training and then later to fix the finetune and pointer steps.
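For reference, the ASGD-based runs start on plain SGD and only switch to NT-ASGD once the validation loss stops improving, which is the part of the training loop most easily broken by that patch. Below is a minimal sketch of that trigger with a toy model and placeholder losses; names like nonmono mirror the main.py arguments, but treat the details as assumptions rather than the exact repo code.

```python
# Minimal sketch of the NT-ASGD switch: train with SGD until the validation
# loss has not improved for `nonmono` consecutive checks, then hand the
# parameters over to torch.optim.ASGD so it maintains averaged weights.
import torch

model = torch.nn.Linear(4, 1)              # stand-in for the language model
lr, wdecay, nonmono = 30.0, 1.2e-6, 5      # defaults from the WT2/PTB recipes
optimizer = torch.optim.SGD(model.parameters(), lr=lr, weight_decay=wdecay)

fake_val_losses = [4.6, 4.4, 4.3, 4.25, 4.3, 4.35, 4.4, 4.45, 4.5]  # stand-in for evaluate(val_data)
best_val_loss = []
for epoch, val_loss in enumerate(fake_val_losses, 1):
    # 't0' only exists in ASGD's param groups, so its absence means we are still on SGD.
    if 't0' not in optimizer.param_groups[0] and (
            len(best_val_loss) > nonmono
            and val_loss > min(best_val_loss[:-nonmono])):
        print(f'Switching to ASGD at epoch {epoch}')
        optimizer = torch.optim.ASGD(model.parameters(), lr=lr, t0=0,
                                     lambd=0., weight_decay=wdecay)
    best_val_loss.append(val_loss)
```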
This is probably a related issue, so I thought I would report it here.
When running python main.py --batch_size 20 --data data/penn --dropouti 0.4 --dropouth 0.25 --seed 141 --epoch 500 --save PTB.pt, instead of the expected perplexities of 61.2/58.8 I got 70.1 (?!)/58.6. The last lines of the training log are below.
| end of epoch 498 | time: 159.11s | valid loss 4.25 | valid ppl 70.08 | valid bpc 6.131
-----------------------------------------------------------------------------------------
| epoch 499 | 200/ 663 batches | lr 30.00000 | ms/batch 217.91 | loss 3.69 | ppl 39.95 | bpc 5.320
| epoch 499 | 400/ 663 batches | lr 30.00000 | ms/batch 217.03 | loss 3.66 | ppl 38.88 | bpc 5.281
| epoch 499 | 600/ 663 batches | lr 30.00000 | ms/batch 218.92 | loss 3.67 | ppl 39.39 | bpc 5.300
-----------------------------------------------------------------------------------------
| end of epoch 499 | time: 159.08s | valid loss 4.25 | valid ppl 70.08 | valid bpc 6.131
-----------------------------------------------------------------------------------------
| epoch 500 | 200/ 663 batches | lr 30.00000 | ms/batch 216.38 | loss 3.70 | ppl 40.25 | bpc 5.331
| epoch 500 | 400/ 663 batches | lr 30.00000 | ms/batch 216.45 | loss 3.66 | ppl 38.98 | bpc 5.285
| epoch 500 | 600/ 663 batches | lr 30.00000 | ms/batch 220.70 | loss 3.68 | ppl 39.60 | bpc 5.308
-----------------------------------------------------------------------------------------
| end of epoch 500 | time: 158.92s | valid loss 4.25 | valid ppl 70.08 | valid bpc 6.131
-----------------------------------------------------------------------------------------
=========================================================================================
| End of training | test loss 4.07 | test ppl 58.56 | test bpc 5.872
=========================================================================================
@xsway I think your issue is linked to https://github.com/salesforce/awd-lstm-lm/pull/32. I think everything is working as expected but we're printing the wrong validation loss/perplexity. Could you try patching that change in and re-running? I think it should work; I will be running it myself before I merge the changes.
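In case it helps anyone verify locally before the fix lands: the usual trick for reporting ASGD results is to temporarily swap the optimizer's averaged weights ('ax') into the model before calling evaluate, rather than using the raw weights. A rough, self-contained sketch of that swap with a toy model and dummy evaluate (not necessarily the exact contents of the PR):

```python
# Sketch of evaluating with ASGD's averaged weights ('ax') instead of the raw
# weights, which is the number worth printing as the validation perplexity.
import torch

model = torch.nn.Linear(4, 1)                       # stand-in for the LM
optimizer = torch.optim.ASGD(model.parameters(), lr=30.0, t0=0, lambd=0.)

def evaluate():
    # Placeholder for evaluate(val_data); returns a scalar "loss".
    with torch.no_grad():
        return model(torch.ones(1, 4)).abs().item()

# Take a few training steps so ASGD accumulates its 'ax' averages.
for _ in range(5):
    optimizer.zero_grad()
    loss = model(torch.randn(8, 4)).pow(2).mean()
    loss.backward()
    optimizer.step()

tmp = {}
for prm in model.parameters():
    tmp[prm] = prm.data.clone()                     # stash the raw weights
    prm.data = optimizer.state[prm]['ax'].clone()   # swap in the ASGD averages
print('validation loss to report:', evaluate())
for prm in model.parameters():
    prm.data = tmp[prm].clone()                     # restore for further training
```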
python finetune.py --epochs 750 --data data/wikitext-2 --save WT2.pt --dropouth 0.2 --seed 1882
python pointer.py --save WT2.pt --lambdasm 0.1279 --theta 0.662 --window 3785 --bptt 2000 --data data/wikitext-2
Traceback (most recent call last):
  File "finetune.py", line 183, in <module>
    stored_loss = evaluate(val_data)
  File "finetune.py", line 108, in evaluate
    model.eval()
It looks like model loading (and more) needs to be modified.
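My guess, purely an assumption from the traceback stopping at model.eval(), is that the refreshed main.py now saves the checkpoint as a [model, criterion, optimizer] list while finetune.py and pointer.py still load it as a bare model object. A hedged sketch of a loader that tolerates both layouts:

```python
# Hedged sketch: load either the old single-model checkpoint or the newer
# [model, criterion, optimizer] style that the updated main.py appears to save.
import torch

def load_checkpoint(path):
    with open(path, 'rb') as f:
        checkpoint = torch.load(f)
    if isinstance(checkpoint, (list, tuple)):   # new-style: list of objects
        model, criterion, optimizer = checkpoint
        return model, criterion, optimizer
    return checkpoint, None, None               # old-style: just the model

model, criterion, optimizer = load_checkpoint('WT2.pt')
model.eval()
```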
Also, I no longer get the reported perplexities from main.py: the LSTM gets stuck in the 80s and the QRNN in the 90s.