PetrochukM / PyTorch-NLP

Basic Utilities for PyTorch Natural Language Processing (NLP)
https://pytorchnlp.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Why is there a huge result difference between Awd-Lstm Weight_Drop and this one? #93

Closed: realiti4 closed this issue 4 years ago

realiti4 commented 4 years ago

Hi, I ran some tests for 20 epochs. While the awd-lstm Weight_Drop results scale as expected, there is a huge difference between dropout=0 and dropout=0.1 for the torchnlp Weight_Drop. I'm really curious why this is happening, as the model becomes too hard to train even with 0.1 dropout.

After 20 epochs:

wdrop = 0
loss: 53.8 - accuracy: 72.0

wdrop = 0.1
awd-lstm: loss: 58.8 - accuracy: 67.0
torch.nlp: loss: 68.1 - accuracy: 57.2

wdrop = 0.9
awd-lstm: loss: 66.0 - accuracy: 59.0
torch.nlp: loss: 69.9 - accuracy: 55.92
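
For reference, here is a minimal sketch of what weight dropout (DropConnect on a weight matrix) is generally expected to do: zero each weight with probability p and rescale the survivors by 1/(1-p), as standard inverted dropout does in training mode. It is pure Python and is not the code of either library. The helper name `weight_drop` and the toy 100x100 matrix are assumptions made up for illustration.

```python
import random


def weight_drop(weights, p, rng):
    """Hypothetical helper: zero each weight with probability p and
    rescale the surviving weights by 1 / (1 - p) (inverted dropout)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    scale = 1.0 / (1.0 - p)
    return [
        [0.0 if rng.random() < p else w * scale for w in row]
        for row in weights
    ]


rng = random.Random(0)
# Toy stand-in for a hidden-to-hidden weight matrix (assumption: all ones).
weights = [[1.0] * 100 for _ in range(100)]

for p in (0.0, 0.1, 0.9):
    dropped = weight_drop(weights, p, rng)
    flat = [w for row in dropped for w in row]
    zero_fraction = sum(w == 0.0 for w in flat) / len(flat)
    mean = sum(flat) / len(flat)
    print(f"p={p}: zero fraction={zero_fraction:.3f}, mean={mean:.3f}")
```

Running the same kind of check on the masked weights of each implementation (fraction of zeros, mean after rescaling, and whether the mask changes between forward passes) might help narrow down where the two diverge. It does not by itself explain the numbers above.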

PetrochukM commented 4 years ago

I don't know! Let me know if you figure it out!