Describe the bug
I was training a language model, and once it reached a certain loss I decided it was good enough to move on to training my classification model, so I called learn_lm.save() and learn_lm.save_encoder(). However, the classification model was not performing well, so I went back to check the embeddings just in case.
I recreated the learn_lm learner and called learn_lm.load(), then ran learn_lm.validate() to check the metrics. The metrics were as if I had never trained the model at all.
I also suspect that fit_one_cycle() is not keeping the intermediate state between calls: notice how much the loss jumps up as cell 20 begins to run. This seems to happen several times in the notebook.
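For context, the workflow was roughly the following. This is only a minimal sketch: the IMDB_SAMPLE data, model names, and hyperparameters are placeholders, not my actual notebook.

from fastai.text import *

# placeholder dataset just to make the sketch self-contained; my real data is different
path = untar_data(URLs.IMDB_SAMPLE)
data_lm = TextLMDataBunch.from_csv(path, 'texts.csv')

learn_lm = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.3)
learn_lm.fit_one_cycle(1, 1e-2)
print(learn_lm.validate())           # metrics after training

learn_lm.save('lm_fine_tuned')       # full model, written under data_lm.path/'models'
learn_lm.save_encoder('lm_encoder')  # encoder only, for the classifier later

# recreate the learner and load the saved weights
learn_lm = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.3)
learn_lm.load('lm_fine_tuned')
print(learn_lm.validate())           # expected: roughly the same metrics as above; instead it looks untrained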
Provide your installation details
from fastai.utils.show_install import *
show_install()
=== Software ===
python : 3.8.0
fastai : 1.0.60
fastprogress : 0.2.2
torch : 1.4.0
nvidia driver : 440.33
torch cuda : 10.1 / is available
torch cudnn : 7603 / is enabled
=== Hardware ===
nvidia gpus : 3
torch devices : 3
- gpu0 : 12196MB | TITAN Xp
- gpu1 : 12196MB | TITAN Xp
- gpu2 : 12194MB | TITAN Xp
=== Environment ===
platform : Linux-4.4.0-170-generic-x86_64-with-glibc2.17
distro : #199-Ubuntu SMP Thu Nov 14 01:45:04 UTC 2019
conda env : base
python : /home/semar/.pyenv/versions/3.8.0/bin/python3.8
sys.path : /home/semar/embeddings-leis/notebooks
/home/semar/.pyenv/versions/3.8.0/lib/python38.zip
/home/semar/.pyenv/versions/3.8.0/lib/python3.8
/home/semar/.pyenv/versions/3.8.0/lib/python3.8/lib-dynload
/home/semar/.pyenv/versions/3.8.0/lib/python3.8/site-packages
/home/semar/.pyenv/versions/3.8.0/lib/python3.8/site-packages/pycocotools-2.0-py3.8-linux-x86_64.egg
/home/semar/.pyenv/versions/3.8.0/lib/python3.8/site-packages/IPython/extensions
/home/semar/.ipython
Expected behavior
Screenshots
Additional context