NVIDIA / sentiment-discovery

Unsupervised Language Modeling at scale for robust sentiment classification

Unable to use standalone transformer language model #64

Open Rexhaif opened 4 years ago

Rexhaif commented 4 years ago

Hi. The current README links to pretrained language models (transformer and LSTM), but these files contain only a state dict, without any of the arguments needed to create a language model instance in torch before loading that state dict. It also seems that finetune_classifier.py and transfer.py don't handle this file structure correctly. Example with finetune_classifier.py:

root@0b45a2324da5:/workspace/notebooks/sentiment-discovery# python finetune_classifier.py --load transformer.pt --lr 2e-5 --aux-lm-loss --aux-lm-loss-weight .02
emoji import unavailable
configuring data
Creating mlstm
init BinaryClassifier with 4096 features
Traceback (most recent call last):
  File "finetune_classifier.py", line 77, in get_model_and_optim
    model.lm_encoder.load_state_dict(sd)
  File "/workspace/notebooks/sentiment-discovery/model/model.py", line 180, in load_state_dict
    self.encoder.load_state_dict(state_dict['encoder']['encoder'], strict=strict)
KeyError: 'encoder'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "finetune_classifier.py", line 636, in <module>
    main()
  File "finetune_classifier.py", line 422, in main
    model, optim, LR = get_model_and_optim(args, train_data)
  File "finetune_classifier.py", line 84, in get_model_and_optim
    model.lm_encoder.load_state_dict(sd)
  File "/workspace/notebooks/sentiment-discovery/model/model.py", line 180, in load_state_dict
    self.encoder.load_state_dict(state_dict['encoder']['encoder'], strict=strict)
KeyError: 'encoder'

And, with transfer.py:

root@0b45a2324da5:/workspace/notebooks/sentiment-discovery# python transfer.py --load transformer.pt 
emoji import unavailable
configuring data
Creating mlstm
Traceback (most recent call last):
  File "transfer.py", line 71, in get_model
    model.load_state_dict(sd)
  File "/workspace/notebooks/sentiment-discovery/model/model.py", line 180, in load_state_dict
    self.encoder.load_state_dict(state_dict['encoder']['encoder'], strict=strict)
KeyError: 'encoder'
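
Both tracebacks fail on the same lookup, `state_dict['encoder']['encoder']`, so a quick way to narrow this down is to check which key paths the loaded checkpoint actually has. The following is only a minimal sketch: `summarize_keys` and `has_key_path` are hypothetical helpers (not part of this repo), and the dictionary in the demo is a made-up stand-in for the real checkpoint, whose contents are not known here.

```python
from collections.abc import Mapping


def summarize_keys(obj, max_depth=2, prefix=""):
    """Return nested key paths of a checkpoint-like mapping, up to max_depth levels."""
    paths = []
    if not isinstance(obj, Mapping) or max_depth == 0:
        return paths
    for key, value in obj.items():
        path = f"{prefix}{key}"
        paths.append(path)
        # Recurse only into nested mappings; tensors and other leaves are skipped.
        paths.extend(summarize_keys(value, max_depth - 1, prefix=path + "/"))
    return paths


def has_key_path(obj, keys):
    """Check whether obj[keys[0]][keys[1]]... exists, without raising KeyError."""
    for key in keys:
        if not isinstance(obj, Mapping) or key not in obj:
            return False
        obj = obj[key]
    return True


if __name__ == "__main__":
    # Hypothetical stand-in for the object returned by
    # torch.load("transformer.pt", map_location="cpu"); the real keys may differ.
    flat_sd = {"layer.weight": [0.0], "layer.bias": [0.0]}
    print(summarize_keys(flat_sd))
    # The loader in model/model.py indexes ['encoder']['encoder'];
    # a flat dict like this one would not satisfy that lookup.
    print(has_key_path(flat_sd, ["encoder", "encoder"]))
```

With the real file, the result of `torch.load` would be passed in place of `flat_sd`. If `has_key_path(sd, ["encoder", "encoder"])` is false, the checkpoint layout differs from what `load_state_dict` in model/model.py expects.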

So, the question is: how can I create a standalone language model instance from the pretrained (but not fine-tuned) weights?