facebookresearch / XLM

PyTorch original implementation of Cross-lingual Language Model Pretraining.

Configuration to load 100 language pretrained MLM weights. #294

Open abdullahkhilji opened 4 years ago

abdullahkhilji commented 4 years ago

I get the following error while reloading the 100-language MLM (tokenize + BPE) pretrained model. Which configuration parameter did I set incorrectly? And where can I find a suitable list of configuration settings for using this pretrained model?

INFO - 05/09/20 11:51:26 - 0:00:13 - Reloading model from data/processed/XLM_en/175k/mlm_100_1280.pth ...
Traceback (most recent call last):
  File "train.py", line 327, in <module>
    main(params)
  File "train.py", line 234, in main
    model = build_model(params, data['dico'])
  File "/home/abdullahkhilji/GitHub/XLM (copy)/src/model/__init__.py", line 134, in build_model
    model.load_state_dict(reloaded)
  File "/home/abdullahkhilji/miniconda3/envs/pyt/lib/python3.6/site-packages/torch/nn/modules/module.py", line 830, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for TransformerModel:
    Missing key(s) in state_dict: "lang_embeddings.weight". 
    Unexpected key(s) in state_dict: "attentions.12.q_lin.weight", "attentions.12.q_lin.bias", "attentions.12.k_lin.weight", "attentions.12.k_lin.bias", "attentions.12.v_lin.weight", "attentions.12.v_lin.bias", "attentions.12.out_lin.weight", "attentions.12.out_lin.bias", "attentions.13.q_lin.weight", "attentions.13.q_lin.bias", "attentions.13.k_lin.weight", "attentions.13.k_lin.bias", "attentions.13.v_lin.weight", "attentions.13.v_lin.bias", "attentions.13.out_lin.weight", "attentions.13.out_lin.bias", "attentions.14.q_lin.weight", "attentions.14.q_lin.bias", "attentions.14.k_lin.weight", "attentions.14.k_lin.bias", "attentions.14.v_lin.weight", "attentions.14.v_lin.bias", "attentions.14.out_lin.weight", "attentions.14.out_lin.bias", "attentions.15.q_lin.weight", "attentions.15.q_lin.bias", "attentions.15.k_lin.weight", "attentions.15.k_lin.bias", "attentions.15.v_lin.weight", "attentions.15.v_lin.bias", "attentions.15.out_lin.weight", "attentions.15.out_lin.bias", "layer_norm1.12.weight", "layer_norm1.12.bias", "layer_norm1.13.weight", "layer_norm1.13.bias", "layer_norm1.14.weight", "layer_norm1.14.bias", "layer_norm1.15.weight", "layer_norm1.15.bias", "ffns.12.lin1.weight", "ffns.12.lin1.bias", "ffns.12.lin2.weight", "ffns.12.lin2.bias", "ffns.13.lin1.weight", "ffns.13.lin1.bias", "ffns.13.lin2.weight", "ffns.13.lin2.bias", "ffns.14.lin1.weight", "ffns.14.lin1.bias", "ffns.14.lin2.weight", "ffns.14.lin2.bias", "ffns.15.lin1.weight", "ffns.15.lin1.bias", "ffns.15.lin2.weight", "ffns.15.lin2.bias", "layer_norm2.12.weight", "layer_norm2.12.bias", "layer_norm2.13.weight", "layer_norm2.13.bias", "layer_norm2.14.weight", "layer_norm2.14.bias", "layer_norm2.15.weight", "layer_norm2.15.bias". 
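For reference, the two halves of this error point at two configuration mismatches: the missing "lang_embeddings.weight" means the freshly built model expects language embeddings that the checkpoint does not contain, and the unexpected attentions.12-15 / ffns.12-15 / layer_norm keys mean the checkpoint has 16 transformer layers while the model was built with fewer. A minimal inspection sketch (assuming the released checkpoint stores its weights under a 'model' entry, as the last comment in this thread suggests) to read the expected architecture off the checkpoint itself:

import torch

# Hypothetical inspection sketch: load the released checkpoint on CPU and
# infer the architecture from its keys, so the train.py flags can be matched.
ckpt_path = "data/processed/XLM_en/175k/mlm_100_1280.pth"  # path from the log above
reloaded = torch.load(ckpt_path, map_location="cpu")
state_dict = reloaded["model"]

# Number of transformer layers: one attention block per layer, with keys of
# the form "attentions.<layer>.q_lin.weight".
n_layers = len({k.split(".")[1] for k in state_dict if k.startswith("attentions.")})
print("layers:", n_layers)                                          # expect 16
print("emb dim:", state_dict["embeddings.weight"].shape[1])         # expect 1280
print("lang embeddings:", "lang_embeddings.weight" in state_dict)   # expect False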
him-mah10 commented 4 years ago

@abdullahkhilji did you figure it out? Stuck with the same issue.

abdullahkhilji commented 4 years ago

I ended up pretraining the weights from scratch instead of using the pretrained model.

alphadl commented 3 years ago

"lang_embeddings.weight" in a['model'].keys() >> False 🤣