facebookresearch / esm

Evolutionary Scale Modeling (esm): Pretrained language models for proteins
MIT License

Error in loading ESM1b model #213

Closed · martinez-zacharya closed this issue 2 years ago

martinez-zacharya commented 2 years ago

Below is the traceback for the error I'm encountering when trying to load the ESM-1b model with this code (the import matches the traceback below):

```python
from esm.pretrained import esm1b_t33_650M_UR50S

model, alphabet = esm1b_t33_650M_UR50S()
```

```
Downloading: "https://dl.fbaipublicfiles.com/fair-esm/models/esm1b_t33_650M_UR50S.pt" to /home/zmartine/.cache/torch/hub/checkpoints/esm1b_t33_650M_UR50S.pt
Downloading: "https://dl.fbaipublicfiles.com/fair-esm/regression/esm1b_t33_650M_UR50S-contact-regression.pt" to /home/zmartine/.cache/torch/hub/checkpoints/esm1b_t33_650M_UR50S-contact-regression.pt
Traceback (most recent call last):
  File "/central/home/zmartine/DistantHomologyDetection/scripts/utils/finetune.py", line 20, in finetune
    model, alphabet = esm1b_t33_650M_UR50S()
  File "/central/home/zmartine/DistantHomologyDetection/scripts/esm/pretrained.py", line 230, in esm1b_t33_650M_UR50S
    return load_model_and_alphabet_hub("esm1b_t33_650M_UR50S")
  File "/central/home/zmartine/DistantHomologyDetection/scripts/esm/pretrained.py", line 56, in load_model_and_alphabet_hub
    return load_model_and_alphabet_core(model_data, regression_data)
  File "/central/home/zmartine/DistantHomologyDetection/scripts/esm/pretrained.py", line 179, in load_model_and_alphabet_core
    model.load_state_dict(model_state, strict=regression_data is not None)
  File "/home/zmartine/miniconda3/envs/RemoteHomologyTransformer/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1223, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ProteinBertModel:
    size mismatch for embed_tokens.weight: copying a param with shape torch.Size([33, 1280]) from checkpoint, the shape in current model is torch.Size([35, 1280]).
    size mismatch for lm_head.weight: copying a param with shape torch.Size([33, 1280]) from checkpoint, the shape in current model is torch.Size([35, 1280]).
    size mismatch for lm_head.bias: copying a param with shape torch.Size([33]) from checkpoint, the shape in current model is torch.Size([35]).
```

Any help would be great. Thank you!
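For reference, here is a minimal diagnostic sketch (not from the thread) for checking which vocabulary size the cached checkpoint actually carries; the path is the cache location shown in the download messages above, and the `"model"` key reflects how fair-esm/fairseq checkpoints store the state dict:

```python
import torch

# Load the cached checkpoint on CPU without instantiating the model.
ckpt = torch.load(
    "/home/zmartine/.cache/torch/hub/checkpoints/esm1b_t33_650M_UR50S.pt",
    map_location="cpu",
)

# The row count of the token embedding is the vocabulary size the weights
# were trained with; for ESM-1b this should be 33.
for name, tensor in ckpt["model"].items():
    if name.endswith("embed_tokens.weight"):
        print(name, tuple(tensor.shape))
```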

tomsercu commented 2 years ago

This is unexpected, but I can't reproduce the error. Can you confirm you're on the latest version of the esm code? Specifically, it seems like you're somehow loading the alphabet from ESM-1, which has 2 more special tokens.
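This hypothesis is easy to test directly. A quick sketch, assuming the current `esm.data.Alphabet` API: compare the two alphabets' vocabulary sizes, which should match the 33-vs-35 size mismatch in the traceback.

```python
from esm import Alphabet

# ESM-1b uses a 33-token alphabet; ESM-1 uses 35 (2 extra special tokens).
print(len(Alphabet.from_architecture("ESM-1b").all_toks))  # expect 33
print(len(Alphabet.from_architecture("ESM-1").all_toks))   # expect 35
```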

martinez-zacharya commented 2 years ago

Thank you for the advice! It's working now with the latest version of the esm repo.
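For anyone landing here later: after upgrading to the latest release (e.g. `pip install --upgrade fair-esm`), loading and running ESM-1b should follow the usage pattern from the repository README. A minimal end-to-end sketch:

```python
import torch
import esm

# Load the pretrained ESM-1b model and its matching alphabet.
model, alphabet = esm.pretrained.esm1b_t33_650M_UR50S()
batch_converter = alphabet.get_batch_converter()
model.eval()

# Tokenize a single example sequence.
data = [("protein1", "MKTVRQERLKSIVRILERSKEPVSGAQ")]
batch_labels, batch_strs, batch_tokens = batch_converter(data)

# Extract per-residue representations from the final (33rd) layer.
with torch.no_grad():
    results = model(batch_tokens, repr_layers=[33])
token_representations = results["representations"][33]
print(token_representations.shape)  # (1, seq_len + 2, 1280), incl. BOS/EOS
```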