Closed: martinez-zacharya closed this issue 2 years ago.
Below is the traceback for the error I'm encountering when trying to load the ESM-1b model with this line of code:

```python
model, alphabet = esm1b_t33_650M_UR50S()
```

```
Downloading: "https://dl.fbaipublicfiles.com/fair-esm/models/esm1b_t33_650M_UR50S.pt" to /home/zmartine/.cache/torch/hub/checkpoints/esm1b_t33_650M_UR50S.pt
Downloading: "https://dl.fbaipublicfiles.com/fair-esm/regression/esm1b_t33_650M_UR50S-contact-regression.pt" to /home/zmartine/.cache/torch/hub/checkpoints/esm1b_t33_650M_UR50S-contact-regression.pt
Traceback (most recent call last):
  File "/central/home/zmartine/DistantHomologyDetection/scripts/utils/finetune.py", line 20, in finetune
    model, alphabet = esm1b_t33_650M_UR50S()
  File "/central/home/zmartine/DistantHomologyDetection/scripts/esm/pretrained.py", line 230, in esm1b_t33_650M_UR50S
    return load_model_and_alphabet_hub("esm1b_t33_650M_UR50S")
  File "/central/home/zmartine/DistantHomologyDetection/scripts/esm/pretrained.py", line 56, in load_model_and_alphabet_hub
    return load_model_and_alphabet_core(model_data, regression_data)
  File "/central/home/zmartine/DistantHomologyDetection/scripts/esm/pretrained.py", line 179, in load_model_and_alphabet_core
    model.load_state_dict(model_state, strict=regression_data is not None)
  File "/home/zmartine/miniconda3/envs/RemoteHomologyTransformer/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1223, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ProteinBertModel:
	size mismatch for embed_tokens.weight: copying a param with shape torch.Size([33, 1280]) from checkpoint, the shape in current model is torch.Size([35, 1280]).
	size mismatch for lm_head.weight: copying a param with shape torch.Size([33, 1280]) from checkpoint, the shape in current model is torch.Size([35, 1280]).
	size mismatch for lm_head.bias: copying a param with shape torch.Size([33]) from checkpoint, the shape in current model is torch.Size([35]).
```

Any help would be great. Thank you!
This is unexpected, and I can't reproduce the error. Can you confirm you're on the latest version of the esm code? Specifically, it looks like you're somehow loading the alphabet from ESM-1, which has 2 more special tokens than the ESM-1b alphabet: the checkpoint stores a 33-token embedding table, while the model you're constructing expects 35 rows.
Thank you for the advice! It's working now with the latest version of the esm repo.
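For anyone hitting the same mismatch, a minimal sketch of the working path after upgrading (assuming you use the `fair-esm` pip package; if you vendored the repo, as the traceback's `scripts/esm/` path suggests, pull the latest source instead):

```python
# pip install --upgrade fair-esm
import esm

# On a current install, the hub loader builds the matching ESM-1b
# alphabet, so the 33-token checkpoint loads without size mismatches.
model, alphabet = esm.pretrained.esm1b_t33_650M_UR50S()
batch_converter = alphabet.get_batch_converter()
model.eval()  # disable dropout for inference
```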