DeepGraphLearning / ProtST

[ICML-23 ORAL] ProtST: Multi-Modality Learning of Protein Sequences and Biomedical Texts
Apache License 2.0

size mismatch for model.mapping #5

Open · CryoSky opened this issue 1 year ago

CryoSky commented 1 year ago

Hello again,

I tried to run the protein function annotation task after downloading the pkl file, but I got the following error:

RuntimeError: Error(s) in loading state_dict for MultipleBinaryClassification: size mismatch for model.mapping: copying a param with shape torch.Size([20]) from checkpoint, the shape in current model is torch.Size([33]).

[sj4@gn10 ProtST]$ python ./script/run_downstream.py --config ./config/downstream_task/PretrainESM2/annotation_tune.yaml --checkpoint /work/sj4/protst_esm2.pth --dataset GeneOntology --branch BP

The yaml file is almost the same as the one on GitHub, except that I changed the GPU setting to gpus: [0]. Could you please look into this issue?

Best wishes,
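As a quick sanity check, the two shapes in the error can be inspected directly in the checkpoint. Below is a minimal sketch, assuming a standard PyTorch save format; the key names (the "model" nesting and model.mapping) are taken from the error message and common torchdrug conventions, not from a documented ProtST API:

import torch

ckpt = torch.load("/work/sj4/protst_esm2.pth", map_location="cpu")
state_dict = ckpt.get("model", ckpt)  # some checkpoints nest the weights under a "model" key
print(state_dict["model.mapping"].shape)  # torch.Size([20]) according to the error above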

KatarinaYuan commented 1 year ago

Hi, I tried to run the experiment with

python ./script/run_downstream.py --config ./config/downstream_task/PretrainESM2/annotation_tune.yaml --checkpoint ~/scratch/esm-model-weights/esm2_t33_650M_UR50D.pt --dataset GeneOntology --branch BP 

after resetting gpus to [0], and it ran successfully.

Could you give more details about the ESM2 checkpoint you are using?

prsigma commented 1 year ago

> Hi, I tried to run the experiment with
>
> python ./script/run_downstream.py --config ./config/downstream_task/PretrainESM2/annotation_tune.yaml --checkpoint ~/scratch/esm-model-weights/esm2_t33_650M_UR50D.pt --dataset GeneOntology --branch BP
>
> after resetting gpus to [0], and it ran successfully.
>
> Could you give more details about the ESM2 checkpoint you are using?

I have also encountered a similar issue. May I ask why your answer uses the original ESM2 model for the downstream task? Shouldn't we use the pre-trained ProtST model, i.e. the protst_esm2.pth mentioned above?

KatarinaYuan commented 1 year ago

Hi, sorry for the misleading comments before. I used the ESM2 checkpoint instead of the ProtST-enhanced one only to test whether the released code has any inconsistency with our development codebase. I'm now downloading the ProtST-enhanced ESM2 checkpoint onto my current working cluster and will get back to you once I have tested it.

KatarinaYuan commented 1 year ago

Hi, I downloaded the ProtST-enhanced ESM2 checkpoint (https://protsl.s3.us-east-2.amazonaws.com/checkpoints/protst_esm2.pth; the URL is also given in the README). I ran the following command on 1 GPU, and it completed successfully.

python ./script/run_downstream.py --config ./config/downstream_task/PretrainESM2/annotation_tune.yaml --checkpoint ~/scratch/protst_output/protst_esm2.pth --dataset GeneOntology --branch BP 

This suggests that the uploaded checkpoint and the released code work fine together. A potential culprit I can think of is the running environment. I currently use torch==1.13.1 and torchdrug==0.2.0. Could you provide more details on your environment versions?
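For reference, both packages normally expose a __version__ attribute, so the versions in question can be reported with a one-liner:

python -c "import torch, torchdrug; print(torch.__version__, torchdrug.__version__)"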

If you'd like, you can also send me the checkpoint you are using so that I can reproduce the issue and track down the root cause.

KatarinaYuan commented 1 year ago

Hi, we just realized that this mismatch is caused by a recent update of TorchDrug. In TorchDrug 0.2.1, the way the variable mapping is constructed has changed, and the change does not appear to be backwards compatible. Please see https://github.com/DeepGraphLearning/torchdrug/commit/c8155f40485ced8ffa81f5eace8792516678e3ec for details.

The fastest solution is to roll back to TorchDrug 0.2.0. Sorry for all the trouble!
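The rollback itself is a single command:

pip install torchdrug==0.2.0

If rolling back is not an option, one unverified workaround sketch: since model.mapping appears to be a deterministically constructed buffer rather than a learned weight, it may be possible to drop it from the checkpoint and load non-strictly, letting the current model keep the mapping it builds itself. Whether that rebuilt mapping is compatible with the ProtST weights is exactly what the TorchDrug change affects, so validate downstream results before trusting this:

import torch

ckpt = torch.load("protst_esm2.pth", map_location="cpu")
state_dict = ckpt.get("model", ckpt)   # assumption: weights may be nested under a "model" key
state_dict.pop("model.mapping", None)  # drop the stale 20-entry mapping buffer
# then load with strict=False so the missing key is tolerated, e.g.
# task.load_state_dict(state_dict, strict=False)  # "task" stands for whatever module raised the error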