syr-cn / SimSGT

[NeurIPS 2023] "Rethinking Tokenizer and Decoder in Masked Graph Modeling for Molecules"

Error during DTA fine-tuning #8

Closed EricGu1001 closed 3 months ago

EricGu1001 commented 3 months ago

```
[2024-05-18 23:51:48] Start Tuning kiba

  0%|          | 0/3 [00:00<?, ?it/s]
Training For Running Seed 0
Loading model from checkpoints/GEOM.pth.
Traceback (most recent call last):
  File "./tuning_dta.py", line 510, in <module>
    main()
  File "./tuning_dta.py", line 502, in main
    tuning_dta(args, train_dataset, valid_dataset,
  File "./tuning_dta.py", line 351, in tuning_dta
    gnn = load_chem_gnn_model(args)
  File "./tuning_dta.py", line 231, in load_chem_gnn_model
    gnn.load_state_dict(model_state_dict, strict=False)
  File "/home/penghuan/miniconda3/envs/calm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1497, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for TokenMAEClf:
	size mismatch for tokenizer.x_embedding.atom_embedding_list.1.weight: copying a param with shape torch.Size([4, 300]) from checkpoint, the shape in current model is torch.Size([5, 300]).
```

The error occurs while fine-tuning on DTA. Is it because I did not run the previous step, pre-training on GEOM?
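For anyone hitting a similar size mismatch, a quick way to narrow it down is to compare the offending tensor's shape in the checkpoint against the freshly built model. A minimal diagnostic sketch, assuming `checkpoints/GEOM.pth` stores a plain state_dict (the key layout may differ in your setup):

```python
import torch

# Load the checkpoint on CPU; assumes the file holds a plain state_dict.
state_dict = torch.load('checkpoints/GEOM.pth', map_location='cpu')

# Inspect the tensor named in the error message.
key = 'tokenizer.x_embedding.atom_embedding_list.1.weight'
print('checkpoint:', tuple(state_dict[key].shape))  # e.g. (4, 300)

# Compare against the same parameter in the model you just constructed, e.g.:
# print('model:', tuple(gnn.state_dict()[key].shape))  # e.g. (5, 300)
```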

syr-cn commented 3 months ago

Hello! Thank you for your interest in our work. This error is not related to the pre-training on GEOM; it is caused by a version mismatch in the OGB library.

Please refer to the allowable_features variable in ogb/ogb/utils/features.py in the OGB repository.

In the current version of OGB, allowable_features['possible_chirality_list'] has 5 elements, but older versions had only 4. As a result, the AtomEncoder in ogb/graphproppred/mol_encoder.py builds an embedding table whose size no longer matches the pretrained checkpoint, which is exactly the mismatch shown in the error message.
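For reference, the difference looks roughly like this (the exact entries are quoted from memory of the OGB source; double-check them against your installed version):

```python
# Newer OGB: 5 chirality entries (the trailing 'misc' is the addition).
allowable_features['possible_chirality_list'] = [
    'CHI_UNSPECIFIED',
    'CHI_TETRAHEDRAL_CW',
    'CHI_TETRAHEDRAL_CCW',
    'CHI_OTHER',
    'misc',
]

# Older OGB: only 4 entries, matching the checkpoint's torch.Size([4, 300]).
allowable_features['possible_chirality_list'] = [
    'CHI_UNSPECIFIED',
    'CHI_TETRAHEDRAL_CW',
    'CHI_TETRAHEDRAL_CCW',
    'CHI_OTHER',
]
```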

You can fix it either by downgrading OGB to an older version or by manually editing the allowable_features variable in your local OGB installation.
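If you prefer not to edit the installed package, a runtime monkey-patch is another option. A minimal sketch, assuming the 4-element list above is what the checkpoint was trained with; note that the patch must run before ogb.graphproppred.mol_encoder is imported, since that module reads the feature dimensions once at import time:

```python
# Hypothetical workaround: shrink the chirality list back to 4 entries
# before any OGB encoder module is imported.
from ogb.utils import features

features.allowable_features['possible_chirality_list'] = [
    'CHI_UNSPECIFIED',
    'CHI_TETRAHEDRAL_CW',
    'CHI_TETRAHEDRAL_CCW',
    'CHI_OTHER',  # no trailing 'misc', matching the old checkpoint
]

# Import modules that depend on the feature dims only after the patch:
from ogb.graphproppred.mol_encoder import AtomEncoder  # noqa: E402
```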