chao1224 / MoleculeSTM

Multi-modal Molecule Structure-text Model for Text-based Editing and Retrieval, Nat Mach Intell 2023 (https://www.nature.com/articles/s42256-023-00759-6)
https://chao1224.github.io/MoleculeSTM

AttributeError: 'NoneType' object has no attribute 'GetProp' #30

Open martinjingyu opened 1 month ago

martinjingyu commented 1 month ago

Hi, when I run the following command:

python pretrain.py \
    --verbose --batch_size=8 \
    --molecule_type=Graph

I get an AttributeError. It looks like something is wrong with the data, but I downloaded it from your Hugging Face repository.

arguments Namespace(CL_neg_samples=1, JK='last', SSL_emb_dim=256, SSL_loss='EBM_NCE', T=0.1, batch_size=8, dataset='PubChemSTM', dataspace_path='../data', decay=0, device=0, dropout_ratio=0.5, epochs=100, gnn_emb_dim=300, gnn_type='gin', graph_pooling='mean', max_seq_len=512, megamolbart_input_dir='../data/pretrained_MegaMolBART/checkpoints', mol_lr=1e-05, mol_lr_scale=1, molecule_type='Graph', normalize=True, num_layer=5, num_workers=8, output_model_dir=None, pretrain_gnn_mode='GraphMVP_G', representation_frozen=False, seed=42, text_lr=0.0001, text_lr_scale=1, text_type='SciBERT', verbose=True, vocab_path='../MoleculeSTM/bart_vocab.txt')
Processing...
  3%|███████▉ | 8689/250952 [00:06<03:10, 1272.46it/s]
Traceback (most recent call last):
  File "pretrain.py", line 278, in <module>
    dataset = PubChemSTM_Datasets_Graph(dataset_root)
  File "/home/wcc/anaconda3/envs/MoleculeSTM/lib/python3.7/site-packages/MoleculeSTM-0.0.0-py3.7.egg/MoleculeSTM/datasets/PubChemSTM.py", line 118, in __init__
  File "/home/wcc/anaconda3/envs/MoleculeSTM/lib/python3.7/site-packages/torch_geometric/data/in_memory_dataset.py", line 57, in __init__
    super().__init__(root, transform, pre_transform, pre_filter, log)
  File "/home/wcc/anaconda3/envs/MoleculeSTM/lib/python3.7/site-packages/torch_geometric/data/dataset.py", line 97, in __init__
    self._process()
  File "/home/wcc/anaconda3/envs/MoleculeSTM/lib/python3.7/site-packages/torch_geometric/data/dataset.py", line 230, in _process
    self.process()
  File "/home/wcc/anaconda3/envs/MoleculeSTM/lib/python3.7/site-packages/MoleculeSTM-0.0.0-py3.7.egg/MoleculeSTM/datasets/PubChemSTM.py", line 132, in process
AttributeError: 'NoneType' object has no attribute 'GetProp'


chao1224 commented 4 weeks ago

Hi @martinjingyu, it looks like a molecule read from the SDF file is None. Can you double-check the mol in this line? If there are None molecules, you can check the preprocessing steps under this folder.
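
One way to run that check is sketched below. It is not the repository's code: the helper names `split_valid_records` and `check_sdf` are hypothetical, and the property name `PUBCHEM_COMPOUND_CID` is an assumption about what the SDF records carry. It relies on the fact that RDKit's `Chem.SDMolSupplier` yields None for any record it cannot parse, which is what would later make `mol.GetProp(...)` fail.

```python
from typing import Any, Iterable, List, Tuple


def split_valid_records(supplier: Iterable[Any]) -> Tuple[List[Any], List[int]]:
    """Separate parsed molecules from records that came back as None.

    Returns the list of usable molecules and the 0-based positions of the
    records that could not be parsed.
    """
    valid, bad_indices = [], []
    for idx, mol in enumerate(supplier):
        if mol is None:
            # An unparseable SDF record shows up as None in the iteration.
            bad_indices.append(idx)
            continue
        valid.append(mol)
    return valid, bad_indices


def check_sdf(sdf_path: str, prop: str = "PUBCHEM_COMPOUND_CID"):
    """Report unparseable records in an SDF file (hypothetical helper).

    `prop` is an assumed property name; adjust it to whatever the
    dataset code reads with GetProp.
    """
    from rdkit import Chem  # imported lazily so the helper above has no RDKit dependency

    supplier = Chem.SDMolSupplier(sdf_path)
    mols, bad = split_valid_records(supplier)
    print(f"{len(mols)} molecules parsed, {len(bad)} records returned None")
    if bad:
        print("first unparseable record indices:", bad[:10])
    # Only read the property from molecules that actually have it.
    values = [mol.GetProp(prop) for mol in mols if mol.HasProp(prop)]
    return values, bad
```

Calling `check_sdf` on the downloaded SDF file should show whether the failure comes from a few malformed records or from a truncated or corrupted download. In the latter case, re-downloading or redoing the preprocessing is probably safer than skipping records.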