Qmi3 closed this issue 2 years ago
Hi, the file crossdocked_pocket10_name2id.pt is generated automatically by the code. You should download crossdocked_pocket10.tar.gz and extract it. It contains all the pocket-molecule complex structures of the training and test sets. When you run the code (train or sample) for the first time, it checks whether the data has already been processed, i.e., whether the processed file exists (processed_path = config.data.dataset.path + '_processed.lmdb', line 25 of utils/datasets/pl.py). If not, it reads the structure files in crossdocked_pocket10 to generate the preprocessed files crossdocked_pocket10_processed.lmdb and crossdocked_pocket10_name2id.pt (lines 26-27 of utils/datasets/pl.py).
So please make sure you follow the data preparation instructions. (I guess you downloaded the lmdb file from the cloud.)
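The check described above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the helper names processed_paths and needs_preprocessing are hypothetical, and only the two file suffixes come from the explanation above.

```python
import os


def processed_paths(raw_path):
    """Derive the two generated file locations from the raw dataset path.

    Hypothetical helper: only the suffixes are taken from the thread.
    """
    processed_path = raw_path + "_processed.lmdb"
    name2id_path = raw_path + "_name2id.pt"
    return processed_path, name2id_path


def needs_preprocessing(raw_path):
    """Return True when the processed LMDB file does not exist yet."""
    processed_path, _ = processed_paths(raw_path)
    return not os.path.exists(processed_path)
```

With raw_path set to the extracted crossdocked_pocket10 directory, needs_preprocessing returns True on the first run, which is when the code would build both files from the raw structures.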
Actually, I did download the lmdb file from the cloud, haha. I solved this problem as you said. But two errors appeared, qaq: "could not sanitize molecule ending on line xx" and "Explicit valence for atom # xx N, 4, is greater than permitted".
Yes, these errors can occur because some of the molecules can't be parsed by RDKit. We use try-except to skip these cases, so the errors are harmless.
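The skip-on-failure pattern mentioned above might look like the sketch below. It is an assumption about the general shape, not the project's actual loop: parse_all and parse_fn are hypothetical names, and parse_fn stands in for whatever RDKit-based parser is used.

```python
def parse_all(entries, parse_fn):
    """Parse each entry, skipping any entry whose parser raises.

    parse_fn is a hypothetical stand-in for the real molecule parser.
    """
    parsed, skipped = [], []
    for name in entries:
        try:
            parsed.append((name, parse_fn(name)))
        except Exception as exc:
            # Record the failure and move on instead of aborting the run.
            skipped.append((name, str(exc)))
    return parsed, skipped
```

Collecting the skipped names, rather than discarding them silently, makes it easy to check afterwards how many complexes were dropped.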
Thank you very much for your answers, and I wish you all the best in your work.
Thank you!
I would really appreciate it if you could upload the file.