kexinhuang12345 / DeepPurpose

A Deep Learning Toolkit for DTI, Drug Property, PPI, DDI, Protein Function Prediction (Bioinformatics)
https://doi.org/10.1093/bioinformatics/btaa1005
BSD 3-Clause "New" or "Revised" License

Where did you save "pretrained models on BindingDB IC50" #73

Closed xuzhang5788 closed 3 years ago

xuzhang5788 commented 3 years ago

When I tried your demo notebook "oneliner-3CLpro-finetuning-AID1706.ipynb", I got:

FileNotFoundError: [Errno 2] No such file or directory: './save_folder/pretrained_models/DeepPurpose_BindingDB/model_MPNN_CNN/config.pkl'

I couldn't find ./save_folder/. In your README, you said "[11/20] Added 5 more pretrained models on BindingDB IC50 Units (around 1Million data points)".

Thank you

kexinhuang12345 commented 3 years ago

Hi, this folder should be created automatically when you run the code. Could you check the local folder and see what it looks like?

xuzhang5788 commented 3 years ago

Sorry, I found the folder, but its name is "pretrained_model", not "pretrained_models".

However, I got the following error, even after I increased MAX_ATOM and MAX_BOND to 400 and 600. This happens when I use MPNN. I think there is a bug in utils.py.

Loading customized repurposing dataset...
Checking if pretrained directory is valid...
Beginning to load the pretrained models...
Using pretrained model and making predictions...
repurposing...
Drug Target Interaction Prediction Mode...
in total: 82 drug-target pairs
encoding drug...
unique drugs: 81

Exception: Please increasing MAX_ATOM and MAX_BOND in line 24,25 utils.py and reinstall it. The current setting is for small molecule.


AssertionError                            Traceback (most recent call last)
~/projects/DeepPurpose/DeepPurpose/utils.py in smiles2mpnnfeature(smiles)
    264 try:
--> 265     assert atoms_completion_num >= 0 and bonds_completion_num >= 0
    266 except:

AssertionError:

During handling of the above exception, another exception occurred:

Exception                                 Traceback (most recent call last)
 in
      7     save_dir = './save_folder',
      8     pretrained_dir = './save_folder/pretrained_model/DeepPurpose_BindingDB/',
----> 9     agg = 'mean')
     10 end = time.time()
     11 print('Time lapse:' + str(end - start))

~/projects/DeepPurpose/DeepPurpose/oneliner.py in repurpose(target, target_name, X_repurpose, drug_names, train_drug, train_target, train_y, save_dir, pretrained_dir, finetune_epochs, finetune_LR, finetune_batch_size, convert_y, subsample_frac, pretrained, split, frac, agg, output_len)
     76 os.mkdir(result_folder_path)
     77
---> 78 y_pred = models.repurpose(X_repurpose, target, model, drug_names, target_name, convert_y = convert_y, result_folder = result_folder_path, verbose = False)
     79 y_preds_models.append(y_pred)
     80 print('Predictions from model ' + str(idx + 1) + ' with drug encoding ' + model_name[0] + ' and target encoding ' + model_name[1] + ' are done...')

~/projects/DeepPurpose/DeepPurpose/DTI.py in repurpose(X_repurpose, target, model, drug_names, target_name, result_folder, convert_y, output_num_max, verbose)
     77 with open(fo, 'w') as fout:
     78     print('repurposing...')
---> 79     df_data = data_process_repurpose_virtual_screening(X_repurpose, target, model.drug_encoding, model.target_encoding, 'repurposing')
     80     y_pred = model.predict(df_data)
     81

~/projects/DeepPurpose/DeepPurpose/utils.py in data_process_repurpose_virtual_screening(X_repurpose, target, drug_encoding, target_encoding, mode)
    577 df, _, _ = data_process(X_repurpose, target, drug_encoding = drug_encoding,
    578     target_encoding = target_encoding,
--> 579     split_method='repurposing_VS')
    580
    581 return df

~/projects/DeepPurpose/DeepPurpose/utils.py in data_process(X_drug, X_target, y, drug_encoding, target_encoding, split_method, frac, random_seed, sample_frac, mode, X_drug_, X_target_)
    498
    499 if DTI_flag:
--> 500     df_data = encode_drug(df_data, drug_encoding)
    501     df_data = encode_protein(df_data, target_encoding)
    502 elif DDI_flag:

~/projects/DeepPurpose/DeepPurpose/utils.py in encode_drug(df_data, drug_encoding, column_name, save_column_name)
    363     df_data[save_column_name] = [unique_dict[i] for i in df_data[column_name]]
    364 elif drug_encoding == 'MPNN':
--> 365     unique = pd.Series(df_data[column_name].unique()).apply(smiles2mpnnfeature)
    366     unique_dict = dict(zip(df_data[column_name].unique(), unique))
    367     df_data[save_column_name] = [unique_dict[i] for i in df_data[column_name]]

~/miniconda3/envs/DeepPurpose/lib/python3.7/site-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
   3846 else:
   3847     values = self.astype(object).values
-> 3848     mapped = lib.map_infer(values, f, convert=convert_dtype)
   3849
   3850 if len(mapped) and isinstance(mapped[0], Series):

pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()

~/projects/DeepPurpose/DeepPurpose/utils.py in smiles2mpnnfeature(smiles)
    265     assert atoms_completion_num >= 0 and bonds_completion_num >= 0
    266 except:
--> 267     raise Exception("Please increasing MAX_ATOM and MAX_BOND in line 24,25 utils.py and reinstall it. The current setting is for small molecule. ")
    268
    269

Exception: Please increasing MAX_ATOM and MAX_BOND in line 24,25 utils.py and reinstall it. The current setting is for small molecule.

futianfan commented 3 years ago

It works for me. I guess maybe you can re-install via 'python setup.py install' and try again?

xuzhang5788 commented 3 years ago

Sorry, I cloned the package again and tried it. It still didn't work for me.

xuzhang5788 commented 3 years ago

I saw that you changed it to MAX_BOND = MAX_ATOM * 2, so I tried again, and it finally works. Thanks!
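
For readers following along, the check behind this exception can be sketched roughly as below. This is only an illustration, not the code in utils.py: the helper name `padding_needed` is hypothetical, and it assumes each completion number is simply the limit minus the number of feature rows, which the thread does not show. The default of 400 is the value tried above, and MAX_BOND = MAX_ATOM * 2 is the setting mentioned in this comment.

```python
def padding_needed(n_atoms, n_bond_rows, max_atom=400, max_bond=None):
    """Return (atom_padding, bond_padding) or raise if the molecule is too large.

    Hypothetical helper, not part of DeepPurpose. It assumes the completion
    numbers are computed as "limit minus row count".
    """
    if max_bond is None:
        # Setting mentioned in the thread: MAX_BOND = MAX_ATOM * 2
        max_bond = max_atom * 2

    atoms_completion_num = max_atom - n_atoms
    bonds_completion_num = max_bond - n_bond_rows

    # Mirrors the assertion shown in the traceback: both must be non-negative.
    if atoms_completion_num < 0 or bonds_completion_num < 0:
        raise ValueError(
            "molecule exceeds the limits: "
            f"{n_atoms} atoms (max {max_atom}), "
            f"{n_bond_rows} bond rows (max {max_bond})"
        )
    return atoms_completion_num, bonds_completion_num


if __name__ == "__main__":
    # A small molecule fits comfortably within the limits.
    print(padding_needed(50, 110))
```

Under these assumptions, a molecule fails the check as soon as either its atom count or its bond-row count exceeds the corresponding limit, so raising only one of the two limits may not be enough.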

xuzhang5788 commented 3 years ago

I suggest you change folder name from ./save_folder/pretrained_model/ to ./save_folder/pretrained_models/ in order to match your demo files. Anyway, thanks!
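
Until the names match, a quick way to see which spelling is actually on disk is to look for config.pkl under both. This is a minimal sketch: the helper `find_pretrained_dir` is hypothetical (not part of DeepPurpose), and the directory layout is assumed from the paths quoted earlier in this thread.

```python
from pathlib import Path


def find_pretrained_dir(save_dir, model_name="model_MPNN_CNN"):
    """Return the first candidate directory holding <model_name>/config.pkl, else None.

    Hypothetical helper, not part of DeepPurpose. The layout
    <save_dir>/<pretrained_model|pretrained_models>/DeepPurpose_BindingDB/<model_name>/config.pkl
    is assumed from the paths quoted in this thread.
    """
    for folder in ("pretrained_model", "pretrained_models"):
        candidate = Path(save_dir) / folder / "DeepPurpose_BindingDB"
        if (candidate / model_name / "config.pkl").is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Prints the matching directory, or None if neither spelling exists.
    print(find_pretrained_dir("./save_folder"))
```

The returned path could then be passed as pretrained_dir, whichever spelling the installed version happens to use.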

kexinhuang12345 commented 3 years ago

Sounds good, will make the change right away.

xuzhang5788 commented 3 years ago

One more problem. I created a new folder for my own data, parallel to the DeepPurpose folder. If I run my code in this directory, I still get the error (Exception: Please increasing MAX_ATOM and MAX_BOND in line 24,25 utils.py and reinstall it. The current setting is for small molecule.)

But if I run it in the DeepPurpose directory, it works. Do I have to run my models from the DeepPurpose directory?

I found that it happens when I choose the MPNN embedding method.

kexinhuang12345 commented 3 years ago

Hi, I think this is because in another folder you are using the DeepPurpose version installed via pip, while in the DeepPurpose folder you are using the DeepPurpose source code. We just updated the pip version of DeepPurpose. The error should be fixed by running pip install DeepPurpose --upgrade.
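
To confirm which copy Python picks up from a given working directory, one option is to ask the import system where it would load the package from. This is a small sketch using only the standard library; the helper name `package_origin` is made up for illustration.

```python
import importlib.util


def package_origin(name):
    """Return the file Python would import `name` from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # A path under site-packages suggests the pip-installed version;
    # a path inside the cloned repository suggests the source checkout.
    print(package_origin("DeepPurpose"))
```

Running this once from the data folder and once from the cloned repository should show whether the two directories resolve to different copies of the package.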