BenevolentAI / DeeplyTough

DeeplyTough: Learning Structural Comparison of Protein Binding Sites

Custom dataset evaluation error #1

Closed Abdelmonsif closed 4 years ago

Abdelmonsif commented 4 years ago

I tried to run the scripts for the Vertex, TOUGH-M1 and ProSPECCTs datasets; they run perfectly and I could reproduce the same AUC values reported in your paper. Now I tried to run the custom-dataset script and I get this error:

'No HTMD could be found but {} PDB files were given, please call preprocess_once() on the dataset'.format(len(pdb_list))
AssertionError: No HTMD could be found but 8 PDB files were given, please call preprocess_once() on the dataset

I looked into the script and found that the entries were fine and the npz paths were identified and created, but nothing was actually written there (neither the numpy matrices nor the receptor pdbqt files).
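
As far as I can tell from reading it, the dataset keeps only the entries whose featurization .npz already exists on disk and asserts if none are left, roughly like this (my own sketch of the check in src/engine/datasets.py, not the actual code):

```python
import os

def keep_featurized(pdb_list, htmd_root):
    """Rough sketch of the filtering I understand datasets.py to do:
    keep only PDB entries whose HTMD featurization (.npz) is on disk."""
    kept = []
    for pdb_path in pdb_list:
        code = os.path.splitext(os.path.basename(pdb_path))[0]
        npz_path = os.path.join(htmd_root, code, code + '.npz')
        if os.path.exists(npz_path):
            kept.append(pdb_path)
    assert len(kept) > 0, (
        'No HTMD could be found but {} PDB files were given, '
        'please call preprocess_once() on the dataset'.format(len(pdb_list)))
    return kept
```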

Can you help me with that??

Thanks

mys007 commented 4 years ago

Hi, thanks for reporting your issue. A couple of questions:

Abdelmonsif commented 4 years ago

Yes, sir. I got "HTMD featurization file not found" warnings. I am first trying to run the script on the two PDB files provided with the custom dataset.

2020-01-10 12:00:57,738 - root - WARNING - HTMD featurization file not found: /home/monsif/deeplytough/datasets/processed/htmd/custom/1a05B/1a05B.npz, corresponding pdb likely could not have been parsed
2020-01-10 12:00:57,739 - root - WARNING - HTMD featurization file not found: /home/monsif/deeplytough/datasets/processed/htmd/custom/1a9t/1a9t_clean.npz, corresponding pdb likely could not have been parsed
2020-01-10 12:00:57,739 - root - WARNING - HTMD featurization file not found: /home/monsif/deeplytough/datasets/processed/htmd/custom/1a05B/1a05B.npz, corresponding pdb likely could not have been parsed
2020-01-10 12:00:57,739 - root - WARNING - HTMD featurization file not found: /home/monsif/deeplytough/datasets/processed/htmd/custom/1a05B/1a05B.npz, corresponding pdb likely could not have been parsed
2020-01-10 12:00:57,739 - root - WARNING - HTMD featurization file not found: /home/monsif/deeplytough/datasets/processed/htmd/custom/1a05B/1a05B.npz, corresponding pdb likely could not have been parsed
2020-01-10 12:00:57,739 - root - WARNING - HTMD featurization file not found: /home/monsif/deeplytough/datasets/processed/htmd/custom/1a05B/1a05B.npz, corresponding pdb likely could not have been parsed
2020-01-10 12:00:57,740 - root - WARNING - HTMD featurization file not found: /home/monsif/deeplytough/datasets/processed/htmd/custom/1a05B/1a05B.npz, corresponding pdb likely could not have been parsed
2020-01-10 12:00:57,740 - root - WARNING - HTMD featurization file not found: /home/monsif/deeplytough/datasets/processed/htmd/custom/1a05B/1a05B.npz, corresponding pdb likely could not have been parsed
Traceback (most recent call last):
  File "/home/monsif/deeplytough/src/scripts/custom_evaluation.py", line 53, in <module>
    main()
  File "/home/monsif/deeplytough/src/scripts/custom_evaluation.py", line 36, in main
    entries = matcher.precompute_descriptors(entries)
  File "/home/monsif/deeplytough/src/matchers/deeply_tough.py", line 46, in precompute_descriptors
    feats = load_and_precompute_point_feats(self.model, self.args, pdb_list, point_list, self.device, self.nworkers, self.batch_size)
  File "/home/monsif/deeplytough/src/engine/predictor.py", line 30, in load_and_precompute_point_feats
    dataset = PointOfInterestVoxelizedDataset(pdb_list, point_list, box_size=args.patch_size)
  File "/home/monsif/deeplytough/src/engine/datasets.py", line 207, in __init__
    super().__init__(pdb_list, box_size=box_size, augm_rot=False, augm_mirror_prob=0)
  File "/home/monsif/deeplytough/src/engine/datasets.py", line 42, in __init__
    assert len(self.pdb_list) > 0, 'No HTMD could be found but {} PDB files were given, please call preprocess_once() on the dataset'.format(len(pdb_list))
AssertionError: No HTMD could be found but 8 PDB files were given, please call preprocess_once() on the dataset

This is the whole console output.

mys007 commented 4 years ago

Thanks for the log, though are you sure there are no npz files? I was expecting some output generated in https://github.com/BenevolentAI/DeeplyTough/blob/500d36a967d5c5ccb283c353ce5605a16083848a/src/misc/utils.py#L113 ...
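
Just as a quick check, something like this (adapt the path to your machine) should show which structures actually got a featurization file and which did not:

```python
import glob
import os

# Quick diagnostic: list each per-protein directory under the processed
# htmd folder and report whether an .npz featurization file was written.
root = os.path.expanduser('~/deeplytough/datasets/processed/htmd/custom')
for d in sorted(glob.glob(os.path.join(root, '*'))):
    npz = glob.glob(os.path.join(d, '*.npz'))
    print(os.path.basename(d), '->', npz[0] if npz else 'no npz file')
```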

Abdelmonsif commented 4 years ago

Yes, it did not create any npz files. It just created a directory called htmd under the processed directory, containing a subdirectory for each protein, but with no files inside them.

Abdelmonsif commented 4 years ago

Sir, I think I figured out the issue. After I created the deeplytough environment, I had removed my Anaconda path from $PATH. The ready-made datasets still worked because they do not require switching environments, while the custom dataset does. Thank you so much for helping; it is working and running now.
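
In case it helps anyone else, this is the kind of sanity check I ran to confirm that the executables the preprocessing relies on are still visible after editing $PATH (the tool names below are just examples from my setup, substitute your own):

```python
import shutil

# My own sanity check, not part of DeeplyTough: verify that the external
# tools the preprocessing step depends on can still be found on $PATH.
for tool in ('conda', 'python'):  # example names only, adjust to your setup
    location = shutil.which(tool)
    print('{:8s} -> {}'.format(tool, location or 'NOT FOUND on $PATH'))
```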

Abdelmonsif commented 4 years ago

A new issue appeared now:

2020-01-13 11:00:27,103 - matchers.deeply_tough - WARNING - Pocket not found, skipping: 1fdsA00.EST
2020-01-13 11:00:27,103 - matchers.deeply_tough - WARNING - Pocket not found, skipping: 1ecmA00.TSA

I thought it was a path problem, so I added the absolute path to the pairs list, but it still gave me:

Traceback (most recent call last):
  File "/home/monsif/deeplytough/src/scripts/custom_evaluation.py", line 53, in <module>
    main()
  File "/home/monsif/deeplytough/src/scripts/custom_evaluation.py", line 36, in main
    entries = matcher.precompute_descriptors(entries)
  File "/home/monsif/deeplytough/src/matchers/deeply_tough.py", line 46, in precompute_descriptors
    feats = load_and_precompute_point_feats(self.model, self.args, pdb_list, point_list, self.device, self.nworkers, self.batch_size)
  File "/home/monsif/deeplytough/src/engine/predictor.py", line 30, in load_and_precompute_point_feats
    dataset = PointOfInterestVoxelizedDataset(pdb_list, point_list, box_size=args.patch_size)
  File "/home/monsif/deeplytough/src/engine/datasets.py", line 207, in __init__
    super().__init__(pdb_list, box_size=box_size, augm_rot=False, augm_mirror_prob=0)
  File "/home/monsif/deeplytough/src/engine/datasets.py", line 42, in __init__
    assert len(self.pdb_list) > 0, 'No HTMD could be found but {} PDB files were given, please call preprocess_once() on the dataset'.format(len(pdb_list))
AssertionError: No HTMD could be found but 0 PDB files were given, please call preprocess_once() on the dataset

mys007 commented 4 years ago

Hi, could you perhaps share the list and the PDB files so that I can run it on my end?

Abdelmonsif commented 4 years ago

It is working now after I added the full path for the list. Thank you so much!
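
For anyone hitting the same thing, I now resolve the list path to an absolute one before running anything, roughly like this (my own helper with a hypothetical filename, not part of the DeeplyTough scripts):

```python
import os

def resolve_list_path(path):
    """My own helper, not part of DeeplyTough: expand and absolutize the
    pairs-list path so it resolves regardless of the working directory."""
    path = os.path.abspath(os.path.expanduser(path))
    if not os.path.isfile(path):
        raise FileNotFoundError('pairs list not found: {}'.format(path))
    return path

# Example usage with a hypothetical filename:
# print(resolve_list_path('~/deeplytough/datasets/custom/custom_pairs.csv'))
```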

mys007 commented 4 years ago

Great!