BioinfoMachineLearning / DeepInteract

A geometric deep learning pipeline for predicting protein interface contacts. (ICLR 2022)
https://zenodo.org/record/6671582
GNU General Public License v3.0

[BUG?] List index out of range #2

Closed gabrielepozzati closed 2 years ago

gabrielepozzati commented 2 years ago

When I run the following command:

```shell
python3 docker/run_docker.py \
  --left_pdb_filepath /storage/DeepInteract/project/test_data/4heq_l_u.pdb \
  --right_pdb_filepath /storage/DeepInteract/project/test_data/4heq_r_u.pdb \
  --input_dataset_dir /storage/DeepInteract/project/datasets/Input \
  --ckpt_name /storage/DeepInteract/project/checkpoints/LitGINI-GeoTran-DilResNet.ckpt \
  --hhsuite_db /storage/databases/Uniclust30/UniRef30_2021_06 \
  --num_gpus 1
```

I get the following log, terminating in a "list index out of range" error and producing no output:

```
I1026 17:17:24.445479 140490422077248 run_docker.py:59] Mounting /storage/DeepInteract/project/test_data -> /mnt/input_pdbs
I1026 17:17:24.445564 140490422077248 run_docker.py:59] Mounting /storage/DeepInteract/project/test_data -> /mnt/input_pdbs
I1026 17:17:24.445607 140490422077248 run_docker.py:59] Mounting /storage/DeepInteract/project/datasets/Input -> /mnt/Input
I1026 17:17:24.445646 140490422077248 run_docker.py:59] Mounting /storage/DeepInteract/project/checkpoints -> /mnt/checkpoints
I1026 17:17:24.445684 140490422077248 run_docker.py:59] Mounting /storage/databases/Uniclust30 -> /mnt/hhsuite_db
I1026 17:17:26.138480 140490422077248 run_docker.py:135] DGL backend not selected or invalid. Assuming PyTorch for now.
I1026 17:17:26.138590 140490422077248 run_docker.py:135] Using backend: pytorch
I1026 17:17:26.141283 140490422077248 run_docker.py:135] I1026 15:17:26.141029 140696250648384 deepinteract_utils.py:1030] Seeding everything with random seed 42
I1026 17:17:26.141357 140490422077248 run_docker.py:135] Global seed set to 42
I1026 17:17:26.177383 140490422077248 run_docker.py:135] I1026 15:17:26.177001 140696250648384 deepinteract_utils.py:587] Making interim data set from raw data
I1026 17:17:26.178824 140490422077248 run_docker.py:135] I1026 15:17:26.178652 140696250648384 parse.py:43] 4 requested keys, 4 produced keys, 0 work keys
I1026 17:17:26.178916 140490422077248 run_docker.py:135] W1026 15:17:26.178736 140696250648384 complex.py:36] Complex file /mnt/Input/interim/complexes/complexes.dill already exists!
I1026 17:17:26.179392 140490422077248 run_docker.py:135] I1026 15:17:26.179221 140696250648384 pair.py:79] 0 requested keys, 0 produced keys, 0 work keys
I1026 17:17:26.179549 140490422077248 run_docker.py:135] I1026 15:17:26.179284 140696250648384 deepinteract_utils.py:608] Generating PSAIA features from PDB files in /mnt/Input/interim/parsed
I1026 17:17:26.179922 140490422077248 run_docker.py:135] I1026 15:17:26.179797 140696250648384 conservation.py:361] 0 PDB files to process with PSAIA
I1026 17:17:26.181284 140490422077248 run_docker.py:135] I1026 15:17:26.179910 140696250648384 parallel.py:46] Processing 1 inputs.
I1026 17:17:26.181358 140490422077248 run_docker.py:135] I1026 15:17:26.181147 140696250648384 parallel.py:62] Sequential Mode.
I1026 17:17:26.181491 140490422077248 run_docker.py:135] I1026 15:17:26.181194 140696250648384 conservation.py:43] PSAIA'ing /mnt/Input/interim/external_feats/PSAIA/INPUT/pdb_list.fls
I1026 17:17:26.199129 140490422077248 run_docker.py:135] I1026 15:17:26.198776 140696250648384 conservation.py:200] For generating protrusion indices, spent 00.02 PSAIA'ing, 00.00 writing, and 00.02 overall.
I1026 17:17:26.199361 140490422077248 run_docker.py:135] I1026 15:17:26.198991 140696250648384 deepinteract_utils.py:625] Generating profile HMM features from PDB files in /mnt/Input/interim/parsed
I1026 17:17:26.199785 140490422077248 run_docker.py:135] I1026 15:17:26.199542 140696250648384 conservation.py:458] 4 requested keys, 4 produced keys, 0 work filenames
I1026 17:17:26.199849 140490422077248 run_docker.py:135] I1026 15:17:26.199590 140696250648384 conservation.py:464] 0 work filenames
I1026 17:17:26.199900 140490422077248 run_docker.py:135] I1026 15:17:26.199645 140696250648384 deepinteract_utils.py:640] Starting postprocessing for all unprocessed pairs in /mnt/Input/interim/pairs
I1026 17:17:26.199948 140490422077248 run_docker.py:135] I1026 15:17:26.199685 140696250648384 deepinteract_utils.py:647] Looking for all pairs in /mnt/Input/interim/pairs
I1026 17:17:26.200107 140490422077248 run_docker.py:135] Setting the default backend to "pytorch". You can change it in the ~/.dgl/config.json file or export the DGLBACKEND environment variable. Valid options are: pytorch, mxnet, tensorflow (all lowercase)
I1026 17:17:26.200161 140490422077248 run_docker.py:135] I1026 15:17:26.199843 140696250648384 deepinteract_utils.py:660] Found 0 work pair(s) in /mnt/Input/interim/pairs
I1026 17:17:26.200797 140490422077248 run_docker.py:135] Traceback (most recent call last):
I1026 17:17:26.200864 140490422077248 run_docker.py:135]   File "/app/DeepInteract/project/lit_model_predict_docker.py", line 326, in <module>
I1026 17:17:26.200918 140490422077248 run_docker.py:135]     app.run(main)
I1026 17:17:26.200968 140490422077248 run_docker.py:135]   File "/opt/conda/lib/python3.8/site-packages/absl/app.py", line 312, in run
I1026 17:17:26.201017 140490422077248 run_docker.py:135]     _run_main(main, args)
I1026 17:17:26.201066 140490422077248 run_docker.py:135]   File "/opt/conda/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
I1026 17:17:26.201114 140490422077248 run_docker.py:135]     sys.exit(main(argv))
I1026 17:17:26.201161 140490422077248 run_docker.py:135]   File "/app/DeepInteract/project/lit_model_predict_docker.py", line 199, in main
I1026 17:17:26.201208 140490422077248 run_docker.py:135]     input_dataset = InputDataset(left_pdb_filepath=FLAGS.left_pdb_filepath,
I1026 17:17:26.201254 140490422077248 run_docker.py:135]   File "/app/DeepInteract/project/lit_model_predict_docker.py", line 95, in __init__
I1026 17:17:26.201300 140490422077248 run_docker.py:135]     super(InputDataset, self).__init__(name='InputDataset',
I1026 17:17:26.201347 140490422077248 run_docker.py:135]   File "/opt/conda/lib/python3.8/site-packages/dgl/data/dgl_dataset.py", line 94, in __init__
I1026 17:17:26.201393 140490422077248 run_docker.py:135]     self._load()
I1026 17:17:26.201438 140490422077248 run_docker.py:135]   File "/opt/conda/lib/python3.8/site-packages/dgl/data/dgl_dataset.py", line 179, in _load
I1026 17:17:26.201483 140490422077248 run_docker.py:135]     self.process()
I1026 17:17:26.201529 140490422077248 run_docker.py:135]   File "/app/DeepInteract/project/lit_model_predict_docker.py", line 109, in process
I1026 17:17:26.201575 140490422077248 run_docker.py:135]     left_complex_graph, right_complex_graph = process_pdb_into_graph(self.left_pdb_filepath,
I1026 17:17:26.201622 140490422077248 run_docker.py:135]   File "/app/DeepInteract/project/utils/deepinteract_utils.py", line 741, in process_pdb_into_graph
I1026 17:17:26.201667 140490422077248 run_docker.py:135]     input_pair = convert_input_pdb_files_to_pair(left_pdb_filepath, right_pdb_filepath,
I1026 17:17:26.201713 140490422077248 run_docker.py:135]   File "/app/DeepInteract/project/utils/deepinteract_utils.py", line 725, in convert_input_pdb_files_to_pair
I1026 17:17:26.201758 140490422077248 run_docker.py:135]     pair_filepath = launch_postprocessing_of_pruned_pairs(
I1026 17:17:26.201883 140490422077248 run_docker.py:135] IndexError: list index out of range
```

amorehead commented 2 years ago

@gabrielepozzati,

Thank you for expressing interest in using our deep learning pipeline! I have inspected the error you are seeing, and I believe I have figured out where the issue (on our end) is. I have just now merged a pull request to address the bug.

When you next have the opportunity, would you be able to pull the latest changes into your local master branch, rebuild your local Docker image using the same README.md commands, and try making your predictions once more?

gabrielepozzati commented 2 years ago

Sure! I still get a "list index out of range" error, but it is now raised on a different line:

```
I1028 09:44:02.906656 140062676698944 run_docker.py:135] Traceback (most recent call last):
I1028 09:44:02.906729 140062676698944 run_docker.py:135]   File "/app/DeepInteract/project/lit_model_predict_docker.py", line 326, in <module>
I1028 09:44:02.906780 140062676698944 run_docker.py:135]     app.run(main)
I1028 09:44:02.906829 140062676698944 run_docker.py:135]   File "/opt/conda/lib/python3.8/site-packages/absl/app.py", line 312, in run
I1028 09:44:02.906878 140062676698944 run_docker.py:135]     _run_main(main, args)
I1028 09:44:02.906927 140062676698944 run_docker.py:135]   File "/opt/conda/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
I1028 09:44:02.906975 140062676698944 run_docker.py:135]     sys.exit(main(argv))
I1028 09:44:02.907022 140062676698944 run_docker.py:135]   File "/app/DeepInteract/project/lit_model_predict_docker.py", line 199, in main
I1028 09:44:02.907069 140062676698944 run_docker.py:135]     input_dataset = InputDataset(left_pdb_filepath=FLAGS.left_pdb_filepath,
I1028 09:44:02.907115 140062676698944 run_docker.py:135]   File "/app/DeepInteract/project/lit_model_predict_docker.py", line 95, in __init__
I1028 09:44:02.907162 140062676698944 run_docker.py:135]     super(InputDataset, self).__init__(name='InputDataset',
I1028 09:44:02.907209 140062676698944 run_docker.py:135]   File "/opt/conda/lib/python3.8/site-packages/dgl/data/dgl_dataset.py", line 94, in __init__
I1028 09:44:02.907257 140062676698944 run_docker.py:135]     self._load()
I1028 09:44:02.907303 140062676698944 run_docker.py:135]   File "/opt/conda/lib/python3.8/site-packages/dgl/data/dgl_dataset.py", line 179, in _load
I1028 09:44:02.907349 140062676698944 run_docker.py:135]     self.process()
I1028 09:44:02.907396 140062676698944 run_docker.py:135]   File "/app/DeepInteract/project/lit_model_predict_docker.py", line 109, in process
I1028 09:44:02.907441 140062676698944 run_docker.py:135]     left_complex_graph, right_complex_graph = process_pdb_into_graph(self.left_pdb_filepath,
I1028 09:44:02.907486 140062676698944 run_docker.py:135]   File "/app/DeepInteract/project/utils/deepinteract_utils.py", line 755, in process_pdb_into_graph
I1028 09:44:02.907531 140062676698944 run_docker.py:135]     input_pair = convert_input_pdb_files_to_pair(left_pdb_filepath, right_pdb_filepath,
I1028 09:44:02.907577 140062676698944 run_docker.py:135]   File "/app/DeepInteract/project/utils/deepinteract_utils.py", line 740, in convert_input_pdb_files_to_pair
I1028 09:44:02.907621 140062676698944 run_docker.py:135]     pdb_filename = [os.path.join(pruned_pairs_dir, db.get_pdb_code(key)[1:3], key)
I1028 09:44:02.907667 140062676698944 run_docker.py:135] IndexError: list index out of range
```
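For context, the traceback ends at a list comprehension whose first element is taken immediately afterward. The sketch below (hypothetical filenames and directory, not DeepInteract's actual code) shows how a substring filter that matches no keys produces exactly this error:

```python
import os

# Hypothetical work keys; the real pipeline derives these from the input PDB filenames.
keys = ["4heq.pdb"]  # note: no 'l_u' or 'r_u' substring present
pruned_pairs_dir = "/tmp/pairs-pruned"  # illustrative path only

# Filtering for a substring that no key contains yields an empty list...
matches = [os.path.join(pruned_pairs_dir, key) for key in keys if "l_u" in key]

try:
    first_match = matches[0]  # ...so taking the first element raises IndexError
except IndexError as err:
    print(f"IndexError: {err}")  # -> IndexError: list index out of range
```

In other words, the error message points at the indexing expression, but the root cause is that the naming convention assumed by the filter was not met.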

amorehead commented 2 years ago

@gabrielepozzati,

Thank you for letting me know that the issue is still present. It is odd that I cannot reproduce the exact error you are seeing on my local machine; however, I believe my latest change to the function where the error is raised addresses it.

Since I cannot test my bugfix directly on my own hardware, may I ask you to pull these changes down into master, rebuild the Docker image, and then re-run your predictions? Thank you for your help in debugging this. It's great to know you and others are catching errors and edge-case oversights I was not originally aware of.

amorehead commented 2 years ago

@gabrielepozzati,

I was finally able to reproduce your error. I will begin looking into it to see if I can figure out what is causing it.

amorehead commented 2 years ago

@gabrielepozzati,

After testing my fix on my local hardware, I believe your issue has been addressed. It seems the issue was caused by the input PDB files not containing either an 'l_u' or an 'r_u' substring. My fix should now allow users to input PDB files with or without these substrings. I also fixed another issue where the feature imputation pipeline would not be triggered prior to making a prediction.

If you could test these changes in your local environment and let me know if they work on your end (at your convenience, of course), I would greatly appreciate it. Thank you once again for pointing out this original issue.
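A defensive pattern for this kind of fix is to treat the 'l_u'/'r_u' naming convention as optional rather than required. A minimal sketch (hypothetical helper, not the actual DeepInteract code):

```python
import os

def chain_role_from_filename(pdb_filepath: str) -> str:
    """Classify a PDB file as 'left' or 'right' when the conventional
    'l_u'/'r_u' substrings are present; otherwise report 'unknown'
    instead of assuming the convention and failing later."""
    name = os.path.basename(pdb_filepath)
    if "l_u" in name:
        return "left"
    if "r_u" in name:
        return "right"
    return "unknown"

print(chain_role_from_filename("4heq_l_u.pdb"))  # -> left
print(chain_role_from_filename("4heq_r_u.pdb"))  # -> right
print(chain_role_from_filename("4heq.pdb"))      # -> unknown
```

The key point is that the "unknown" branch gives the caller a chance to fall back to a default pairing instead of producing an empty candidate list.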

gabrielepozzati commented 2 years ago

Now it works, good job! I still have problems using the GPU (PyTorch does not seem to find it, even though the nvidia-smi check in the README works), but I will open another issue if I cannot solve it. In the meantime, thanks!