Closed gayatripanda5 closed 4 years ago
Thanks for your interest in DeeplyTough. Have you checked out https://github.com/BenevolentAI/DeeplyTough/issues/1 ?
Thanks for your reply. I checked #1. It is similar to my case: no .npz files were found in this directory.
I executed the command python $DEEPLYTOUGH/deeplytough/scripts/custom_evaluation.py --dataset_subdir 'custom' --db_preprocessing 1 --output_dir $DEEPLYTOUGH/results --device 'cuda:0' --nworkers 4 --net $DEEPLYTOUGH/networks/deeplytough_toughm1_test.pth.tar
It gave these warnings:
11it [00:00, 1106.86it/s]
2020-08-11 13:57:56,381 - root - WARNING - HTMD featurization file not found: /home/iiitd/gayatrip/indigen_v2/final_pdbs/DeeplyTough-master/datasets/processed/htmd/custom/ind_pdbs/6I83_clean.npz, corresponding pdb likely could not be parsed
2020-08-11 13:57:56,381 - root - WARNING - HTMD featurization file not found: /home/iiitd/gayatrip/indigen_v2/final_pdbs/DeeplyTough-master/datasets/processed/htmd/custom/ind_pdbs/3GC9_clean.npz, corresponding pdb likely could not be parsed
...
along with this error:
AssertionError: No HTMD could be found but 11 PDB files were given, please call preprocess_once() on the dataset
No .npz files were formed.
Thanks for the details. However, in your screenshot I see no files or directories, i.e. no .pdb files in the ind_pdbs directory. Could you verify that the toy custom dataset distributed in this repository (https://github.com/BenevolentAI/DeeplyTough/tree/master/datasets/custom) works on your machine? And could you then follow the structure of that toy dataset for your own dataset?
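A quick way to sanity-check the dataset directory before running the script is sketched below. This is only an illustrative helper (`check_custom_dataset` is not part of DeeplyTough), and it assumes the toy layout of at least one .pdb structure plus a pairing .csv somewhere under the dataset directory:

```python
import os

def check_custom_dataset(dataset_dir):
    """Rough sanity check of a custom dataset directory.

    Assumes the layout of the toy dataset: at least one .pdb structure
    and a .csv file with pocket pairs somewhere under dataset_dir.
    Returns a list of problems; an empty list means the layout looks OK.
    """
    pdbs, csvs = [], []
    for _, _, files in os.walk(dataset_dir):
        pdbs += [f for f in files if f.endswith('.pdb')]
        csvs += [f for f in files if f.endswith('.csv')]
    problems = []
    if not pdbs:
        problems.append('no .pdb files found under ' + dataset_dir)
    if not csvs:
        problems.append('no pairing .csv found under ' + dataset_dir)
    return problems
```

Running this on the dataset directory before `custom_evaluation.py` makes an empty-directory mistake (as suspected above) obvious immediately.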
Thanks for your reply. After running this command on your toy dataset:
python $DEEPLYTOUGH/deeplytough/scripts/custom_evaluation.py --dataset_subdir 'custom' --output_dir $DEEPLYTOUGH/results --device 'cuda:0' --nworkers 4 --net $DEEPLYTOUGH/networks/deeplytough_toughm1_test.pth.tar
I am getting the same warning and error:
8it [00:00, 608.95it/s]
2020-08-12 17:31:43,328 - root - WARNING - HTMD featurization file not found: /home/iiitd/gayatrip/indigen_v2/final_pdbs/DeeplyTough-master/datasets/processed/htmd/custom/1a05B/1a05B.npz, corresponding pdb likely could not be parsed
AssertionError: No HTMD could be found but 8 PDB files were given, please call preprocess_once() on the dataset.
Now, coming to my dataset: I kept all my .pdb files in this directory (/dataset/custom/ind_pdbs). All _out files were created by your script, so it is clear that it has processed these files, but it then gave this error at the end.
Thanks. Maybe I misinterpret your screenshots, but it seems to me that incorporating your dataset within the datasets directory has somewhat corrupted it. Could you perhaps:

1. Delete the datasets directory completely and revert it to the state as in this GitHub repository, then run:
python $DEEPLYTOUGH/deeplytough/scripts/custom_evaluation.py --dataset_subdir 'custom' --output_dir $DEEPLYTOUGH/results --device 'cuda:0' --nworkers 4 --net $DEEPLYTOUGH/networks/deeplytough_toughm1_test.pth.tar
This should really succeed while printing a lot of output, including messages like "Pre-processing xxxx with HTMD...". If it's OK, continue:
2. Create datasets/your_dataset, put your pdbs as well as the modified .csv file there, then run:
python $DEEPLYTOUGH/deeplytough/scripts/custom_evaluation.py --dataset_subdir 'your_dataset' --output_dir $DEEPLYTOUGH/results --device 'cuda:0' --nworkers 4 --net $DEEPLYTOUGH/networks/deeplytough_toughm1_test.pth.tar
This should also print a lot of output, including messages like "Pre-processing xxxx with HTMD...".

Thanks a lot. I apologize for bugging you again. I followed what you said for your toy set; it gave a few warnings:
*** Open Babel Warning in parseAtomRecord
WARNING: Problems reading a PDB file
Problems reading a HETATM or ATOM record.
According to the PDB specification,
columns 77-78 should contain the element symbol of an atom.
but OpenBabel found ' ' (atom 2692)
1 molecule converted
Traceback (most recent call last):
File "/home/iiitd/miniconda3/envs/deeplytough_mgltools/MGLToolsPckgs/AutoDockTools/Utilities24/prepare_receptor4.py", line 10, in
IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!
Importing the multiarray numpy extension module failed. Most likely you are trying to import a failed build of numpy. Here is how to proceed:
git clean -xdf
(removes all files not under version control) and rebuild numpy. If you have already reinstalled and that did not fix the problem, then:
If (1) looks fine, you can open a new issue at https://github.com/numpy/numpy/issues. Please include details on:
Note: this error has many possible causes, so please don't comment on an existing issue about this - open a new one instead.
Original error was: /home/iiitd/.local/lib/python2.7/site-packages/numpy/core/_multiarray_umath.so: undefined symbol: PyUnicodeUCS4_FromObject
Then it ended with the same error:
2020-08-13 22:03:57,567 - root - WARNING - HTMD featurization file not found: /home/iiitd/gayatrip/indigen_v2/final_pdbs/DeeplyTough-master/datasets/processed/htmd/custom/1a9t/1a9t_clean.npz, corresponding pdb likely could not be parsed
Traceback (most recent call last):
File "/home/iiitd/gayatrip/indigen_v2/final_pdbs/DeeplyTough-master/deeplytough/scripts/custom_evaluation.py", line 69, in
Thanks, that's very helpful; the problem apparently is that MGLTools crashes. I would suggest two next steps:
1) Could you post here your $PYTHONPATH and $PATH, please? I'm a bit suspicious about the path in the stack trace containing a non-conda directory: "/home/iiitd/.local/lib/python2.7/site-packages/numpy/__init__.py"
2) The problem might be due to a new version of mgltools, which we haven't pinned. Could you perhaps run the following conda commands, then again delete the datasets/processed directory and run the custom_evaluation.py command?
conda remove --name deeplytough_mgltools --all
conda create -y -n deeplytough_mgltools python=2.7
conda install -y -n deeplytough_mgltools -c bioconda mgltools=1.5.6
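The "delete datasets/processed" step can also be scripted so the HTMD featurization is redone from scratch. A minimal sketch (assuming, as elsewhere in this thread, that $DEEPLYTOUGH points at the repository root; `clear_preprocessing_cache` is just an illustrative helper):

```python
import os
import shutil

def clear_preprocessing_cache(repo_root):
    """Delete <repo_root>/datasets/processed so featurization is redone."""
    processed = os.path.join(repo_root, 'datasets', 'processed')
    if os.path.isdir(processed):
        shutil.rmtree(processed)
        return True   # cache removed
    return False      # nothing cached yet

# Typical call, assuming $DEEPLYTOUGH points at the repository root:
clear_preprocessing_cache(os.environ.get('DEEPLYTOUGH', '.'))
```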
Please accept my apologies for this delayed response.
The paths are:
export PYTHONPATH=$DEEPLYTOUGH/deeplytough:$PYTHONPATH
export PATH=$DEEPLYTOUGH/fpocket2/bin:$PATH
I followed the steps you suggested and now everything seems fine. I ran this command "python $DEEPLYTOUGH/deeplytough/scripts/custom_evaluation.py --dataset_subdir 'custom' --output_dir $DEEPLYTOUGH/results --device 'cuda:0' --nworkers 4 --net $DEEPLYTOUGH/networks/deeplytough_toughm1_test.pth.tar" for your toy dataset and for my dataset too. It ran successfully. Big thanks to you.
That's terrific, I'm glad it works now, thanks for reporting the issue! I will fix the version of mgltools in the repository.
Thanks a lot for your help.
Dear Sir, I am grateful for your help in this matter. I needed your help in understanding what could be a threshold value for the pocket-similarity score, above which we could say that two pockets are more similar than others. For my set of inputs, I got this result: [image: image.png]
Can you help me understand these results? I apologize for bothering you so many times; I just wanted your help in analyzing the results. Thanks in advance.
Regards, Gayatri Panda
Hi, unfortunately the image has not been inserted correctly; can you try to edit your comment? In general, the similarity score allows comparing whether two pockets are more similar than other pairs (the score is simply higher). Choosing a particular threshold is not well defined; for a larger dataset you would perhaps plot a ROC curve and decide on an operating point as a balance between true and false positive rates.
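The ROC-based choice of an operating point can be sketched in plain Python. The scores and labels below are made up for illustration (not DeeplyTough output), and maximizing TPR minus FPR (Youden's J) is just one reasonable criterion among several:

```python
def pick_threshold(scores, labels):
    """Choose the score threshold maximizing TPR - FPR (Youden's J).

    scores: higher = more similar; labels: 1 = known similar pair, 0 = not.
    """
    pairs = sorted(zip(scores, labels), reverse=True)  # descending by score
    n_pos = sum(l for _, l in pairs)
    n_neg = len(pairs) - n_pos
    best_j, best_t = float('-inf'), None
    tp = fp = 0
    for score, label in pairs:
        tp += label
        fp += 1 - label
        j = tp / n_pos - fp / n_neg
        if j > best_j:
            best_j, best_t = j, score
    return best_t

# Toy example: similar pairs score higher (less negative) than dissimilar ones.
scores = [-0.5, -0.8, -1.0, -3.2, -4.1, -5.0]
labels = [1, 1, 1, 0, 0, 0]
print(pick_threshold(scores, labels))  # pairs scoring >= this would be called similar
```

On a real dataset with many pairs, you would sweep all thresholds like this (or use a library ROC routine) and pick the operating point matching your tolerance for false positives.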
Thanks a lot for your reply. The image, focusing only on the scores, is attached below. So can we say for now that a more negative pocket-similarity score means more similar?
I'm still unable to see the image. But scores are defined as negative distances, so the more negative, the less similar.
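A tiny illustration of that sign convention, with made-up descriptor vectors rather than network output: the score is the negative Euclidean distance between pocket descriptors, so values closer to zero mean more similar.

```python
import math

def score(desc_a, desc_b):
    """DeeplyTough-style score: negative Euclidean distance between
    pocket descriptors, so higher (closer to zero) = more similar."""
    d = math.sqrt(sum((a - b) ** 2 for a, b in zip(desc_a, desc_b)))
    return -d

p1 = [0.1, 0.2, 0.3]
p2 = [0.1, 0.2, 0.35]   # nearly identical descriptor
p3 = [2.0, -1.0, 0.0]   # very different descriptor

print(score(p1, p2) > score(p1, p3))  # the similar pair scores higher
```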
Sorry to be troublesome. I am grateful for your quick and elaborate responses. I couldn't figure out the issue with the image; anyway, I am attaching it below. However, I now have some idea of how to analyze the scores. Thanks a ton.
I was using DeeplyTough for a user-defined dataset. I followed the steps mentioned in the "Custom Dataset" section of your article and:
1. Added the path for the STRUCTURE_DATA_DIR environment variable in my bashrc file. For testing purposes, I took one pair of PDB structures, their pockets in .pdb format, and a csv file for their pairing. I kept all of this in the datasets/custom directory.
2. Executed "python $DEEPLYTOUGH/deeplytough/scripts/custom_evaluation.py --dataset_subdir 'custom' --db_preprocessing 1 --output_dir $DEEPLYTOUGH/results --device 'cuda:0' --nworkers 4 --net $DEEPLYTOUGH/networks/deeplytough_toughm1_test.pth.tar"
I am getting the following warning and error:
2020-08-11 11:42:54,118 - root - WARNING - HTMD featurization file not found: /home/iiitd/gayatrip/indigen_v2/final_pdbs/DeeplyTough-master/datasets/processed/htmd/custom/ind_pdbs/6I83_clean.npz, corresponding pdb likely could not be parsed
AssertionError: No HTMD could be found but 11 PDB files were given, please call preprocess_once() on the dataset.
Can you suggest where I am going wrong and what I can do to rectify this error?
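One common cause of "corresponding pdb likely could not be parsed" warnings is a mismatch between the names listed in the pairing .csv and the files actually on disk. A hedged cross-check sketch; the assumption that the first two columns of the csv name the paired structures is mine, so adapt the indexing to your actual csv layout:

```python
import csv
import os

def missing_structures(csv_path, dataset_dir):
    """Return csv entries with no matching .pdb file on disk.

    Assumes the first two columns of each row name the paired
    structures; adapt the row indexing to your csv layout.
    """
    on_disk = {os.path.splitext(f)[0]
               for _, _, files in os.walk(dataset_dir)
               for f in files if f.endswith('.pdb')}
    missing = set()
    with open(csv_path) as fh:
        for row in csv.reader(fh):
            for name in row[:2]:
                if name and name not in on_disk:
                    missing.add(name)
    return missing
```

An empty result means every structure referenced in the csv has a .pdb file present, so any remaining parse warnings point at the file contents rather than the layout.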