kiharalab / DOVE

A Deep-learning based dOcking decoy eValuation mEthod
GNU General Public License v3.0

pdb file specification #2

quantitative-technologies closed 4 years ago

quantitative-technologies commented 4 years ago

Apologies if this is not a real issue, but I am new to this field.

I am seeing inconsistent formats between various pdb files, and my pdb files are not working with your software.

In the example decoy which works with your code, the pdb file looks like this:

MODEL        1
ATOM      1  N   ASP A   1      24.234  -0.377  -1.123  1.00 37.11      D    N  
ATOM      2  CA  ASP A   1      25.525  -0.546  -1.775  1.00 34.72      D    C  
ATOM      3  C   ASP A   1      26.183   0.770  -1.427  1.00 29.50      D    C  

On the other hand, when I tried to generate the decoys for the ZDOCK benchmark, as in your paper, they look like:

ATOM      1  N   GLY A   2      25.503   1.260  58.635  5     1 1.63         -0.15
ATOM      2  CA  GLY A   2      25.630   2.325  57.666  5     1 1.99          0.10
ATOM      3  C   GLY A   2      26.653   3.346  58.106  5     1 1.67          0.60

and when I run your code on these PDB files I get the error IndexError: too many indices for array.

When I then load the pdb in pymol and export it, I get:

ATOM      1  N   GLY A   2      25.503   1.260  58.635  5.00  1.00           N  
ATOM      2  CA  GLY A   2      25.630   2.325  57.666  5.00  1.00           C  
ATOM      3  C   GLY A   2      26.653   3.346  58.106  5.00  1.00           C  

which seems to conform to the pdb standard that I looked up, but again it is rejected by your code.

Are there different specifications of the pdb file format, and can you explain which one your software uses? Are there standard tools for switching between formats?
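For reference, the PDB format is fixed-column rather than whitespace-delimited, so two files can look similar and still differ in the fields a parser reads. The sketch below splits one ATOM record by column position, following the wwPDB layout (chain ID in column 22, coordinates in columns 31 to 54). The function name parse_atom_line is hypothetical and is not part of DOVE. The three excerpts above appear to agree through column 54 and differ only in the fields after the coordinates.

```python
def parse_atom_line(line):
    """Split one PDB ATOM/HETATM record into fields by fixed column position."""
    if not line.startswith(("ATOM", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    return {
        "serial": int(line[6:11]),        # columns 7-11
        "name": line[12:16].strip(),      # columns 13-16
        "res_name": line[17:20].strip(),  # columns 18-20
        "chain_id": line[21],             # column 22
        "res_seq": int(line[22:26]),      # columns 23-26
        "x": float(line[30:38]),          # columns 31-38
        "y": float(line[38:46]),          # columns 39-46
        "z": float(line[46:54]),          # columns 47-54
        # Everything after the coordinates; this is where the excerpts differ.
        "trailing": line[54:].rstrip(),
    }


# First record of the example decoy quoted above.
example = "ATOM      1  N   ASP A   1      24.234  -0.377  -1.123  1.00 37.11      D    N  "
atom = parse_atom_line(example)
print(atom["chain_id"], atom["res_name"], atom["x"], atom["y"], atom["z"])
```

Reading by column position rather than with str.split() matters because adjacent fields can run together without a separating space, for example when a coordinate fills its whole eight-character field.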

Thank you in advance.

wang3702 commented 4 years ago

Sorry for the late reply! I did not receive an email notification for this issue. I simply follow the PDB format used in the ZDOCK benchmark. There is an example at https://github.com/kiharalab/DOVE/tree/master/Example/Decoys. Please use the same format as in that example.

quantitative-technologies commented 4 years ago

Thank you for getting back to me.

Yes, I think I got past the format issue but was still getting this error when I tried to generate the decoys for the ZDOCK benchmark.

I will check again to see where the difficulty occurred and try to get back later this week, with a specific description.

quantitative-technologies commented 4 years ago

Your example decoys have some different columns than the ZDOCK benchmark decoys I generated. However, I made some effort to step through the code and I don't think it is a format issue.

These are the steps I did to create the pdb decoy file:

1) Download the benchmark.
2) Put the inputs 1A2K_l_u.pdb.ms and 1A2K_r_u.pdb.ms, together with the prediction output 1A2K.zd3.0.2.fg.out, in a directory, and run create.pl 1A2K.zd3.0.2.fg.out.
3) This created 54,000 decoy PDB files. I tried your code on the generated file complex.1.pdb, which gave the following output:

/home/james/projects/docking/decoys_bm4_zd3.0.2_6deg/test3/complex.1888 created
waiting dealing1
1 888_goap.pdb -62079.40 -42359.53 -19719.87
1 complex.888.pdb -62079.40 -42359.53 -19719.87
in total, we have 420 residues in receptor, 0 residues in ligand
in the interface 10A cut off, we have 0 residue, 0 atoms in the receptor
in the interface 10A cut off, we have 0 residue, 0 atoms in the ligand
Traceback (most recent call last):
  File "main.py", line 73, in <module>
    record,input_path=run_pred_single(file_path,random_id,gpu_id)
  File "main.py", line 18, in run_pred_single
    input_path=Prepare_Input_Singe(file_path, random_id)
  File "/home/james/projects/docking/DOVE/Input_Preparing/Prepare_Input_Single.py", line 45, in Prepare_Input_Singe
    tempload, rlength, llength = reform_input(rlist, llist, 1)
  File "/home/james/projects/docking/DOVE/data_processing/prepare_input.py", line 286, in reform_input
    xmean=np.mean(coordinate[:,0])
IndexError: too many indices for array

Can you please tell me what I did wrong? Here is the generated ZDOCK benchmark file complex.1.pdb.

wang3702 commented 4 years ago

This is described in the README. In the input PDB file, assign chain ID "A" to the receptor and chain ID "B" to the ligand. DOVE can then distinguish the receptor from the ligand and score the complex structure.
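The log above reports 0 residues in the ligand, which is consistent with the file containing no chain "B". A quick pre-check along the following lines could catch that before running DOVE. This is only a sketch: the function names are hypothetical, it is not DOVE's own parsing code, and it assumes the chain ID sits in column 22 as in the standard PDB layout.

```python
from collections import defaultdict


def residues_per_chain(pdb_lines):
    """Count distinct residues for each chain ID found in ATOM records."""
    seen = defaultdict(set)
    for line in pdb_lines:
        if line.startswith("ATOM") and len(line) >= 26:
            chain_id = line[21]        # column 22
            residue_key = line[22:27]  # residue number plus insertion code
            seen[chain_id].add(residue_key)
    return {chain: len(residues) for chain, residues in seen.items()}


def check_receptor_ligand_chains(pdb_lines):
    """Return a list of problems; an empty list means chains A and B both have residues."""
    counts = residues_per_chain(pdb_lines)
    problems = []
    for chain, role in (("A", "receptor"), ("B", "ligand")):
        if counts.get(chain, 0) == 0:
            problems.append(f"no residues found for chain {chain!r} ({role})")
    return problems
```

Passing the lines of a decoy file to check_receptor_ligand_chains returns an empty list when both chains are present, and otherwise names the chain that is missing.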

wang3702 commented 4 years ago

If there are no further questions, I will close the issue.

quantitative-technologies commented 4 years ago

Yes, I tried setting the chain IDs accordingly. It seems to work now. Thanks.
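For anyone hitting the same error, here is one way the chain IDs could be set, sketched under assumptions: if the receptor and ligand coordinates are available as separate lists of PDB lines, column 22 of each ATOM/HETATM record can be overwritten before the two are joined. The helper names are hypothetical, and the thread does not say how the IDs were actually set.

```python
def set_chain_id(pdb_lines, chain_id):
    """Return copies of the lines with column 22 of ATOM/HETATM records set to chain_id."""
    if len(chain_id) != 1:
        raise ValueError("a PDB chain ID is a single character")
    relabeled = []
    for line in pdb_lines:
        line = line.rstrip("\n")
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            # Column 22 is index 21; the rest of the record is left untouched.
            line = line[:21] + chain_id + line[22:]
        relabeled.append(line)
    return relabeled


def coordinate_records(pdb_lines):
    """Keep only ATOM/HETATM records, dropping END, TER, and header lines."""
    return [line for line in pdb_lines if line.startswith(("ATOM", "HETATM"))]


def merge_as_complex(receptor_lines, ligand_lines):
    """Label the receptor as chain A and the ligand as chain B, then join them."""
    receptor = set_chain_id(coordinate_records(receptor_lines), "A")
    ligand = set_chain_id(coordinate_records(ligand_lines), "B")
    return receptor + ligand
```

The merged lines can then be written to a single file, one record per line. Because only one character is replaced, the fixed-column layout of every other field is preserved.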