Closed dadaix closed 2 years ago
Are you able to share the PDB for the nucleic acid structure that caused the problem?
I cannot share the molecule I am working on currently. However, I also try to run it on an already published structure (PDB ID 6sx6) and I get the same error.
In case I use --ff=PARSE I get the same error but for DA 5 resiude.
I've seen this occur as well but on AMBER waters. The bug (at least in my case) is that AMBER defines waters as:
HETATM 3415 OW WAT A 809 17.388 20.916 27.945 1.00 29.88 1 O
HETATM 3416 HW WAT A 809 17.570 21.290 28.854 1.00 0.00 1 H
HETATM 3417 HW WAT A 809 18.115 21.249 27.345 1.00 0.00 1 H
See how the two hydrogen atoms have the same name.
pdb2pqr
seems to have an issue when creating the template DefinitionResidue
where if it has two atoms in the definition file with the same name it will overwrite the first atom:
<residue>
<name>WAT</name>
<atom>
<name>O</name>
<altname>OH2</altname>
<altname>OW</altname>
<altname>OH</altname>
<x>3.240</x>
<y>55.829</y>
<z>19.243</z>
<bond>H1</bond>
<bond>H2</bond>
</atom>
<atom>
<name>H1</name>
<altname>HW</altname>
<altname>HH1</altname>
<altname>1H</altname>
<x>2.865</x>
<y>56.756</y>
<z>19.243</z>
<bond>O</bond>
</atom>
<atom>
<name>H2</name>
<altname>HW</altname>
<altname>HH2</altname>
<altname>2H</altname>
<x>4.240</x>
<y>55.829</y>
<z>19.243</z>
<bond>O</bond>
</atom>
</residue>
results in this object
WAT
Atoms:
O: 3.240 55.829 19.243 H1 H2
H1: 2.865 56.756 19.243 O
H2: 4.240 55.829 19.243 O
Dihedrals:
Alternate naming map:
{'OH2': 'O', 'OW': 'O', 'OH': 'O', 'HW': 'H2', 'HH1': 'H1', '1H': 'H1', 'HH2': 'H2', '2H': 'H2'}
Here you can see how "HW" will always get mapped to "H2" while in reality the first HW should be mapped to H1 and the second HW to H2. Then it seems to drop the second hydrogen in the water, resulting in a water residue with an oxygen and a single hydrogen and then the charge does not sum up to 0 anymore as it should and you get the error. Something similar might happen in your nucleic acids.
Thanks, @stefdoerr !
I would like to restart this issue because it appears to remain unresolved. My attempts to run DNA from any source generates the error: WARNING:Tetrahedral hydrogen reconstruction not available for nucleic acids. Some hydrogens may be missing (if so, this is a bug).
This followed by the aforementioned non-integer error. For a simple example, try 1bna. Thanks for your help in advance with getting around this problem.
Thank you very much for providing this example. We are attempting to determine how difficult this will be to fix.
I've looked into this some more and it appears that PDB2PQR lacks the code to rebuild tetrahedral hydrogens for nucleic acids (that is the source of the warning message).
In particular, the Amino
class has this method:
https://github.com/Electrostatics/pdb2pqr/blob/9b71590bd845da0de8d540ee1ffeee714fb4a9e3/pdb2pqr/aa.py#L142
However, the Nucleic
class does not:
https://github.com/Electrostatics/pdb2pqr/blob/9b71590bd845da0de8d540ee1ffeee714fb4a9e3/pdb2pqr/na.py#L12
Therefore, PDB2PQR fails to generate the hydrogens for nucleic acid residues and generates the warning mentioned above in the process: https://github.com/Electrostatics/pdb2pqr/blob/f6aff8c2b7d5d4fd597ea216884a42859449bbc9/pdb2pqr/biomolecule.py#L360
This will be a very difficult bug to fix because the logic to generate the atom positions is complicated. In fact, it does not appear that PDB2PQR has supported this ever (or at least for a very long time) based on a review of the old repository: https://github.com/Electrostatics/apbs-pdb2pqr/blob/d8a4bee4b64df38cb5b856a3fe86e5586fe81cbe/pdb2pqr/pdb2pqr/na.py#L11
In short, this is a bug we should fix at some point but it will take a long time.
Is this something a grad student could research and mock up or do we have an algorithm for generating the atom positions?
I think the templates already exist in the dat
directory; we just need to generate a nucleic acid version of the amino acid code we already have.
Thanks for the update. I hope you are able to make progress soon.
@poonmk Can you please confirm that this issue is fixed in the most recent versions of PDB2PQR?
Thanks for the update. I did a test with 1bna. PDB2PQR no longer aborts now, though it still throws a tsunami of WARNING:Tetrahedral hydrogen reconstruction not available for nucleic acids. Some hydrogens may be missing (if so, this is a bug).
The generated input for APBS runs, and yields results that are reasonable (by casual inspection). Should we be concerned about the warnings?
I think you can ignore the warnings for now. I'll investigate in a separate issue. Thanks!
i have got the same issue with a protein, PDB: 4yki i pre-treated it with pdb4amber and tleap, ran pdb2pqr from the command line and here is what i got:
CRITICAL:Biomolecue charge is non-integer: -32.34
CRITICAL:Giving up.
We can explore this in the new issue you created. Thank you.
when running pdb2pqr30 (version 3.4.0) on my nucleic acid structure I get
CRITICAL:main.py:768:main_driver:Residue DG5 1 charge is non-integer: -0.3079 CRITICAL:main.py:769:main_driver:Giving up (when using --ff=AMBER)
test.log