Electrostatics / pdb2pqr

PDB2PQR - determining titration states, adding missing atoms, and assigning charges/radii to biomolecules.
http://www.poissonboltzmann.org/
Other
117 stars 34 forks source link

naming inconsistency with CHARMM #358

Open pokhreln opened 1 year ago

pokhreln commented 1 year ago

I am trying to get the pka for 1tiv.pdb and write the output using the CHARMM naming scheme. It appears that the residues names are inconsistently named. Namely, pdb2pqr 1tiv.pdb 1tiv.pqr --ffout=CHARMM --titration-state-method=propka --with-ph=3 produces a 1tiv.pqr file that name GLU as GLUP. Can you please help?

Thanks, Nihit

sobolevnrm commented 1 year ago

Running pdb2pqr 1tiv.pdb 1tiv.pqr --ffout=CHARMM --titration-state-method=propka --with-ph=2 generates protonated versions of ASP and GLU with inconsistent residue names for atoms within a single residue. For example:

ATOM     20  N   ASP     2      -9.450   4.065   3.037 -0.4000 1.5000
ATOM     21  CA  ASP     2      -9.506   5.204   4.058 -0.0000 2.0000
ATOM     22  C   ASP     2      -8.836   4.814   5.409  0.5500 1.7000
ATOM     23  O   ASP     2      -7.935   5.518   5.829 -0.5500 1.4000
ATOM     24  CB ASPP     2     -10.936   5.748   4.325  0.0000 2.0000
ATOM     25  CG ASPP     2     -10.862   6.973   5.243  0.5500 1.7000
ATOM     26  OD2ASPP     2     -10.646   8.053   4.737 -0.4900 1.4000
ATOM     27  OD1ASPP     2     -11.012   6.802   6.442 -0.4950 1.4000
ATOM     28  HN  ASP     2      -8.535   3.765   2.621  0.4000 1.0000
ATOM     29  HA  ASP     2      -8.969   5.952   3.664  0.0000 0.0000
ATOM     30  HB2ASPP     2     -11.320   6.017   3.467  0.0000 0.0000
ATOM     31  HB1 ASP     2     -11.446   5.046   4.773  0.0000 0.0000
ATOM     32  HD2ASPP     2     -10.501   8.135   3.729  0.4350 1.0000

and

ATOM    123  N   GLU     9       4.581   2.477   6.758 -0.4000 1.5000
ATOM    124  CA  GLU     9       5.953   2.181   7.229 -0.0000 2.0000
ATOM    125  C   GLU     9       6.378   0.908   6.472  0.5500 1.7000
ATOM    126  O   GLU     9       7.243   0.988   5.617 -0.5500 1.4000
ATOM    127  CB  GLU     9       6.009   2.004   8.761  0.0000 2.0000
ATOM    128  CG GLUP     9       7.097   2.916   9.346  0.0000 2.0000
ATOM    129  CD GLUP     9       8.476   2.264   9.203  0.5500 1.7000
ATOM    130  OE2GLUP     9       9.081   2.423   8.156 -0.4900 1.4000
ATOM    131  OE1GLUP     9       8.904   1.621  10.146 -0.4950 1.4000
ATOM    132  HN  GLU     9       3.958   1.705   6.379  0.4000 1.0000
ATOM    133  HA  GLU     9       6.552   2.920   6.995  0.0000 0.0000
ATOM    134  HB2 GLU     9       5.162   2.288   9.122  0.0000 0.0000
ATOM    135  HB1 GLU     9       6.260   1.093   8.945  0.0000 0.0000
ATOM    136  HG2GLUP     9       7.108   3.742   8.837  0.0000 0.0000
ATOM    137  HG1 GLU     9       6.923   3.030  10.294  0.0000 0.0000
ATOM    138  HE2GLUP     9      10.011   1.979   8.057  0.4350 1.0000

Additionally, the C-terminal GLU is named GLU rather than GLUP and the GLU sidechain is protonated but the C-terminus is not.

The protonated version of HIS appears to be named correctly.

sobolevnrm commented 1 year ago

PDB2PQR appears to be treating these changes like CHARMM patches (e.g., see https://github.com/openmm/openmmforcefields/issues/22) rather than renaming full residues. This is done in various places in the code by an if statement applied to atom name:

However, I don't think this is the desired behavior; I can't think of a good reason for it. It is likely a bug that came from an over-literal interpretation of the CHARMM forcefield files during initial implementation. I'm going to change it to rename the full residue unless @orbeckst or @speleo3 can think of a reason I should keep it.

Thanks!

sobolevnrm commented 1 year ago

Actually, it looks like this code is never used. The get_charmm_params code is only called from get_params1 which is never called: https://github.com/Electrostatics/pdb2pqr/blob/8db8d949ec31c7544d87417a152597bddf808b6e/pdb2pqr/forcefield.py#L363

sobolevnrm commented 1 year ago

The input parameter file must be incomplete. Some debugging information can be obtained by dumping the residue map immediately after this line: https://github.com/Electrostatics/pdb2pqr/blob/8db8d949ec31c7544d87417a152597bddf808b6e/pdb2pqr/forcefield.py#L197

The GLU-related items from this dump are:

GLU
GLU: N: N:
  Charge: -0.4700
  Radius: 1.8500,HN: HN:
  Charge: 0.3100
  Radius: 0.2245,CA: CA:
  Charge: 0.0700
  Radius: 2.2750,HA: HA:
  Charge: 0.0900
  Radius: 1.3200,CB: CB:
  Charge: -0.1800
  Radius: 2.1750,HB1: HB1:
  Charge: 0.0900
  Radius: 1.3200,HB2: HB2:
  Charge: 0.0900
  Radius: 1.3200,CG: CG:
  Charge: -0.2800
  Radius: 2.1750,HG1: HG1:
  Charge: 0.0900
  Radius: 1.3200,HG2: HG2:
  Charge: 0.0900
  Radius: 1.3200,CD: CD:
  Charge: 0.6200
  Radius: 2.0000,OE1: OE1:
  Charge: -0.7600
  Radius: 1.7000,OE2: OE2:
  Charge: -0.7600
  Radius: 1.7000,C: C:
  Charge: 0.5100
  Radius: 2.0000,O: O:
  Charge: -0.5100
  Radius: 1.7000
GLUP
GLUP: CG: CG:
  Charge: -0.2100
  Radius: 2.1750,HG1: HG1:
  Charge: 0.0900
  Radius: 1.3200,HG2: HG2:
  Charge: 0.0900
  Radius: 1.3200,CD: CD:
  Charge: 0.7500
  Radius: 2.0000,OE1: OE1:
  Charge: -0.5500
  Radius: 1.7000,OE2: OE2:
  Charge: -0.6100
  Radius: 1.7700,HE2: HE2:
  Charge: 0.4400
  Radius: 0.2245

Unfortunately, it looks like GLUP is missing some atoms.

sobolevnrm commented 11 months ago

This will be easier to fix if we don't have to deal with XML-format data files; it might finally be time to replace those.

sobolevnrm commented 6 months ago