openmm / pdbfixer

PDBFixer fixes problems in PDB files
Other
443 stars 112 forks source link

Adding phosphorylated residues to PDBFixer templates #259

Open sukritsingh opened 1 year ago

sukritsingh commented 1 year ago

After discussing with @jchodera - and building on #218 - Is there any plans to add phosphorylated residues such as pSer and pTyr as templates? For setting up initial structures for OpenMM simulation, it would be awesome if we could have the residues automatically built with the appropriate labels/typing so that we could use it with amber/phosaa14SB.xml as a Force Field.

Other softwares are able to also generate pSer and pTyr into structures, but the atom labels they use don't always match up and often need manual relabeling for use with OpenMM.

peastman commented 1 year ago

That ought to be possible. There are a few changes required. We would need to add PDB files for them to the templates folder. It provides a structure for each residue in a typical conformation. It also would be helpful if we can add them to soft.xml, which is a gently repulsive force field used for energy minimizations. It's designed to keep the system in a reasonably realistic conformation while still allowing atoms to pass through each other so bad clashes can be resolved. The script used to create it is in devtools.

They also would need to be added to proteinResidues.

sukritsingh commented 1 year ago

Awesome! Looking at pdbfixer.py - we'd probably also want to add it to substitutions also right? Or is that not necessary?

I can work on building/obtaining an initial PDB structure placed into templates if you like, unless you already had one handy for pSer and pTyr.

What would the standard three residue codes for these be? SEP and PTR? I see SEP in substitutions already I think?

peastman commented 1 year ago

Oops, sorry for not replying.

What would the standard three residue codes for these be? SEP and PTR?

Here are the definitions of SEP and PTR. Look them over and make sure they're the correct ones.

we'd probably also want to add it to substitutions also right?

Do you mean so it will automatically offer to replace them with SER and TYR? They're already there.

sukritsingh commented 1 year ago

No worries at all!

Here are the definitions of SEP and PTR. Look them over and make sure they're the correct ones.

Hah I had just found out about the Ligand Expo just 2 days ago and was going to attach those exact structures here. Yes, those are the correct structures for phosphoserine and phosphotyrosine.

Do you mean so it will automatically offer to replace them with SER and TYR? They're already there.

Yea I had asked that prior to confirming that SEP and PTR are already there.

Following your instructions, since we have the PDB files, it looks like all that would need to happen is to:

  1. Add SEP and PTR pdb files to templates
  2. Add SEP and PTR to 'proteinResidues'
  3. Use the script in 'devtools' to add them to 'soft.xml' Did I miss any other steps?

Would you like me to do this and submit a PR?

peastman commented 1 year ago

Sounds great! soft.xml was created based on Amber99SB-ILDN. Since I don't think it includes those residues, it's fine to regenerate it based on Amber14. The differences should be minor.

sukritsingh commented 1 year ago

Sounds good! One point of clarification: The default openmm.app.ForceField xml files for amber14 don't have SEP and PTR in their records (those residues are found in OpenMMForceFields, in amber/phosaa14SB.xml), would it be alright for me to modify the soft.xml generator script to pull from openmmforcefields? I'm asking this since it would make PDBFixer dependent on openmmforcefields.

If you think there's any other way to generate the soft.xml file or don't mind adding that dependency then I'll give it a go!

peastman commented 1 year ago

That's fine. It doesn't really create a dependency. That script is just run by us to generate the file that gets checked into the repository. Users don't need openmmforcefields installed.

sukritsingh commented 1 year ago

I realized that I don't even need to import it in the script if we're the only ones running it! I can just create an env that has OpenMMForceFields and it'll be able to pull the appropriate force fields.

That said, I can't get devtools/createSoftForcefield.py to run. Even with zero changes made (just pulling the updated branch), I get the following traceback error when I run python createSoftForcefield.py at the 'harmonicBondForce` stage:

$ python createSoftForcefield.py 
...
...
 </Residues>
 <HarmonicBondForce>
Traceback (most recent call last):
  File "/Users/singhs15/work/src/pdbfixer/devtools/createSoftForcefield.py", line 90, in <module>
    class1 = forcefield._atomTypes[type1][0]
TypeError: '_AtomType' object is not subscriptable

I had some earlier issues as well with the forcefield._atomTypes call earlier but I was able to get past that.

Any advice on what might be causing this?

peastman commented 1 year ago

It looks like the type of that field has changed since the script was written. It used to store a tuple (class, mass, element). It now stores an _AtomType object:

class _AtomType(object):
    """Inner class used to record atom types and associated properties."""
    def __init__(self, name, atomClass, mass, element):
        self.name = name
        self.atomClass = atomClass
        self.mass = mass
        self.element = element

You can change that line to

type = forcefield._atomTypes[atomType]

Then in the rest of the block, prefix atomClass, mass, and element with type. wherever they appear.

sukritsingh commented 1 year ago

Thanks! I managed to figure that out and edited my comment above with an updated stack trace issue (sorry!).

It now seems to get stuck at the harmonicBondForce listing step:

$ python createSoftForcefield.py 
...
...
 </Residues>
 <HarmonicBondForce>
Traceback (most recent call last):
  File "/Users/singhs15/work/src/pdbfixer/devtools/createSoftForcefield.py", line 90, in <module>
    class1 = forcefield._atomTypes[type1][0]
TypeError: '_AtomType' object is not subscriptable

Since the object isn't subscriptable, what would be the best way around this? I imagine it'll come up across each of the forces since I see a similar line listed for HarmonicAngleForce and beyond...

peastman commented 1 year ago

It's the same thing. It used to be a tuple whose first element was the class. Change [0] to .atomClass.

sukritsingh commented 1 year ago

Ohhhh I see! Thanks so much! The object printout looked like a dict to me so I kept getting confused.

Looks like it ran all the way through with no errors! Thanks so much for the help! I'll submit the PR soon.