marrink-lab / vermouth-martinize

Describe and apply transformation on molecular structures and topologies
Apache License 2.0
95 stars 43 forks source link

Problems with HSE, HSP and ASPP when Martinizing from Charmm36 to M3 #471

Open MikkelDA opened 1 year ago

MikkelDA commented 1 year ago

These problems were encountered in the latest development version of martinize2/vermouth which was downloaded 18/10-2022.

HSE: HSE is not present in the M3 forcefield (since HSD and HSE are represented as single residue type in M3), which means that it cannot be martinized. The following warning is produced: "WARNING - unmapped-atom - These atoms are not covered by a mapping. Either your mappings don't describe all atoms (bad idea), or, there's no mapping available for all residues. ['1002A-HSE89:N', '1004A-HSE89:CA', '1006A-HSE89:CB', '1009A-HSE89:ND1', '1010A-HSE89:CG', '1011A-HSE89:CE1', '1013A-HSE89:NE2', '1015A-HSE89:CD2', '1017A-HSE89:C', '1018A-HSE89:O']" If one uses "-modify" to change the HSE to HSD then the following ValueError is produced (My system is supposed to have 476 residues): "The sequence length does not match the number of residues. The sequence has 476 elements for 477 residues." Simply changing HSD names in the pdb to HSE is also not enough since the following warning is then produced: "WARNING - unknown-input - Could not identify the modifications for residues ['HSD89'], involving atoms ['1014-HE2']" So all the HE2 atoms on HSE residues also have to be removed from the system before the system martinizes without errors and warnings.

A thing i learned after writing the above is that "-mutate" is able to successfully change HSE to HSD.

Would it be possible to fix the -modify command (assuming it does not work as intended). At least to me it was more intuitive to use modify to change one histidine type to another instead of mutate, but if it is working as intended, then maybe the documentation for the commands should be altered to reflect that. Another option could also be to add HSE as a residue type in M3 (with identical parameters to HSD).

HSP: Martinization of HSP works without producing any errors or warnings, but while the residue name for the SC beads becomes HSP in the martinized system, the BB bead's residue name is HIS.

ASPP: Upon martinization of an ASPP residue, the resulting itp file refers to the residue as ASPP as expected, but the pdb file refers to the residue as ASP.

I understand that some might want to maintain a single name for all ASPs in a system, but maybe a flag could be made that allows for renaming of such residues.

fgrunewald commented 1 year ago

Issues relating to HSE and HSP will be fixed with #473.

@pckroon I think the ASPP issue arises because the PDB format currently truncates everything to 3-letter resnames. The resname at least propagates correctly until the PDB writer

pckroon commented 1 year ago

@pckroon I think the ASPP issue arises because the PDB format currently truncates everything to 3-letter resnames. The resname at least propagates correctly until the PDB writer

Correct. According to the PDB file format the resname field is 3 characters wide. We could change our writer to start writing dialect, but I'm not a great fan.

fgrunewald commented 1 year ago

I think we just leave it as "truncated" then. It only emits a warning in the GROMACS grompp step which is OK in my opinion. It's a failure of the AA convention or we could enable people to write the output as gro file not PBD. '-x' at the moment always writes a PDB file

MikkelDA commented 1 year ago

The standard pdb format uses column 18, 19 and 20 as the residue name, but quite a few forcefields also use column 21 for the residue name, and since column 21 is not officially used for anything (for the ATOM, HETATOM and TER record types), maybe it would make sense to add a flag that allows for column 21 to be used as an extension to the residue name.

I think it would at least be nice to have the option to use column 21, without having to manually change it post-martinization.

fgrunewald commented 1 year ago

@MikkelDA we welcome contributions but I think for us at the moment this is not a very high priority. We're actually looking into moving to MMCif which would solve this problem