marrink-lab / vermouth-martinize

Describe and apply transformation on molecular structures and topologies
Apache License 2.0
99 stars 45 forks source link

Bug: The residue order between the input all-atom model and the output coarse-grained model is different. #551

Closed ZhuDaoScience closed 1 year ago

ZhuDaoScience commented 1 year ago

I'm trying to create a coarse-grained model with an all-atom model. When I use the following command for coarse-graining,

martinize2 -f test_at.pdb -o system.top -x test_cg.pdb -dssp dssp -p backbone -ff martini3001 -govs-include -govs-moltype open -cys auto -scfix

the coarse-grained model I get has the different residue order when compared to the atomic model. It may be something wrong with the script processing the order of molecules, in which the script puts some molecules ahead with a mistake.

I have appended the all-atom model (test_at.pdb.txt) in this issue. The version of Vermouth I use is vermouth-0.9.7.dev4.

Hope you can repeat this bug.

All-atom model (part):

ATOM    651  N   GLN B 675     202.359 172.294 187.548  1.00  0.72      B    N
ATOM    652  CA  GLN B 675     202.236 171.680 186.241  1.00  0.72      B    C
ATOM    653  C   GLN B 675     202.005 170.189 186.351  1.00  0.72      B    C
ATOM    654  O   GLN B 675     202.194 169.562 187.390  1.00  0.72      B    O
ATOM    655  CB  GLN B 675     203.478 171.951 185.337  1.00  0.72      B    C
ATOM    656  CG  GLN B 675     203.550 173.421 184.844  1.00  0.72      B    C
ATOM    657  CD  GLN B 675     204.724 173.739 183.915  1.00  0.72      B    C
ATOM    658  NE2 GLN B 675     204.886 172.964 182.821  1.00  0.00      B    N
ATOM    659  OE1 GLN B 675     205.466 174.695 184.141  1.00  0.00      B    O
ATOM    660  N   THR B 676     201.537 169.588 185.239  1.00  0.59      B    N
ATOM    661  CA  THR B 676     201.366 168.150 185.091  1.00  0.59      B    C
ATOM    662  C   THR B 676     202.693 167.413 185.061  1.00  0.59      B    C
ATOM    663  O   THR B 676     203.554 167.696 184.230  1.00  0.59      B    O
ATOM    664  CB  THR B 676     200.606 167.769 183.822  1.00  0.59      B    C
ATOM    665  CG2 THR B 676     200.234 166.277 183.804  1.00  0.59      B    C
ATOM    666  OG1 THR B 676     199.382 168.491 183.741  1.00  0.59      B    O
ATOM    667  N   SER B 689     198.527 163.763 189.297  1.00  0.52      B    N
ATOM    668  CA  SER B 689     199.536 164.052 188.296  1.00  0.52      B    C
ATOM    669  C   SER B 689     199.910 165.517 188.278  1.00  0.52      B    C
ATOM    670  O   SER B 689     200.629 165.923 187.377  1.00  0.52      B    O
ATOM    671  CB  SER B 689     199.092 163.761 186.830  1.00  0.52      B    C
ATOM    672  OG  SER B 689     198.749 162.391 186.630  1.00  0.52      B    O

Coarse-grained model (part):

ATOM    188 BB   GLN B  89     202.204 170.880 186.945  1.00  0.00
ATOM    189 SC1  GLN B  89     204.498 173.423 184.165  1.00  0.00
ATOM    190 BB   THR B  90     202.354 168.225 184.868  1.00  0.00
ATOM    191 SC1  THR B  90     200.005 167.610 183.784  1.00  0.00
ATOM    192 BB   ALA C  91     261.443 223.639 228.313  1.00  0.00
ATOM    193 SC1  ALA C  91     263.579 223.031 228.280  1.00  0.00
pckroon commented 1 year ago

Hellohello! Does martinize2 give any output? Does martinize2 at least create the same number of residues?

ZhuDaoScience commented 1 year ago

Yes, the output is given. And the number of residues is same. The mistake I find is that the order of residues in the output is different from the input atomic model.

ZhuDaoScience commented 1 year ago

Have you repeated this bug?

pckroon commented 1 year ago

Sorry, I meant to ask if martinize2 prints anything to screen (and what). I haven't had time yet to run it myself, hopefully I can find some time tomorrow.

ZhuDaoScience commented 1 year ago

OK. The output of martinize2 is the following lines.

INFO - general - Read 1 molecules from PDB file test_at.pdb
INFO - step - Guessing the bonds.
INFO - general - 3 molecules after guessing bonds
INFO - step - Repairing the graph.
INFO - general - Applying modification C-ter to residue B-VAL620
INFO - general - Applying modification N-ter to residue B-ASN641
INFO - general - Applying modification C-ter to residue B-THR676
INFO - general - Applying modification N-ter to residue C-ALA27
INFO - missing-atom - Missing atom VAL620:OXT
INFO - missing-atom - Missing atom THR676:OXT
INFO - general - Applying modification N-ter to residue B-SER689
INFO - general - Applying modification C-ter to residue B-THR827
INFO - missing-atom - Missing atom THR827:OXT
INFO - general - Applying modification N-ter to residue B-LYS854
INFO - general - Applying modification C-ter to residue B-ASP1146
INFO - step - Dealing with modifications.
INFO - general - Identified the modifications ['N-ter'] on residues ['ALA27', 'ALA27', 'ALA27', 'ALA27']
INFO - general - Identified the modifications ['C-ter'] on residues ['VAL620', 'VAL620', 'VAL620']
INFO - general - Identified the modifications ['N-ter'] on residues ['ASN641', 'ASN641', 'ASN641', 'ASN641']
INFO - general - Identified the modifications ['C-ter'] on residues ['THR676', 'THR676', 'THR676']
INFO - general - Identified the modifications ['N-ter'] on residues ['SER689', 'SER689', 'SER689', 'SER689']
INFO - general - Identified the modifications ['C-ter'] on residues ['THR827', 'THR827', 'THR827']
INFO - general - Identified the modifications ['N-ter'] on residues ['LYS854', 'LYS854', 'LYS854', 'LYS854']
INFO - general - Identified the modifications ['C-ter'] on residues ['ASP1146', 'ASP1146', 'ASP1146']
INFO - step - Read input.
INFO - step - Creating the graph at the target resolution.
INFO - general - Applying modification mapping ('C-ter',)
INFO - general - Applying modification mapping ('N-ter',)
INFO - general - Applying modification mapping ('C-ter',)
INFO - general - Applying modification mapping ('N-ter',)
INFO - general - Applying modification mapping ('N-ter',)
INFO - general - Applying modification mapping ('C-ter',)
INFO - general - Applying modification mapping ('C-ter',)
INFO - general - Applying modification mapping ('N-ter',)
INFO - step - Averaging the coordinates.
INFO - step - Applying the links.
INFO - step - Placing the charge dummies.
INFO - step - Applying position restraints.
INFO - step - Adding includes for Virtual Site Go Martini.
INFO - general - The output topology will require files generated by "create_goVirt.py".
INFO - step - Writing output.
INFO - general - Please cite: Souza, P C T; Alessandri, R; Barnoud, J; Thallmair, S; Faustino, I; Grünewald, F; Patmanidis, I; Abdizadeh, H; Bruininks, B M H; Wassenaar, T A; Kroon, P C; Melcr, J; Nieto, V; Corradi, V; Khan, H M; Domański, J; Javanainen, M; Martinez-Seara, H; Reuter, N; Best, R B; Vattulainen, I; Monticelli, L; Periole, X; Tieleman, D P; de Vries, A H; Marrink, S J;  Nature Methods 2021; 10.1038/s41592-021-01098-3
INFO - general - Happiness is a dry martini and a good woman... or a bad woman. -- George Burns
pckroon commented 1 year ago

Ok, it's awkward. For some reason your chain C (starting with resid 27) gets put in between your chain Bs. My initial suspicion was that the MergeAllMolecules processor was shuffling the order (for some reason). When I run martinize2 -f test_at.pdb -o system.top -x test_cg.pdb -ff martini3001 -elastic -eunit all (and I disable the atomid sorting) the same issues crops up, so there's something fishy with the order of molecules.

Further digging revealed that there are 3 molecules: The last spans resids 854-1146; second resids 689-827; and the first 567-43 (!). So the first molecule contains chain C. DEBUG - general - vermouth.processors.make_bonds - Guessed bond between 7B-ARG567:CD and 4123C-PHE43:O based on distance. is the culprit.

I need to brood on this a bit more before I declare this a real bug. Maybe the real bug is actually in the atom sorting (sort_molecule_atoms.py) for hiding this.

Either way, your atomistic structure is problematic

ZhuDaoScience commented 1 year ago

Thank you for your reply. I find that it works well after the energy of the atomistic model is minimized. And it's true that there is some crash in the sidechains of my model. I think it will be much better if Martinize2 can give some warnings for these crash in the future. Thank you very much.

pckroon commented 1 year ago

Happy to hear things are working by now. We're in the progress of overhauling the whole Go flow (#550). Maybe that will improve matters a bit, but issuing warnings for clashes sounds like an excellent plan.