marrink-lab / vermouth-martinize

Describe and apply transformation on molecular structures and topologies
Apache License 2.0

Adding lipidated amino acid parameters #393

Closed anjukris closed 2 years ago

anjukris commented 2 years ago

Hello,

I've been trying to get help with this issue on the Martini forum, but have had no luck so far. My protein of interest has three lipidated amino acids (2 palmitoyl and 1 farnesyl cysteine), and I want to create a coarse-grained structure of it from a charmm36 atomistic structure. For the time being I'm sticking to the Martini22 force field, since the parameters for these residues have been published previously.

As per some solutions I came across I followed these steps:

Step 3 is where I get stuck.

  File "/home/user/.local/bin/martinize2", line 797, in <module>
    entry()
  File "/home/user/.local/bin/martinize2", line 557, in entry
    known_force_fields)
  File "/home/user/.local/lib/python3.6/site-packages/vermouth/map_input.py", line 427, in read_mapping_directory
    new_mappings = read_backmapping_file(infile, force_fields)
  File "/home/user/.local/lib/python3.6/site-packages/vermouth/map_input.py", line 111, in read_backmapping_file
    weights, extra, name_to_index)
  File "/home/user/.local/lib/python3.6/site-packages/vermouth/map_input.py", line 166, in make_mapping_object
    idx_to = to_name_to_idx[atname_to]
KeyError: 'SC1

Honestly, I am not sure if what I am doing is correct and would be grateful if someone could help me!

Truly frustrated Anjali

pckroon commented 2 years ago

Hello hello,

you're definitely on the right track, and your idea is correct :) For one of your lipidated amino acids, could you post the files you made? There's probably an inconsistency somewhere (or duplicated atom names, or ...).

As an alternative approach you could also 1) change the residue name of the lipid parts of your lipidated amino acids in your input file to the appropriate lipid; 2) add those lipids to universal, martini, and the mappings; and 3) add an appropriate Link to martini that describes with which interactions that lipid should be connected to the amino acid. Optional though, we can focus on getting your complete lipidated amino acid blocks to work first.

anjukris commented 2 years ago

@pckroon Thank you for responding.

I have attached the aminoacids.rtp file (from universal.ff), the martini22 parameters for CYP (palmitoyl) and CYF (farnesyl) tails that I added to the martini22/aminoacids.ff file, and the .map files.

I am not sure how to add links so help on that front would be appreciated as well!

DOI for the published cysteine palmitoyl and farnesyl tail parameters: 10.1021/acs.jpcb.7b101

files_lipidated_aa.zip

pckroon commented 2 years ago

Your cyp.map refers to atoms named SC1, SC2, ... at the martini level, but you called them C1, C2, ... in your .ff file (the same goes for cyf). Mapping is done by atom names, so two things have to match: the atom names of the atomistic representation (usually from the universal ff) must match the atomistic atom names in the map, and the atom names of the cg representation must match the cg atom names in the mapping.
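The name check described above can be sketched in a few lines of plain Python. This is only an illustration and not part of vermouth: the function `find_unmatched_names` and the bead lists are hypothetical, chosen to mimic the SC1/C1 mismatch from this thread.

```python
def find_unmatched_names(mapped_names, block_names):
    """Compare the names used in a mapping with the names a block defines.

    Returns two sorted lists: names the mapping uses that the block lacks,
    and names the block defines that the mapping never mentions.
    """
    mapped = set(mapped_names)
    defined = set(block_names)
    return sorted(mapped - defined), sorted(defined - mapped)


# Hypothetical bead names, typed in by hand for illustration.
beads_in_map = ["BB", "SC1", "SC2", "SC3", "SC4"]  # as used in the .map file
beads_in_ff = ["BB", "C1", "C2", "C3", "C4"]       # as used in the .ff block

missing_in_ff, unused_in_map = find_unmatched_names(beads_in_map, beads_in_ff)
print("In the map but not in the block:", missing_in_ff)
print("In the block but not in the map:", unused_in_map)
```

Any name in the first list would trigger a lookup failure like the KeyError above, so both lists should be empty before running martinize2.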

If you want to go down the Link path, see e.g. https://github.com/marrink-lab/vermouth-martinize/blob/master/vermouth/data/force_fields/martini22/aminoacids.ff#L558 and lower

anjukris commented 2 years ago

@pckroon thanks a ton! I managed to get rid of that error - however I am now faced with this:

INFO - general - Read 1 molecules from PDB file input.pdb
INFO - step - Guessing the bonds.
INFO - general - 1 molecules after guessing bonds
INFO - step - Repairing the graph.
INFO - general - Applying modification NH2-ter to residue P-MET1
INFO - step - Dealing with modifications.
INFO - general - Identified the modifications ['NH2-ter'] on residues ['MET1', 'MET1', 'MET1', 'MET1']
INFO - general - Identified the modifications ['HSD'] on residues ['HIS27']
INFO - general - Identified the modifications ['HSD'] on residues ['HIS94']
INFO - general - Identified the modifications ['HSD'] on residues ['HIS166']
INFO - general - Identified the modifications ['C-ter'] on residues ['CYF186']
INFO - step - Read input.

Traceback (most recent call last):
  File "/home/user/.local/bin/martinize2", line 797, in <module>
    entry()
  File "/home/user/.local/bin/martinize2", line 644, in entry
    molecule_selector=selectors.is_protein).run_system(system)
  File "/home/user/.local/lib/python3.6/site-packages/vermouth/dssp/dssp.py", line 548, in run_system
    raise ValueError('There is no molecule to which '
ValueError: There is no molecule to which to apply the sequence.

pckroon commented 2 years ago

Ahhh. I know this error... There's no easy fix. I'm afraid you'll have to adapt the source code (a little). We filter on residue names to decide if something is a protein (and SS parameters should be applied to it). You'll need to add your residues in the following places:

https://github.com/marrink-lab/vermouth-martinize/blob/4261733ffa187cda80ac4f55510d2dc794d781ea/vermouth/selectors.py#L24
https://github.com/marrink-lab/vermouth-martinize/blob/4261733ffa187cda80ac4f55510d2dc794d781ea/vermouth/data/force_fields/martini22/aminoacids.ff#L16
https://github.com/marrink-lab/vermouth-martinize/blob/4261733ffa187cda80ac4f55510d2dc794d781ea/vermouth/data/force_fields/martini22/aminoacids.ff#L17

I'll look for the related open issue to link and bump them: #321
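To see why unknown residue names lead to "There is no molecule to which to apply the sequence", here is a minimal sketch of a residue-name filter. It is not vermouth's actual code: the short `PROTEIN_RESNAMES` set and this `is_protein` function are stand-ins, and the assumption is that a molecule only counts as protein if every residue name is in the known list.

```python
# Hypothetical, deliberately short list of known protein residue names.
# The real list is the one in vermouth/selectors.py linked above.
PROTEIN_RESNAMES = {"ALA", "CYS", "GLY", "HIS", "MET"}


def is_protein(resnames, known=PROTEIN_RESNAMES):
    """Assumed rule: a molecule is protein only if all its residues are known."""
    return all(name in known for name in resnames)


# A molecule containing the two lipidated residues from this thread.
molecule = ["MET", "HIS", "CYP", "CYF"]

before = is_protein(molecule)
after = is_protein(molecule, PROTEIN_RESNAMES | {"CYP", "CYF"})
print("Selected as protein before adding CYP/CYF:", before)
print("Selected as protein after adding CYP/CYF:", after)
```

With such a rule, a single unrecognised residue name is enough for the whole chain to be skipped, which leaves nothing to apply the secondary structure to.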

anjukris commented 2 years ago

@pckroon THANK YOU!! That actually worked.

There are a bunch of additional things I wanted to confirm:

pckroon commented 2 years ago

> Sorry if I sound naive - I haven't understood what links do! I didn't add any for these lipidated amino acids.

Links govern the interactions /between/ residues/blocks. You can find examples in the martini FF folders (aminoacids.ff, at the bottom). So what you could do is define a block for the lipid, and have a link deal with the interactions keeping the lipid and aa together
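As a rough mental model of the block/link split, consider the sketch below. It is a conceptual toy and does not use vermouth's data structures or the .ff syntax: the `blocks` and `links` dictionaries, the residue name PAL, the bead names, and the rule that a link only applies to consecutive residues are all assumptions made for illustration.

```python
# Each block only knows the beads and bonds inside one residue.
blocks = {
    "CYS": {"bonds": [("BB", "SC1")]},
    "PAL": {"bonds": [("C1", "C2"), ("C2", "C3"), ("C3", "C4")]},
}

# A link describes an interaction that spans two residues.
links = [{"between": ("CYS", "PAL"), "bond": ("SC1", "C1")}]


def build_bonds(residues, blocks, links):
    """Collect bonds as ((resid, bead), (resid, bead)) pairs."""
    bonds = []
    # Intra-residue bonds come from the blocks.
    for resid, resname in residues:
        for bead_a, bead_b in blocks[resname]["bonds"]:
            bonds.append(((resid, bead_a), (resid, bead_b)))
    # Inter-residue bonds come from the links (consecutive residues only here).
    for (id1, name1), (id2, name2) in zip(residues, residues[1:]):
        for link in links:
            if link["between"] == (name1, name2):
                bead_a, bead_b = link["bond"]
                bonds.append(((id1, bead_a), (id2, bead_b)))
    return bonds


residues = [(1, "CYS"), (2, "PAL")]
bonds = build_bonds(residues, blocks, links)
for bond in bonds:
    print(bond)
```

Without the entry in `links`, the cysteine and the lipid would each be internally bonded but not connected to each other, which is the gap a Link fills.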

> To be honest I used the martinize1 version as a reference to map the atoms to Martini beads. I have attached the script. I am a little unsure about how to map atoms into the SC4 and SC5 beads for CYF. If you look at the script (lines 1131 and 1132 under mapping) you will see what I mean. For CYF, the same atom seems to be mapped under 2 beads. Is this right? Or a typo? I got this script off you guys previously as well and these previously had lipidated amino acids parameters.

It's somewhat common to have a single atom contribute to multiple beads in Martini, due to symmetry constraints. Whether this is a good idea or not is up for discussion and out of scope here.
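A small numeric sketch of what a shared atom means for bead placement, assuming each bead sits at the unweighted geometric center of the atoms mapped to it. The atom names, coordinates, and the `bead_positions` helper are hypothetical; real mappings may use masses or explicit weights instead.

```python
def bead_positions(atom_coords, mapping):
    """Place each bead at the unweighted center of the atoms mapped to it."""
    positions = {}
    for bead, atoms in mapping.items():
        coords = [atom_coords[atom] for atom in atoms]
        # zip(*coords) groups the x, y and z components of all atoms.
        positions[bead] = tuple(sum(axis) / len(coords) for axis in zip(*coords))
    return positions


# Three hypothetical atoms on a line; A2 is listed under both beads.
atom_coords = {
    "A1": (0.0, 0.0, 0.0),
    "A2": (1.0, 0.0, 0.0),
    "A3": (2.0, 0.0, 0.0),
}
mapping = {"SC4": ["A1", "A2"], "SC5": ["A2", "A3"]}

positions = bead_positions(atom_coords, mapping)
print(positions)
```

The shared atom pulls both beads toward itself, so the two beads end up placed symmetrically around it.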

> Should I follow the same procedure for adding residues to the elnedyn22 ff as well?

Pretty much. But of course you already have the required parameters of the atomistic structures in the universal ff now. All that's needed is adding the parameters to the elnedyn22 ff and the mappings.

anjukris commented 2 years ago

@pckroon thank you so much for helping me out! It would be great if future versions of Martinize2 could include Martini3 parameters for lipidated amino acids, as martinize1 did for Martini2. Thanks once again for your guidance!

pckroon commented 2 years ago

Pull request welcome :)