lazear / sage

Proteomics search & quantification so fast that it feels like magic
https://sage-docs.vercel.app
MIT License

Multiple modifications to same residue #48

Closed maxochondria closed 1 year ago

maxochondria commented 1 year ago

Hi Michael,

Is it possible to add several variable modifications to the same site? When I tried including the following modifications in my config.json file:

"variable_mods": {      
      "M": 15.9949,         
      "[": 42.0,            
      "[": 227.98237,
      "[": 331.04570,
      "K": 227.98237,
      "K": 331.04570
    }

only the following modifications seemed to be used:

 "variable_mods": {
      "M": 15.9949,
      "[": 331.0457,
      "K": 331.0457
    }

lazear commented 1 year ago

Hi Maxence,

It's currently only possible to add a single variable modification per amino acid. This is definitely something I have been noodling over for the last couple of weeks.
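The collapsed configuration reported above is consistent with how many JSON parsers treat repeated keys inside one object: the last value wins. A minimal Python sketch of that behavior follows. It illustrates the general JSON pitfall only and says nothing about how Sage itself parses its config; `find_duplicate_keys` is a hypothetical helper written for this example.

```python
import json

# The config fragment from the report, with repeated "[" and "K" keys.
raw = """
{
  "variable_mods": {
    "M": 15.9949,
    "[": 42.0,
    "[": 227.98237,
    "[": 331.04570,
    "K": 227.98237,
    "K": 331.04570
  }
}
"""

# Python's json module keeps the last value seen for a repeated key,
# so only one mass per site survives parsing.
parsed = json.loads(raw)
print(parsed["variable_mods"])


def find_duplicate_keys(text):
    """Return keys that appear more than once within any single JSON object."""
    duplicates = []

    def hook(pairs):
        # pairs holds every (key, value) of one object, repeats included.
        seen = set()
        for key, _ in pairs:
            if key in seen and key not in duplicates:
                duplicates.append(key)
            seen.add(key)
        return dict(pairs)

    json.loads(text, object_pairs_hook=hook)
    return duplicates


duplicate_keys = find_duplicate_keys(raw)
print(duplicate_keys)
```

Running a check like this before a search would flag a config whose repeated keys are about to be silently dropped.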

The fragment indexing strategy has many benefits (speed, support for open search), but on the flip side it means that we need to pre-generate every b- and y-ion fragment for every peptide (with all combinations of modifications) and store them in memory. This becomes pretty unwieldy when many modifications are used.
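To get a feel for how quickly this grows, here is a rough Python sketch that counts modification combinations for one peptide. It is not Sage's implementation: it assumes every site is independent, that each site carries at most one mass shift, and that there is no cap on modifications per peptide. The peptide sequence and both function names are made up for illustration.

```python
from math import prod


def count_modified_forms(peptide, variable_mods):
    """Count modification combinations, treating every site as independent.

    variable_mods maps a residue (or "[" for the peptide N-terminus) to a
    list of candidate mass shifts. Each site is either unmodified or carries
    exactly one of its candidate shifts.
    """
    sites = ["["] + list(peptide)
    return prod(1 + len(variable_mods.get(site, [])) for site in sites)


def count_fragments(peptide, variable_mods):
    """One b and one y ion per backbone cleavage, for every modified form."""
    return 2 * (len(peptide) - 1) * count_modified_forms(peptide, variable_mods)


peptide = "MKKAEK"  # hypothetical sequence, chosen only for illustration

# One candidate shift per site versus several shifts on "[" and "K".
single = {"M": [15.9949], "[": [42.0], "K": [227.98237]}
multi = {
    "M": [15.9949],
    "[": [42.0, 227.98237, 331.04570],
    "K": [227.98237, 331.04570],
}

for label, mods in (("single", single), ("multi", multi)):
    forms = count_modified_forms(peptide, mods)
    fragments = count_fragments(peptide, mods)
    print(f"{label}: {forms} modified forms, {fragments} fragments")
```

Each extra candidate shift on a residue multiplies the count once per occurrence of that residue, which is why frequent amino acids are the expensive ones.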

That being said, I will probably do some testing to see if it's feasible for use cases like this. In the meantime, I would suggest performing an open search and determining whether modifications are present by delta mass.
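As a sketch of the delta-mass approach, the Python snippet below matches observed precursor mass differences against the candidate shifts from this thread. The observed values are invented, the 0.01 Da tolerance is an arbitrary assumption, and `match_delta_mass` is a hypothetical helper. How the delta masses are read from the search output is left out, since that depends on the output format.

```python
# Candidate mass shifts taken from the config in this thread.
CANDIDATE_SHIFTS = [15.9949, 42.0, 227.98237, 331.04570]


def match_delta_mass(delta, candidates=CANDIDATE_SHIFTS, tolerance=0.01):
    """Return the candidate shift closest to delta, or None if none is
    within the tolerance (in Da)."""
    best = min(candidates, key=lambda mass: abs(mass - delta))
    return best if abs(best - delta) <= tolerance else None


# Invented delta masses, standing in for values from an open search.
observed = [0.0003, 227.9831, 331.0449, 100.5]
matches = {delta: match_delta_mass(delta) for delta in observed}

for delta, shift in matches.items():
    status = f"matches {shift}" if shift is not None else "no candidate match"
    print(f"delta {delta}: {status}")
```

A delta near zero corresponds to an unmodified peptide and deliberately matches nothing here.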

lazear commented 1 year ago

Hi Maxence,

I have added support for this in the newest release of Sage. You can now specify multiple variable mods like so:

"variable_mods": {      
  "M": 15.9949,         
  "[": [42.0, 227.98237, 331.04570],
  "K": [227.98237, 331.04570]
}

Please note that for high-frequency amino acids, this may require a large amount of memory or time. I am going to close this issue as completed, but please feel free to re-open it if you have any issues or questions!
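The config above mixes a single number ("M") with lists ("[" and "K"). For anyone post-processing such a config in their own scripts, a small Python sketch that normalizes both forms into lists might look like this. `normalize_variable_mods` is a hypothetical helper and does not reflect how Sage reads the file internally.

```python
import json

# The config fragment from the reply above.
config_text = """
{
  "variable_mods": {
    "M": 15.9949,
    "[": [42.0, 227.98237, 331.04570],
    "K": [227.98237, 331.04570]
  }
}
"""


def normalize_variable_mods(mods):
    """Map every site to a list of mass shifts, wrapping bare numbers."""
    normalized = {}
    for site, value in mods.items():
        shifts = value if isinstance(value, list) else [value]
        normalized[site] = [float(shift) for shift in shifts]
    return normalized


mods = normalize_variable_mods(json.loads(config_text)["variable_mods"])
for site, shifts in mods.items():
    print(site, shifts)
```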