CCMS-UCSD / GNPS_Workflows

Public Workflows at GNPS
https://gnps.ucsd.edu/

[MASST] Feature request: variable m/z range for analog search #496

Open RaphaelR87 opened 4 years ago

RaphaelR87 commented 4 years ago

Hi everyone,

As MASST is about to become the BLAST analog for mass-spec-based metabolomics, I think the analog search is of utmost value. However, so far the analog search is limited to a maximum of 100 Da. While that allows for detecting many common biotransformation reactions of metabolites in the human body, some very common 'detoxifying' biotransformations cannot be detected, such as conjugation with glucuronic acid and glutathione.

Furthermore, natural product chemists would love to see analogs of their compounds of interest for discovery purposes, but also to address environmental/ecological/evolutionary questions. The 100 Da limit dramatically reduces the usefulness of MASST for peptidic natural products (and of course for others, such as aglycone-glycoside pairs), as many amino acids are > 100 Da.

Thanks for considering this; it is a great tool already.

Best, Raphael

lfnothias commented 4 years ago

Ming will probably provide a more comprehensive answer.

Another trick would be to change the precursor ion mass value in the search, but I don't recommend that ^^

RaphaelR87 commented 4 years ago

Yeah, I was afraid that increasing computing time might have been the issue.

lfnothias commented 4 years ago

Do you know which modifications you are looking for, or is this untargeted?

RaphaelR87 commented 4 years ago

These are general suggestions. But in one case I am looking for analogs of a known and GNPS-annotated NRP. So the modifications I am looking for include, amongst others, compounds with additional (non-canonical) AAs.

lfnothias commented 4 years ago

So you expect multiple modifications with non-canonical AAs?

justinjjvanderhooft commented 4 years ago

Some (most) non-canonical amino acids are > 100 Da....

On Wed, 22 Apr 2020, 21:36 lfnothias, notifications@github.com wrote:

So you expect multiple modifications with non-canonical AAs ?


RaphaelR87 commented 4 years ago

Exactly, that was the point I wanted to make (thanks @justinjjvanderhooft). Actually, I think that also applies to all canonical amino acids besides glycine, alanine, proline, and serine. Following this thought, in my opinion it is not just a nice feature to have but a crucial parameter if you want to compare MASST to BLAST and make it as successful.

Do not get me wrong, MASST is a fantastic idea and it is already great in its current state.
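
To make the amino acid argument concrete, here is a minimal Python sketch that lists which canonical residues fit inside a given mass-shift window. The table holds standard monoisotopic residue masses (amino acid minus H2O); the function name `residues_within` is hypothetical and not part of MASST or GNPS. With these values, valine (99.07 Da) also falls just under a 100 Da limit, in addition to the four residues named above.

```python
# Monoisotopic residue masses in Da (free amino acid minus H2O).
# Standard reference values; not taken from the MASST code base.
RESIDUE_MASS = {
    "Gly": 57.02146, "Ala": 71.03711, "Ser": 87.03203, "Pro": 97.05276,
    "Val": 99.06841, "Thr": 101.04768, "Cys": 103.00919, "Leu": 113.08406,
    "Ile": 113.08406, "Asn": 114.04293, "Asp": 115.02694, "Gln": 128.05858,
    "Lys": 128.09496, "Glu": 129.04259, "Met": 131.04049, "His": 137.05891,
    "Phe": 147.06841, "Arg": 156.10111, "Tyr": 163.06333, "Trp": 186.07931,
}


def residues_within(limit_da):
    """Return residue names whose mass is <= limit_da, lightest first."""
    names = (name for name, mass in RESIDUE_MASS.items() if mass <= limit_da)
    return sorted(names, key=RESIDUE_MASS.get)


if __name__ == "__main__":
    print(residues_within(100.0))       # residues reachable with a 100 Da window
    print(len(residues_within(200.0)))  # residues reachable with a 200 Da window
```

Under these assumptions, a 200 Da window would cover a single added residue of any of the 20 canonical amino acids, while a 100 Da window covers only five of them.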

lfnothias commented 4 years ago

Thanks a lot for adding your input @justinjjvanderhooft

@RaphaelR87, when you wrote "These are general suggestions", did you actually mean "general modifications"?

lfnothias commented 4 years ago

I think this is a very good case for the MASST prototype we discussed a while ago with @mwang87 and @XsirdanielX. @RaphaelR87, do you want to help test it by providing a list of the AA modifications you are interested in, with for each one: expected delta m/z / molecular formula of the difference / name? Note that we already have simple general modifications (like oxidation etc.), so there is no need to include them.

RaphaelR87 commented 4 years ago

@lfnothias Haha, sorry, what I wanted to say was that my suggestions are of a general nature, meaning everyone could benefit from an m/z range defined by the user, whether the research question is targeted or untargeted metabolomics.

RaphaelR87 commented 4 years ago

Sure, I can write a list, but that could become very extensive. I will write you an email.

justinjjvanderhooft commented 4 years ago

Glycosylations (162, 146, 132), glucuronidations (176), and some other interesting modifications are all >100. Typically, when I use analogue search, I set it to 200.
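
As a worked check of these shifts, the following Python sketch computes the monoisotopic mass of each conjugate moiety from its elemental composition. The formulas are the usual "sugar minus water" compositions and are an assumption on my side, not values taken from MASST; on that basis glucuronidation comes out at 176.03 Da. `mono_mass` is a hypothetical helper name.

```python
# Monoisotopic masses of the elements involved, in Da.
MONO = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}

# Moiety added on conjugation (sugar minus H2O); assumed compositions.
MOIETIES = {
    "hexose": {"C": 6, "H": 10, "O": 5},
    "deoxyhexose": {"C": 6, "H": 10, "O": 4},
    "pentose": {"C": 5, "H": 8, "O": 4},
    "glucuronic acid": {"C": 6, "H": 8, "O": 6},
}


def mono_mass(formula):
    """Sum element counts times monoisotopic element masses."""
    return sum(MONO[element] * count for element, count in formula.items())


if __name__ == "__main__":
    for name, formula in MOIETIES.items():
        delta = mono_mass(formula)
        # Flag whether the shift fits a 100 Da or a 200 Da analog window.
        print(f"{name}: {delta:.4f} Da, "
              f"<=100: {delta <= 100}, <=200: {delta <= 200}")
```

All four shifts exceed 100 Da and stay below 200 Da, which is consistent with using a 200 Da analog window for glycosides and glucuronides.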


lfnothias commented 4 years ago

@RaphaelR87 that is fine, you can provide a first, non-exhaustive version just to test.

justinjjvanderhooft commented 4 years ago

@RaphaelR87 - the table in the supporting information of this paper could be a good starting point: https://www.nature.com/articles/nchembio.684


lfnothias commented 4 years ago

Excellent, @justinjjvanderhooft. @RaphaelR87, no need for the MF delta. Just the m/z delta and the name are fine.
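
A minimal Python sketch of how such a "delta m/z plus name" list could be used to label the precursor shift between a query and an analog match. Everything here is an assumption for illustration: the tab-separated layout, the function names `parse_delta_table` and `annotate`, and the 0.01 tolerance are hypothetical and do not describe the MASST prototype. It also assumes singly charged ions with the same adduct, so that the m/z difference equals the mass difference.

```python
def parse_delta_table(text):
    """Parse 'delta<TAB>name' lines; skip blank lines and '#' comments."""
    table = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        delta, name = line.split("\t", 1)
        table.append((float(delta), name.strip()))
    return table


def annotate(query_mz, match_mz, table, tol=0.01):
    """Return names of all modifications matching the observed m/z shift."""
    shift = abs(match_mz - query_mz)
    return [name for delta, name in table if abs(shift - delta) <= tol]


if __name__ == "__main__":
    # Illustrative two-entry list; real lists would be much longer.
    example = "# delta\tname\n113.08406\tLeu/Ile\n147.06841\tPhe\n"
    table = parse_delta_table(example)
    print(annotate(500.2000, 647.2684, table))
```

A name-based lookup like this only labels shifts after the search; it does not by itself widen the 100 Da search window discussed in this issue.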

justinjjvanderhooft commented 4 years ago

Whilst we are at it: https://pubmed.ncbi.nlm.nih.gov/24191063/ - also interesting m/z deltas for saccharide moieties!


RaphaelR87 commented 4 years ago

Great, yeah, I was hoping that a list already exists. There will probably be an updated one from the Dereplicator paper and/or the Norine database. I will check.

RaphaelR87 commented 4 years ago

Amino_acids_delta_m_z_Name.txt

RaphaelR87 commented 4 years ago

Norine DB has 543 monomer entries, which are listed here: https://bioinfo.cristal.univ-lille.fr/norine/rest/monomers

There is probably a way to get only the delta m/z and names out of these:
https://bioinfo.cristal.univ-lille.fr/norine/rest/monomers/flat/xml
https://bioinfo.cristal.univ-lille.fr/norine/rest/monomers/flat/json

lfnothias commented 4 years ago

Let's ask @alexeigurevich @hoseinmohimani