EMSL-Computing / CoreMS

CoreMS is a comprehensive mass spectrometry software framework
BSD 2-Clause "Simplified" License

[Feature Request] 13C constraints in molecular formula assignment #7

Closed Kzra closed 3 years ago

Kzra commented 3 years ago

It would be good if you could declare constraints on 13C atoms during formula assignment.

For example, I would like to use the following constraints: C4–50, H4–100, O2–40, N0–2, S0–1, 13C0–1

I don’t see a way of declaring 13C limits in the MSParameters class.

corilo commented 3 years ago

Hi Ezra,

The molecular formula search algorithm automatically calculates the isotopologues from the monoisotopic formula and the dynamic range of the mass spectrum, so there is no need to set limits on 13C, 34S, etc.

Is there any other reason why you would like to enter the heavy-isotope limits manually?
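To illustrate why the expected isotopologues follow from the monoisotopic formula and the dynamic range, here is a minimal sketch. It is not CoreMS's implementation: it assumes a simple binomial model for carbon only, an approximate 13C natural abundance of 1.07%, and a hypothetical `min_relative_abundance` threshold standing in for the spectrum's dynamic range. The function names are illustrative.

```python
from math import comb

P_13C = 0.0107                      # approximate natural abundance of 13C (assumption)
MASS_SHIFT_13C = 13.0033548 - 12.0  # mass difference between 13C and 12C, in Da


def isotopologue_ratio(n_carbons, n_13c, p=P_13C):
    """Abundance of the isotopologue with n_13c 13C atoms, relative to the monoisotopic peak."""
    return comb(n_carbons, n_13c) * (p / (1.0 - p)) ** n_13c


def expected_isotopologues(n_carbons, mono_mz, min_relative_abundance, charge=1, max_13c=4):
    """Return (n_13c, m/z, relative abundance) for isotopologues above the threshold."""
    peaks = []
    for k in range(1, max_13c + 1):
        ratio = isotopologue_ratio(n_carbons, k)
        if ratio < min_relative_abundance:
            continue  # too weak to be observed at this dynamic range
        peaks.append((k, mono_mz + k * MASS_SHIFT_13C / abs(charge), ratio))
    return peaks


# Hypothetical C20 compound at m/z 400, keeping peaks at or above 1% of the monoisotopic peak
for n_13c, mz, ratio in expected_isotopologues(20, 400.0, 0.01):
    print(n_13c, round(mz, 4), round(ratio, 4))
```

Under this model, how many 13C atoms can plausibly be observed depends on the carbon count and on how weak a peak the spectrum can resolve, so a fixed 13C limit is not required to bound the search.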

On Tue, Jul 6, 2021 at 06:49, Ezra Kitson @.***> wrote:

It would be good if you could declare constraints on 13C atoms during formula assignment.

For example, I would like to use the following constraints: C4–50, H4–100, O2–40, N0–2, S0–1, 13C0–1

I don’t see a way of declaring 13C limits in the MSParameters class.


Kzra commented 3 years ago

Hi Yuri,

That's great. Thanks for clarifying.

I'm trying to use CoreMS to recreate the data analysis pipeline used in Hawkes, Jeffrey A., et al. "An international laboratory comparison of dissolved organic matter composition by high resolution mass spectrometry: Are we getting the same answer?" Limnology and Oceanography: Methods 18.6 (2020): 235-258.

For formula assignment on complex DOM mixtures, the authors used the following constraints:

"To assign formulas, a theoretical neutral molecule formula list was generated based on the following constraints: C4–50, H4–100, O2–40, N0–2, S0–1, 13C0–1, 150 < m/z < 1000, 0.3 ≤ H/C ≤ 2.2, 0 < O/C ≤ 1.2, KMD ≤ 0.4 or ≥ 0.9, valence neutral (nitrogen rule), and double bond equivalents minus oxygen (DBE-O) ≤ 10. Beyond CHO containing molecular formulas, heteroatomic or isotopic formulas were allowed to contain one of the following: N1–2, S1, or 13C1. Formulas above m/z 500 were restricted from N2 assignments. ... In negative-ion mode, theoretical formula masses were calculated as deprotonated analytes (M − H)−."

I want to compare the results of a pipeline built with CoreMS against the data generated in the study, which means limiting 13C assignments. I guess I can do this by manually filtering out any assignments with 13C > 1 after running the molecular search algorithm.
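For comparison purposes, the constraints quoted from Hawkes et al. can be sketched as a standalone check on a candidate formula. This is one reading of the quoted text, not the authors' code and not a CoreMS API. Assumptions: a formula is a plain dict of element counts, the key "C" counts 12C only, elemental ratios and DBE use total carbon (12C + 13C), the nitrogen rule is implemented as an even H + N count, and the KMD criterion is left out because the quote does not define its mass scale.

```python
# Monoisotopic masses in Da, rounded to 7 decimals
MONO_MASS = {"C": 12.0, "13C": 13.0033548, "H": 1.0078250,
             "N": 14.0030740, "O": 15.9949146, "S": 31.9720712}
PROTON_MASS = 1.0072765
ELEMENT_LIMITS = {"C": (4, 50), "H": (4, 100), "O": (2, 40),
                  "N": (0, 2), "S": (0, 1), "13C": (0, 1)}


def deprotonated_mz(formula):
    """m/z of the singly charged [M - H]- ion for a neutral formula."""
    neutral = sum(MONO_MASS[el] * formula.get(el, 0) for el in MONO_MASS)
    return neutral - PROTON_MASS


def passes_constraints(formula):
    """Check a neutral formula against the quoted constraints (KMD omitted)."""
    f = {el: formula.get(el, 0) for el in ELEMENT_LIMITS}
    if any(not low <= f[el] <= high for el, (low, high) in ELEMENT_LIMITS.items()):
        return False
    carbons = f["C"] + f["13C"]  # assumption: ratios use total carbon
    mz = deprotonated_mz(f)
    if not 150 < mz < 1000:
        return False
    if not 0.3 <= f["H"] / carbons <= 2.2:
        return False
    if not 0 < f["O"] / carbons <= 1.2:
        return False
    if (f["H"] + f["N"]) % 2 != 0:  # nitrogen rule for a neutral CHNOS molecule
        return False
    dbe = carbons - f["H"] / 2 + f["N"] / 2 + 1
    if dbe - f["O"] > 10:
        return False
    # Only one of N, S, or 13C may be present beyond CHO
    if sum(1 for el in ("N", "S", "13C") if f[el] > 0) > 1:
        return False
    if mz > 500 and f["N"] == 2:  # no N2 assignments above m/z 500
        return False
    return True


print(passes_constraints({"C": 6, "H": 12, "O": 6}))
print(passes_constraints({"C": 6, "H": 13, "N": 1, "O": 6, "S": 1}))
```

A check like this could be applied to formulas exported from any assignment tool, which keeps the comparison with the published pipeline separate from the search settings themselves.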

corilo commented 3 years ago

Hi Ezra,

Yes, that would be the only way to filter out assignments with more than one 13C.

However, I would strongly discourage limiting the number of 13C atoms to one, as was done in the manuscript you mention. Doing so increases the likelihood of misassignments and the possibility of false negatives. In addition, CoreMS automatically includes all other possible heavy-isotope assignments, such as 34S and 18O, which are also neglected in this manuscript.

I should also caution that the approach used in this manuscript relies on empirically derived assumptions for the elemental constraints. In contrast, we use a confidence score approach (based on mass accuracy and fine isotopic structure) to decide the best molecular formula match for a given m/z.
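The post-search filter can be sketched in a few lines of plain Python. The row layout below is hypothetical: it assumes each assignment has been exported as a dict with a "13C" count that is missing or `None` for monoisotopic formulas. The actual column names in a CoreMS export may differ and should be checked.

```python
def filter_max_13c(assignments, max_13c=1, key="13C"):
    """Keep assignments with at most max_13c 13C atoms; a missing count is treated as 0."""
    return [row for row in assignments if (row.get(key) or 0) <= max_13c]


# Hypothetical exported assignments (illustrative values only)
rows = [
    {"formula": "C20 H30 O8", "13C": None},
    {"formula": "C19 H30 O8 13C1", "13C": 1},
    {"formula": "C18 H30 O8 13C2", "13C": 2},
]

kept = filter_max_13c(rows)
print([row["formula"] for row in kept])
```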

Kzra commented 3 years ago

Thanks for the advice. It is clear there is no need for a 13C constraint, so I will close the issue.