sirius-ms / sirius

SIRIUS is a software for discovering a landscape of de-novo identification of metabolites using tandem mass spectrometry. This repository contains the code of the SIRIUS Software (GUI and CLI)
GNU Affero General Public License v3.0

Are naturally protonated [M]+ adducts correctly interpreted? #41

Closed nirshahaf closed 4 months ago

nirshahaf commented 3 years ago

Hi,

I'm analyzing the MS spectra of a chemical standard glyco-benzofuran:

http://www.chemspider.com/Chemical-Structure.22370036.html

which seems to get naturally protonated in the MS - and so I defined the 'ionization' field in the input .ms file as [M]+:

compound NP-001626
formula C27H36O11
parentmass 536.22743870979
ionization [M]+

ms1
536.22743870979 381.23291015625
537.230671259965 195.141723632812
538.2336537032 50.3028564453125
539.237769343928 17.7765197753906

Sirius accepts the [M]+ adduct in the input and output file names, e.g.: "C27H35O11_[M]+.tsv"

However, in the interpreted spectra and in the resulting formula candidates, Sirius doesn't consider this setting and defaults to the '[M+H]+' adduct, resulting in a wrong chemical formula with one hydrogen fewer:

m/z | intensity | rel. intensity | exact mass | formula | ion
-- | -- | -- | -- | -- | --
311.129975 | 50.84 | 3.46 | 311.127786 | C19H18O4 | [M + H]+
312.137327 | 32.89 | 2.24 | 312.135611 | C19H19O4 | [M + H]+
313.141775 | 65.65 | 4.46 | 313.143436 | C19H20O4 | [M + H]+
315.157055 | 88.66 | 6.03 | 315.159086 | C19H22O4 | [M + H]+
327.157783 | 397.81 | 27.04 | 327.159086 | C20H22O4 | [M + H]+
328.16423 | 93.28 | 6.34 | 328.166911 | C20H23O4 | [M + H]+
344.163271 | 59.12 | 4.02 | 344.161825 | C20H23O5 | [M + H]+
345.168496 | 1471.32 | 100 | 345.16965 | C20H24O5 | [M + H]+
346.174581 | 367.3 | 24.96 | 346.177475 | C20H25O5 | [M + H]+
506.213676 | 144.16 | 9.8 | 506.214649 | C26H33O10 | [M + H]+
518.215092 | 96.59 | 6.56 | 518.214649 | C27H33O10 | [M + H]+
536.227439 | 0 | 0 | 536.225213 | C27H35O11 | [M + H]+

and the true formula candidate, correctly ranked, is now wrongly assigned (true formula is C27H36O11):

rank | molecularFormula | adduct | precursorFormula | SiriusScore | TreeScore
-- | -- | -- | -- | -- | --
1 | C27H35O11 | [M]+ | C27H35O11 | 84.9812761617617 | 84.9812761617617

BTW, I tried setting '--ions-considered' to [M]+ only and used the most up-to-date version, just in case; however, the results do not change.

P.S. My fragmentation trees are still infiltrated with high-resolution isotope peaks - is the '--IsotopeMs2Settings FILTER' option already implemented?

Thanks!

kaibioinfo commented 3 years ago

Hi,

Internally, [M+H]+ and [M]+ are treated equally (i.e., SIRIUS treats all [M]+ ions internally as [M+H]+ ions). The reason is that, from the perspective of the mass spectrometry instrument, there is no difference between an ion that was ionized within the ion source and an ion that was ionized before the ion source. Thus, [M]+ is no more than a flag. It seems we sometimes fail to display this correctly to the user.
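To make the equivalence concrete, here is a minimal sketch (not SIRIUS code) that computes the monoisotopic m/z of C27H36O11 read as [M]+ and of C27H35O11 read as [M+H]+. The function names are made up for illustration, the atomic masses are standard reference values, and formulas are passed as plain element-count dictionaries to keep the example short.

```python
# Monoisotopic masses in unified atomic mass units (standard reference values).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503223, "O": 15.99491461957}
ELECTRON = 0.00054857990907


def neutral_mass(formula):
    """Sum the monoisotopic masses of an element-count dict, e.g. {"C": 27, ...}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def mz_m_plus(formula):
    """[M]+: the formula itself carries the charge, so only an electron is missing."""
    return neutral_mass(formula) - ELECTRON


def mz_m_plus_h(formula):
    """[M+H]+: the neutral molecule plus a proton (one H atom minus one electron)."""
    return neutral_mass(formula) + MONOISOTOPIC["H"] - ELECTRON


# The formula from the input file, read as an intrinsically charged [M]+ ion.
as_m_plus = mz_m_plus({"C": 27, "H": 36, "O": 11})
# The formula SIRIUS reports, read as a protonated [M+H]+ ion.
as_m_plus_h = mz_m_plus_h({"C": 27, "H": 35, "O": 11})

print(f"C27H36O11 as [M]+   : {as_m_plus:.6f}")
print(f"C27H35O11 as [M+H]+ : {as_m_plus_h:.6f}")
```

Both lines print the same m/z, which also matches the 536.225213 exact mass listed for the precursor peak in the table above. The two notations describe the same ion; they differ only in which neutral formula is reported.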

Regarding the "precursorFormula": this is, unfortunately, very unintuitive, too. The main reason for the precursorFormula field is to have a "pointer" to the filename of the corresponding tree files (e.g., the json file of the tree is always [precursorFormula]_[adduct].json). That's why it is probably not so easy to change. Internally, [M]+ is treated as [M+H]+ and, therefore, subtracting the H from the formula is correct. The "molecularFormula" field, however, should show the correct formula.
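As a rough illustration of the relationship described here, the sketch below derives a precursorFormula from a molecularFormula by removing one H for [M]+ and then builds the tree file name. This is an assumption-laden toy, not the SIRIUS implementation: all helper names are hypothetical, only the [M]+ case is handled, and the formula parser only understands simple formulas without brackets or charges.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Parse a simple formula such as 'C27H36O11' into element counts."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def format_formula(counts):
    """Write element counts back as a string: C and H first, the rest alphabetically."""
    leading = [e for e in ("C", "H") if counts.get(e, 0) > 0]
    others = sorted(e for e in counts if e not in ("C", "H") and counts[e] > 0)
    return "".join(
        f"{e}{counts[e] if counts[e] > 1 else ''}" for e in leading + others
    )


def precursor_formula(molecular_formula, adduct):
    """Hypothetical helper: treat [M]+ like [M+H]+ by removing one H from the formula."""
    counts = parse_formula(molecular_formula)
    if adduct == "[M]+":
        counts["H"] -= 1
    return format_formula(counts)


def tree_file_name(molecular_formula, adduct):
    """Build the '[precursorFormula]_[adduct].json' name mentioned in the thread."""
    return f"{precursor_formula(molecular_formula, adduct)}_{adduct}.json"


print(precursor_formula("C27H36O11", "[M]+"))
print(tree_file_name("C27H36O11", "[M]+"))
```

With the compound from this issue, the derived precursor formula is C27H35O11, consistent with the "C27H35O11_[M]+.tsv" file name reported above, while the molecularFormula a user expects to see remains C27H36O11.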

So I would suggest to add two fixes:

I assume that we will revise the implementation of adducts in the future anyway. Most of these rules and implementations were done at a time when we had neither enough example data for the different adduct types nor knowledge about how the different adducts behave in the MS. But this is a lot of work, because it sits quite deep in the core of the SIRIUS code and cannot easily be fixed without changing a lot of code.

mfleisch commented 2 years ago

Linked internal issue https://git.bio.informatik.uni-jena.de/bioinf-mit/ms/sirius_frontend/-/issues/248