sirius-ms / sirius

SIRIUS is a software for discovering a landscape of de-novo identification of metabolites using tandem mass spectrometry. This repository contains the code of the SIRIUS Software (GUI and CLI)
GNU Affero General Public License v3.0

Mistakes in fragmentation tree and output spectra labels #29

Closed nirshahaf closed 6 months ago

nirshahaf commented 3 years ago

Hi,

There is a possible bug in the output format of the fragmentation tables of the Sirius output (the parent MF, C42H42O19, is correct): As fragments and heavier isotopic peaks are co-occurring in the data, I would expect them to be labelled accordingly. However, in the output table the isotopes are labelled as independent chemical entities, which are wrongly labelled as a +[H] of the monoisotopic mass, see:

image

Additionally, the corresponding fragmentation tree seems mistaken: the [M+1]+ and [M+2]+ peaks are taken as distinct fragments and become parents of subsequent sub-trees:

image
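To make the mislabelling concrete: an [M+1] isotope peak sits about 1.00335 Da (the 13C-12C mass difference) above the monoisotopic peak, whereas adding a hydrogen atom shifts the mass by about 1.00783 Da. The sketch below is not SIRIUS code. The function name `label_shift` and the 5 ppm default tolerance are assumptions chosen for illustration, and it only handles singly charged ions.

```python
# Illustrative only: not part of SIRIUS. Mass constants are standard values.
C13_C12_DIFF = 1.0033548   # Da, mass difference between 13C and 12C
H_ATOM_MASS = 1.0078250    # Da, monoisotopic mass of a hydrogen atom


def label_shift(mono_mz, peak_mz, tol_ppm=5.0):
    """Label a peak lying about 1 Da above a singly charged monoisotopic peak.

    Returns "isotope (+13C)", "+H", "ambiguous" or "unassigned".
    tol_ppm is a hypothetical tolerance; set it to the instrument's accuracy.
    """
    tol = peak_mz * tol_ppm * 1e-6          # absolute tolerance in Da
    delta = peak_mz - mono_mz               # observed mass shift
    is_iso = abs(delta - C13_C12_DIFF) <= tol
    is_h = abs(delta - H_ATOM_MASS) <= tol
    if is_iso and is_h:
        return "ambiguous"
    if is_iso:
        return "isotope (+13C)"
    if is_h:
        return "+H"
    return "unassigned"
```

The two shifts differ by only about 4.5 mDa, so at higher m/z or with wider tolerances they can become indistinguishable by mass alone, which is one reason co-occurrence and intensity pattern matter for labelling.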

kaibioinfo commented 3 years ago

Woah, this is a mean one: the subtree of [C42H41O18 + 1] looks perfect: several hexose moieties, water losses. Sure, it gets a high score and is picked, although the "OH" loss and the "O" loss are both strange.

By default, there is no special treatment of isotope peaks. You have to enable this option explicitly. In the CLI you can do that with the --IsotopeMs2Settings SCORE option. Instead of SCORE you can also write FILTER (not implemented yet for some reason, I already added a ticket) and IGNORE (default).

Just a few words on isotope detection in MS/MS: The idea of the FILTER option is to just remove isotope peaks before starting the computation. It's a very simple heuristic that checks for monotonically decreasing peaks with the right mass following high-intensity peaks and removes them. It is not enabled by default, because isotope peaks usually only appear for specific instruments (like Bruker) and setups (like DIA). So we do not want to accidentally remove real signal peaks on an Agilent Q-ToF where we do not see isotope peaks in MS/MS anyway.

The SCORE option, however, integrates the isotope peaks into the optimization problem. This makes the optimization problem much harder (because isotope peaks violate the assumption that parent peaks always have larger masses than their children; this assumption is necessary for many tricks to speed up the computation). Thus, expect that computation will take more time when enabling this option.

Second problem: Isotope peaks in MS/MS can look very different. To be honest, I have never seen isotope peaks as clean as in your example above. Usually, their intensities are totally disturbed and "strange". This is because the isotope intensity in MS/MS depends totally on the instrument and the isolation window setting. It is really hard to find some general setup or scoring for isotopes that works for all kinds of instruments and isolation windows. In DIA, of course, you have perfect isotopes. But so far we have had rather bad experiences with DIA methods: they produce far fewer fragments than DDA and a lot of chimeric peaks from isobaric compounds. Thus, I would not recommend using SIRIUS on DIA without tuning it on DIA beforehand. Which kind of instrument/setup did you use for the measurement above?
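The FILTER idea described above (drop peaks at the expected isotope mass with decreasing intensity after an intense peak) can be sketched roughly as follows. This is not the SIRIUS implementation. The function name and all thresholds (`tol_da`, `min_rel_intensity`, `max_isotopes`) are hypothetical, and the sketch assumes singly charged ions.

```python
# Illustrative only: a simplified stand-in for the heuristic, not SIRIUS code.
C13_C12_DIFF = 1.0033548  # Da, spacing between isotope peaks at charge 1


def filter_isotope_peaks(peaks, tol_da=0.005, min_rel_intensity=0.1,
                         max_isotopes=3):
    """Remove peaks that look like isotope peaks of an intense peak.

    peaks: list of (mz, intensity) tuples.
    All thresholds are hypothetical defaults chosen for illustration.
    """
    peaks = sorted(peaks)
    if not peaks:
        return []
    base = max(intensity for _, intensity in peaks)
    removed = set()
    for idx, (mz, intensity) in enumerate(peaks):
        # Only start a chain at an intense peak that was not itself removed.
        if idx in removed or intensity < min_rel_intensity * base:
            continue
        prev_intensity = intensity
        for k in range(1, max_isotopes + 1):
            target = mz + k * C13_C12_DIFF
            # Candidates: right mass and strictly lower intensity than the
            # previous peak in the chain (monotonically decreasing pattern).
            cands = [j for j, (m, i) in enumerate(peaks)
                     if j not in removed
                     and abs(m - target) <= tol_da
                     and i < prev_intensity]
            if not cands:
                break
            best = max(cands, key=lambda j: peaks[j][1])
            removed.add(best)
            prev_intensity = peaks[best][1]
    return [p for j, p in enumerate(peaks) if j not in removed]
```

Note how a peak at the isotope mass that is more intense than its predecessor is kept; this is the kind of case where a real fragment could otherwise be removed by accident.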

nirshahaf commented 3 years ago

Hi and thanks!

It seems that the '--IsotopeMs2Settings' is still not a valid option in the current version (4.6.1)

Regarding DIA, we are using the "Ramp" (MS^E) option on the popular and trusty Waters Synapt-G1 instrument. Following earlier discussions we increased the ionization energy values and actually get nice fragmentation patterns for most compounds - at least in injections of chemical standards - such as the one in the example above. On more modern Waters instruments, BTW, we get many more fragments - especially in the lower mass ranges. The isotopic peaks are usually measured well also in samples of biological matrices - however the co-occurrence of unrelated peaks is a curse which is difficult to dispel - I am actually hoping that Sirius can aid by removing non-relevant fragments and leaving much cleaner spectra. Visually, this strategy seems to be working well (disregarding the MS2 isotopes) but since we have no a priori knowledge about the fragmentation patterns of most of these chemical standards, nor a reliable external reference, I don't see a good way to systematically estimate the performance of the virtual fragmentation with this data. I assume that the best solution for the correct formula detects and correctly annotates the most meaningful fragments.

nirshahaf commented 3 years ago

Hi, from which version is the '--IsotopeMs2Settings' option implemented?

mfleisch commented 3 years ago

Hey, it is implemented but you need to use the general purpose (auto-generated) config subtool to set this parameter. This config tool allows you to manipulate every parameter available in the "SIRIUS toolbox". Only the most important parameters are presented in a more user friendly manner as parameters in the specific subtools. Otherwise the CLI would be super overloaded for the average use cases.

A command would look like this:

./sirius -i <INPUT> -o <OUTPUT> config --IsotopeMs2Settings=SCORE formula fingerid canopus

To get a list of possible parameters run

./sirius config -h

nirshahaf commented 3 years ago

Hi,

I tried the suggested options ('FILTER' or 'SCORE') - but both resulted in the same wrong tree (V4.6.1 CLI, V4.8.2 GUI):

image

...it also seems that 'FILTER' is not yet implemented (?) I also noticed the 'QTOF isotopes' present - but selecting it did not change the results...

nirshahaf commented 3 years ago

BTW, is there an option to specify 'config' parameters in a persistent manner, e.g., by editing the /.sirius-4.x/custom.config file?

mfleisch commented 3 years ago

> BTW, is there an option to specify 'config' parameters in a persistent manner, e.g., by editing the /.sirius-4.x/custom.config file?

Yes, you should be able to achieve exactly this by using the custom.config file.

chrispook commented 2 years ago

Hello, I am trying out SIRIUS on my DIA data as an alternative to MS-FINDER, which can be very slow (weeks) with thousands of features. Can you clarify what you mean below by "tuning" SIRIUS on DIA data?

> Second problem: Isotope peaks in MS/MS can look very different. To be honest, I have never seen isotope peaks as clean as in your example above. Usually, their intensities are totally disturbed and "strange". This is because the isotope intensity in MS/MS depends totally on the instrument and the isolation window setting. It is really hard to find some general setup or scoring for isotopes that works for all kinds of instruments and isolation windows. In DIA, of course, you have perfect isotopes. But so far we have had rather bad experiences with DIA methods: they produce far fewer fragments than DDA and a lot of chimeric peaks from isobaric compounds. Thus, I would not recommend using SIRIUS on DIA without tuning it on DIA beforehand. Which kind of instrument/setup did you use for the measurement above?

To clarify: My data comes from MS-DIAL, using CorrDec deconvolution. https://pubs.acs.org/doi/abs/10.1021/acs.analchem.0c01980

I think this provides really great deconvolution for DIA. I have been doing a lot of GC-MS and robust deconvolution algorithms have been available for >30 years (AMDIS). The CorrDec spectra gave me quite satisfactory library matches, aligning well with manual annotations of three different (RTs ~ 13.07, 13.35, 13.46) isomers that I created using empirical MS/MS & NMR data for my study species.

However, I've noticed a couple of oddities with my SIRIUS results, besides the lack of isotopic peak annotation that brought me to this issue page. Firstly, SIRIUS doesn't offer an ammonium adduct as part of its initial list. This seems odd, as ammonium adducts are highly prevalent in any MS method that uses ammonium salts as a mobile phase modifier. I am also confused by the inclusion of negative and positive adducts in the same list. Secondly, where I know I have ammonium adducts [M + NH4]+, the fragmentation trees never include the [M - NH3]+ fragment, even when this is highly abundant in the MS/MS spectra. I have lots of spectra to share if it helps, but I'm dumping a zip of screenshots here for now.

Thanks.
Chris

drive-download-20220103T072126Z-001.zip
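For reference, the mass arithmetic behind the ammonium adduct point can be sketched as below, assuming the fragment meant is the ammonium adduct after neutral loss of NH3. Under that assumption the fragment has the same m/z as [M+H]+. The helper names and the neutral mass of 500.0 Da are hypothetical; the constants are standard monoisotopic values rounded to six decimals.

```python
# Illustrative only: standard monoisotopic masses, rounded to six decimals.
PROTON = 1.007276     # Da, mass of H+
NH3 = 17.026549       # Da, monoisotopic mass of ammonia
NH4_ION = 18.033826   # Da, mass of the ammonium ion NH4+

ADDUCT_SHIFTS = {"[M+H]+": PROTON, "[M+NH4]+": NH4_ION}


def adduct_mz(neutral_mass, adduct):
    """m/z of a singly charged adduct ion for a given neutral mass."""
    return neutral_mass + ADDUCT_SHIFTS[adduct]


def after_nh3_loss(precursor_mz):
    """m/z after a neutral loss of NH3 from a singly charged precursor."""
    return precursor_mz - NH3


neutral = 500.0  # hypothetical neutral monoisotopic mass in Da
ammonium = adduct_mz(neutral, "[M+NH4]+")
protonated = adduct_mz(neutral, "[M+H]+")
fragment = after_nh3_loss(ammonium)
```

Here `fragment` and `protonated` agree to within rounding of the constants, so an abundant peak at that m/z is consistent with NH3 loss from the ammonium adduct.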

AharoniLab commented 2 years ago

Hi Chris, regarding "tuning" Sirius to DIA data, I would look at two parameters: a) the MS2 mass measurement accuracy, which should reflect your instrument's performance; b) the isotopic pattern score multiplier, which is in the config file and should reflect your confidence in the accuracy of isotope abundance measurements. Together, these two parameters can help float the true chemical formula annotations towards higher ranks, which later propagates to better DB annotations. To the best of my knowledge, neither the 'SCORE' nor the 'FILTER' Sirius option works, and the MS2 isotopes detected in DIA remain "as-is" - that is at least with all the versions that I've used. Word of caution: some software packages distort the isotopic pattern abundance measurement during peak integration, and you might want to check with Hiroshi Tsugawa regarding how MS-DIAL handles this. HTH, Nir.
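One way to choose a sensible value for the MS2 mass accuracy in (a) is to measure the ppm error of known fragments from chemical standards on your own instrument. A minimal sketch follows; it is generic arithmetic, not a SIRIUS API, and the function names are made up for illustration.

```python
# Illustrative only: generic ppm arithmetic, unrelated to any SIRIUS API.
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, max_ppm):
    """True if the observed peak lies within +/- max_ppm of the theory."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= max_ppm


def ppm_to_da(mz, ppm):
    """Absolute width in Da of a ppm tolerance at a given m/z."""
    return mz * ppm * 1e-6
```

For example, a peak observed at m/z 500.005 for a theoretical 500.000 is off by about 10 ppm, so a 5 ppm setting would reject it while a 20 ppm setting would accept it.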