sirius-ms / sirius

SIRIUS is a software for discovering a landscape of de-novo identification of metabolites using tandem mass spectrometry. This repository contains the code of the SIRIUS Software (GUI and CLI)
GNU Affero General Public License v3.0

High standards filter? #167

Open typewritermonkey opened 1 week ago

typewritermonkey commented 1 week ago

Does the filter have really high standards, or is most of my data just junk? I'm down from 22,000 features to 142 of 'good' or 'decent' quality.

[Screenshot: sirius high standards]

Also, if we do the FBMN export, which option should we choose on the GNPS FBMN page that asks what the source of the feature quantification table was? Sirius isn't in the dropdown menu. Or do we need to process in MZmine first to make an MGF, then process in Sirius and export to FBMN from there?

[Screenshot: sirius fbmn fail]

kaibioinfo commented 1 week ago

Could be. It's very unlikely to have 22k compounds in a single experiment. 142 does sound a bit low, though; I would expect around 1k good and decent features. The thresholds have not yet been evaluated on many datasets, so we may have to adjust them.

Regarding the FBMN export, I will ask Ming.

typewritermonkey commented 1 week ago

I think I figured out what the issue with the filter was. Almost all of my features are listed as [M+?] adducts, which I assume means that Sirius couldn't tell what the adduct type was, so they were getting filtered out. I get higher adduct annotation rates with MS-DIAL, though I don't really know how accurate they are. Usually when I make FBMN IIMN networks with GNPS after MZmine, it lists most of the nodes as feature nodes rather than ion identity nodes, which I think means it doesn't know the adduct type of most of my features either. It's hard to predict the formula when you can't even be sure what the exact mass of the precursor was. I did SPE to try to reduce the salts before LCMS, and I thought that plus the formic acid would mean most adducts would be the simple [M+H]+ type, but I'm not sure what the exact percentage would be. Maybe it would be worth just telling the software to assume everything is an [M+H]+ even if that means it's sometimes wrong, idk. Or maybe just use MZmine for adduct assignment before importing the .mgf into Sirius, but then I can't do the peak quality filter.
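To make the mass ambiguity concrete, here is a minimal sketch of how the implied neutral mass changes with the assumed adduct. The precursor m/z is an illustrative value, the function name is hypothetical, and the adduct list is only a small subset of what real tools consider; this is not how Sirius, MS-DIAL, or MZmine assign adducts.

```python
# Mass shifts (Da) of common singly charged positive-mode adducts.
# Standard monoisotopic ion masses, electron mass already subtracted.
ADDUCT_SHIFTS = {
    "[M+H]+": 1.007276,
    "[M+NH4]+": 18.033823,
    "[M+Na]+": 22.989218,
    "[M+K]+": 38.963158,
}


def neutral_mass_candidates(precursor_mz):
    """Return the neutral mass implied by each candidate adduct (hypothetical helper)."""
    return {
        adduct: round(precursor_mz - shift, 6)
        for adduct, shift in ADDUCT_SHIFTS.items()
    }


# Illustrative precursor m/z, not taken from any real dataset.
candidates = neutral_mass_candidates(203.0526)
for adduct, mass in candidates.items():
    print(f"{adduct:10s} neutral mass = {mass:.4f}")
```

The candidate neutral masses are spread over tens of Da, so each adduct hypothesis leads to a different set of possible molecular formulas.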

kaibioinfo commented 1 week ago

[M+?]+ means the adduct is not absolutely clear. An adduct might even have been detected, but if that assignment has low probability, it's better to also consider alternative explanations. In the end, the adduct can still be figured out in the structure annotation step, so I would not fix it too early.

If you want to enforce [M+H]+ whenever no adduct is known you can select [M+H]+ as 'fallback' adduct.
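Conceptually, a fallback only touches features whose adduct is unknown and leaves detected adducts alone. A minimal sketch of that idea follows; the function and variable names are hypothetical and this is not SIRIUS code or its API.

```python
# Label used in this thread for features whose adduct is not clear.
UNKNOWN_ADDUCT = "[M+?]+"


def apply_fallback(adducts, fallback="[M+H]+"):
    """Replace unknown adduct labels with the fallback; keep known ones (hypothetical helper)."""
    return [fallback if a == UNKNOWN_ADDUCT else a for a in adducts]


# Made-up example annotations, one per feature.
detected = ["[M+?]+", "[M+Na]+", "[M+?]+", "[M+H]+"]
resolved = apply_fallback(detected)
print(resolved)
```

The trade-off is the one mentioned above: every unknown feature is then treated as [M+H]+, including those where that assumption is wrong.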

typewritermonkey commented 1 week ago

Ah makes sense thanks

typewritermonkey commented 1 week ago

I just tried FBMN on GNPS2 with a Sirius export, but it failed there as well.
