sirius-ms / sirius

SIRIUS is a software for discovering a landscape of de-novo identification of metabolites using tandem mass spectrometry. This repository contains the code of the SIRIUS Software (GUI and CLI)
GNU Affero General Public License v3.0

Exported feature quantification table has no data in one of the columns #179

Closed typewritermonkey closed 3 months ago

typewritermonkey commented 3 months ago

Hi

I just ran the FBMN export in Sirius 6.1 based on 2 imported mzML files, but it is only showing results for one of them. The column for the other file is zeros all the way down, even though that file definitely contained data.


mfleisch commented 3 months ago

Hey, there was a bug where the preprocessing did not work correctly when importing exactly 2 runs. Can you please retry with 6.0.4 and check if the problem persists?

I am closing the issue for now. Please reopen it if the problem persists with 6.0.4.

typewritermonkey commented 3 months ago

Thanks Markus, I updated to 6.0.4.

I'm experimenting to find the maximum number of features it can process with Zodiac enabled and 64GB RAM without freezing and crashing. Last night I set it to run with 10,000 features and by this morning it had crashed. It's a bit strange, since earlier versions of Sirius could complete without crashing on my old PC, which had less RAM; they just took a lot longer.

I like Sirius's built-in filtering features, which let you easily reduce the number of features to process, but to get Sirius to help enhance my GNPS FBMN networks I have to start from an imported .mgf from mzmine. So to reduce the number of starting features I increased the minimum feature height from 5e4 to 1e6 in the mzmine settings, which got me down from 20,000 features to 10,000, but that was still too many.
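The effect of raising the minimum feature height can be previewed before re-running mzmine. Below is a minimal Python sketch, assuming the feature heights have already been read from a feature table into a plain list. The function name `count_surviving_features` and the example heights are hypothetical and are not part of mzmine or SIRIUS.

```python
from typing import Dict, Iterable, List


def count_surviving_features(heights: Iterable[float],
                             thresholds: List[float]) -> Dict[float, int]:
    """Count how many features have a height at or above each threshold."""
    heights = list(heights)
    return {t: sum(1 for h in heights if h >= t) for t in thresholds}


# Made-up heights for illustration only (not real data).
example_heights = [3e4, 8e4, 2e5, 9e5, 1.5e6, 4e6]
counts = count_surviving_features(example_heights, [5e4, 1e6])
for threshold, n in counts.items():
    print(f"min height {threshold:.0e}: {n} features")
```

Sweeping a few candidate thresholds this way gives an idea of how many features would reach SIRIUS for each setting, without having to repeat the full export each time.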

The last job I ran completed successfully without Zodiac enabled, but out of 10,000 features it only annotated 13 structures with a confidence above 0.9. That made me wonder if I should try turning off the MS2 noise filter in mzmine before exporting the .mgf, since I read in an old Sirius manual that it works better with unfiltered MS2 data. That's what I'm trying now, and I'm hoping it will increase the confidence of some of the annotations.

My other question is: how much difference does Zodiac make to the confidence of the formula and structure predictions? Would I be better off raising my minimum feature height in mzmine again and running, say, 5,000 features with Zodiac enabled, or 10,000 features without Zodiac enabled?


mfleisch commented 3 months ago

Hey, why do you apply a confidence threshold of 0.9? It is very unlikely that you will get many annotations with a confidence that high. The confidence score is not an FDR estimate. In our evaluation in the COSMIC paper we found that a confidence score of 0.64 corresponds to approx. FDR 10% on the evaluation data.

From the paper: "We used a confidence score threshold of 0.64, roughly corresponding to FDR 10% (Extended Data Fig. 9)."

See https://doi.org/10.1038/s41587-021-01045-9 for details.
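As a rough illustration of how the choice of threshold changes the number of retained annotations, here is a minimal Python sketch. It assumes the structure annotations are available as records with a numeric confidence value. The field names `feature_id` and `confidence`, the function `filter_by_confidence`, and the example values are hypothetical. The 0.64 cutoff corresponds to roughly FDR 10% on the COSMIC evaluation data only, so it may behave differently on other datasets.

```python
from typing import Dict, List

# Approx. FDR 10% on the COSMIC evaluation data; not a guarantee elsewhere.
COSMIC_THRESHOLD = 0.64


def filter_by_confidence(hits: List[Dict], threshold: float) -> List[Dict]:
    """Keep annotations whose confidence score is at or above the threshold."""
    return [h for h in hits if h["confidence"] >= threshold]


# Made-up records for illustration only (not real results).
example_hits = [
    {"feature_id": "1", "confidence": 0.95},
    {"feature_id": "2", "confidence": 0.71},
    {"feature_id": "3", "confidence": 0.64},
    {"feature_id": "4", "confidence": 0.30},
]

strict = filter_by_confidence(example_hits, 0.9)
relaxed = filter_by_confidence(example_hits, COSMIC_THRESHOLD)
print(len(strict), "annotations at >= 0.9")
print(len(relaxed), "annotations at >= 0.64")
```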

If filtering your features is an issue, you could try preprocessing the data with SIRIUS (just drag'n'drop the mzML files) -> filter by feature quality -> export for Molecular Networking -> GNPS.

Regarding ZODIAC: it is unlikely to directly impact the confidence scores. I would suggest analyzing without ZODIAC first, and retrying with ZODIAC only if you see in the results that formula identification is an issue. Otherwise you should be fine running SIRIUS formula identification only.

typewritermonkey commented 3 months ago

Hi Markus

Hmm, OK. If I go down to 0.7, that will give me a lot more candidates. I guess the FDR depends on the databases you are searching as well. Since I know I'm looking for microbial natural products, I only look at hits from the Natural Products Atlas, which I think should lower the FDR at the cost of missing some relevant candidates that aren't in there.

Thanks for your help

