sirius-ms / sirius

SIRIUS is a software for discovering a landscape of de-novo identification of metabolites using tandem mass spectrometry. This repository contains the code of the SIRIUS Software (GUI and CLI)
GNU Affero General Public License v3.0

Sirius 5 - MZmine 3 Integration #69

Closed SteffenHeu closed 4 months ago

SteffenHeu commented 2 years ago

Hey all,

as discussed earlier, I'm working on an MZmine integration for importing results from a Sirius project and running Sirius from MZmine via the CLI.

Notes (for your changelog):

Questions (more will surely come):

  1. @mfleisch You mentioned that the directories will be zipped. I saw that behaviour once, but when I run it now, I get the same project space I got with Sirius 4. I run my test like this:

INFO 14:20:05 - Running with following arguments: [-i, F:\sirius_temp\test\100_ms_aligned_corr.mgf, --output, F:\sirius_temp\test\project5, formula, -d, F:\sirius_temp\test\db, fingerprint, structure, -d, F:\sirius_temp\test\db, write-summaries]

What's the intended behaviour? Does it have to do with the .compression file? It has this content:

compressionLevels   1
compressionMethod   DEFLATED

Maybe the default was only set once and then altered somehow?

  2. Would you recommend running the CLI as a single command as above or running formula, fingerprint and structure sequentially?

mfleisch commented 2 years ago

  1. A newly created project-space should always use the compressed version. However, if you modify an existing project-space (in the old format), SIRIUS will stay with the old format. You can convert between formats by copying the project-space with the project-space CLI tool:

     sirius -i </path/to/old/project> -o </path/to/new/project> project-space

     If .compression looks like what you have shown, compression of method-level folders is enabled. But you need to compute some results to notice a difference compared to the non-compressed format, because otherwise there are simply no directories that could be compressed.

  2. If you compute everything in one command, SIRIUS can run local and remote jobs in parallel, so the overall running time should be lower. If the combined command crashes, it can be rerun on the same project-space (but without importing the source files again), and SIRIUS should skip steps where results already exist (unless --recompute is set). Long story short, we recommend computing everything in one command unless there is a good reason not to :-)
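
For a wrapper that launches SIRIUS from another tool, the single-command approach can be sketched roughly as follows. This is only an illustration: the argument order is copied from the log line earlier in this thread, while the helper name `build_sirius_args` and the example paths are made up here, and the executable name `sirius` on the PATH is an assumption.

```python
from pathlib import Path
from typing import List


def build_sirius_args(input_file: Path, project_dir: Path, db_dir: Path) -> List[str]:
    """Chain formula, fingerprint, structure and write-summaries in one call.

    Hypothetical helper; the argument order mirrors the log line in this thread.
    """
    return [
        "-i", str(input_file),
        "--output", str(project_dir),
        "formula", "-d", str(db_dir),
        "fingerprint",
        "structure", "-d", str(db_dir),
        "write-summaries",
    ]


# Example paths are placeholders, not real files.
args = build_sirius_args(Path("spectra.mgf"), Path("project5"), Path("db"))
print(" ".join(["sirius", *args]))

# To actually launch it (assuming an executable called `sirius` is on the PATH):
# import subprocess
# subprocess.run(["sirius", *args], check=True)
```

Keeping the arguments as a list rather than one string avoids quoting problems with paths that contain spaces.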

SteffenHeu commented 2 years ago

Thanks for the reply:

  1. I think I traced the issue: if the project directory already exists (even if it is totally empty), the files are not zipped. Is there a command to make Sirius always zip the project? Otherwise the output is kind of unpredictable, depending on what the user selects as a folder for the project. However, since I plan to use the summary files, I don't think it will be an issue, as they are not zipped.

  2. that's good to know!
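
One possible workaround on the calling side, sketched under the assumption that the behaviour observed above holds (a pre-existing directory leads to the uncompressed layout), is to hand SIRIUS a path that does not exist yet. The helper name `fresh_project_path` and the numeric-suffix scheme are hypothetical and not part of SIRIUS or MZmine.

```python
import tempfile
from pathlib import Path


def fresh_project_path(requested: Path) -> Path:
    """Return `requested` if nothing exists there, else a sibling with a numeric suffix."""
    if not requested.exists():
        return requested
    counter = 1
    while True:
        candidate = requested.with_name(f"{requested.name}_{counter}")
        if not candidate.exists():
            return candidate
        counter += 1


# Demo in a throwaway directory: the requested folder already exists,
# so a sibling path is suggested instead.
with tempfile.TemporaryDirectory() as tmp:
    existing = Path(tmp) / "project5"
    existing.mkdir()
    print(fresh_project_path(existing).name)  # project5_1
```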

SteffenHeu commented 2 years ago

Hey @mfleisch,

Is it beneficial to remove obvious isotope signals from MS/MS spectra?

E.g., in cases like this?

[image: grafik]

kaibioinfo commented 2 years ago

Is it so obvious? Peaks with 1 Da difference appear quite often and might just belong to a different formula. If an isotope peak is labeled as "H"-difference, it is usually heavily penalized and often not annotated (like in your example graphic).

We want to add an isotope filter, but besides Bruker instruments, where this is really a problem, I don't think that isotope peaks affect the annotation rates.

What would be important, though, is that isotope peaks of the precursor are retained in the MS/MS spectrum! What really makes us crazy sometimes is when software removes all peaks beyond the precursor (including isotope peaks) but keeps isotope peaks within the MS/MS spectrum. Any smarter isotope filter would use the information of the precursor isotope peaks for filtering fragment isotope peaks.
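
As a rough illustration of that point, an exporter that trims peaks above the precursor could still keep the precursor's own isotope peaks. The sketch below is not how SIRIUS or MZmine does it: the function name, the tolerance, the number of isotope peaks kept, and the use of the approximate 13C/12C spacing are all assumptions made for the example.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)
C13_SPACING = 1.003355  # approximate 13C - 12C mass difference in Da


def trim_above_precursor(
    peaks: List[Peak],
    precursor_mz: float,
    charge: int = 1,
    max_isotopes: int = 2,
    tol: float = 0.01,
) -> List[Peak]:
    """Drop peaks above the precursor, but keep the precursor's isotope peaks."""
    step = C13_SPACING / abs(charge)
    # Expected m/z positions of the precursor's M+1, M+2, ... peaks.
    allowed = [precursor_mz + k * step for k in range(1, max_isotopes + 1)]
    kept = []
    for mz, intensity in peaks:
        if mz <= precursor_mz + tol:
            kept.append((mz, intensity))  # fragment or monoisotopic precursor
        elif any(abs(mz - target) <= tol for target in allowed):
            kept.append((mz, intensity))  # precursor isotope peak
    return kept


# Made-up spectrum: fragment, precursor, two precursor isotopes, one stray peak.
spectrum = [(100.0, 10.0), (300.1000, 100.0), (301.1034, 15.0),
            (302.1067, 3.0), (305.2, 8.0)]
trimmed = trim_above_precursor(spectrum, precursor_mz=300.1000)
print(len(trimmed))  # 4
```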

SteffenHeu commented 2 years ago

> We want to add an isotope filter, but besides Bruker instruments, where this is really a problem, I don't think that isotope peaks affect the annotation rates.

Ok 👍, which procedure would you currently recommend for Bruker instruments? (This data comes from a timsTOF flex.) Running an isotope filter with low m/z tolerances, or keeping the isotope peaks in anyway? In Sirius 4 I saw the QTOF (isotopes) preset, which does not exist anymore. Would exporting the isolation window help?

> What would be important, though, is that isotope peaks of the precursor are retained in the MS/MS spectrum! What really makes us crazy sometimes is when software removes all peaks beyond the precursor (including isotope peaks) but keeps isotope peaks within the MS/MS spectrum. Any smarter isotope filter would use the information of the precursor isotope peaks for filtering fragment isotope peaks.

That's good to know. Currently the spectra are not preprocessed for the Sirius export (except for MS2 merging if the user selects it, and the default MS2 merging in a single ramp for IMS data).