sirius-ms / sirius

SIRIUS is a software for discovering a landscape of de-novo identification of metabolites using tandem mass spectrometry. This repository contains the code of the SIRIUS Software (GUI and CLI)
GNU Affero General Public License v3.0
78 stars 17 forks source link

FBMN Export: Partial quant table export only? #64

Closed Adafede closed 2 years ago

Adafede commented 2 years ago

Hi!

I am testing a the new features you added within the 5.4.1 pre-release.

Tons of nice things!

I just tried to drag an mzML in the GUI, it gave me n detected features. I then clicked on FBMN Export, and in the obtained quant_table.csv, I have m << nrows. I tried some filters (peak shapes, etc.) to see if I managed to find the rule but I didn't...

So:

Happy to share a test mzML to diagnose if needed! 😄

P.S.: The MGF contains the right number of rows (n).

kaibioinfo commented 2 years ago

Hm, I find two possible reasons for missing features in the quant table:

I think the latter is more likely: to simplify the interoperability with GNPS workflows, we export the name of the feature (which could be the feature ID in an FBMN workflow) instead of its ID (which would be the directory name in the SIRIUS project space). However, the name does not have to be unique and it could be possible that two features have the same name. In this case only one of them is exported. I would suggest to add a Warning whenever two features with same name are exported and replace the name with the ID in this case.

kaibioinfo commented 2 years ago

Oh, but a test mzML would be helpful!

Adafede commented 2 years ago

DDA1.mzML.zip

kaibioinfo commented 2 years ago

Thx. It was indeed a naming problem. When importing mzML files without feature alignment, the feature id is set to the scan id of the apex. Of course, multiple compounds can occur in the same scan and, thus, get duplicate feature IDs. Now, the feature ID is an incrementing counter prefixed by the mzml file name. I wonder if this has any impact on the GNPS workflows (is the feature ID required to be an integer??)

Adafede commented 2 years ago

Do you have a test output? Before manipulating the MGF manually also tagging FBMN expert @lfnothias, will probably be quicker. 😜

mfleisch commented 2 years ago

From the commits it looks like @kaibioinfo provided the suggested fix and the changes are available via the latest pre-release build.

Feel free to reopen if it not solve the problem an further discussions are needed.

Adafede commented 2 years ago

Confirm that it was fixed!