smith-chem-wisc / FlashLFQ

Ultra-fast label-free quantification algorithm for mass-spectrometry proteomics
GNU Lesser General Public License v3.0

Error while parsing mzMLs converted from tdf #141

Closed: Cajac102 closed this issue 9 months ago

Cajac102 commented 10 months ago

Hey,

When running FlashLFQ with mzMLs that were converted from tdf (with tdf2mzml), I run into errors while reading in the files:

Reading spectra file
Problem opening .mzML file data/dda/mzmls/TP17632AUH_Slot2-29_1_18113.mzML; Invalid URI: The format of the URI could not be determined.

The mzML can be parsed by other tools/parsers, and I don't get that error with other mzMLs converted from raw/mgf. Could you help me out here? I can also send you the file if necessary. I'm using FlashLFQ via docker (v1.2.4, but 1.2.5 and 1.2.6 give similar error messages) on Debian 12.

Cheers, Caro

trishorts commented 10 months ago

Can you share the file? I will see what the problem is.

Cajac102 commented 10 months ago

Here's the download link (3.2GB): https://drive.google.com/file/d/1ahK0zWZh7X8ABVzxRWmATaPgOJa0I6ho/view?usp=share_link

trishorts commented 9 months ago

OK. I found the problem and was able to work around it, but I don't yet understand the root cause. The problem is the source file location string, which appears in two places near the top of the mzML. [image]
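For anyone who wants to check those strings without opening a multi-gigabyte file in an editor, here is a minimal Python sketch. It assumes the usual mzML layout, where the sourceFile elements sit inside fileDescription near the top of the file, so parsing can stop there. The function name list_source_files and the sample values are hypothetical and are not taken from the file in this issue.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace such as '{http://psi.hupo.org/ms/mzml}' from a tag."""
    return tag.rsplit("}", 1)[-1]


def list_source_files(stream):
    """Return (id, name, location) for each sourceFile element in the header.

    Parsing is incremental and stops at the end of fileDescription, so the
    spectra that make up most of the file are never read.
    """
    found = []
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "sourceFile":
            found.append((elem.get("id"), elem.get("name"), elem.get("location")))
        elif name == "fileDescription":
            break  # assumption: all sourceFile elements live inside this block
    return found


# Hypothetical header fragment, only for illustration.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<mzML xmlns="http://psi.hupo.org/ms/mzml">
  <fileDescription>
    <sourceFileList count="2">
      <sourceFile id="SF1" name="analysis.tdf" location="data/run.d"/>
      <sourceFile id="SF2" name="analysis.tdf_bin" location="data/run.d"/>
    </sourceFileList>
  </fileDescription>
  <run id="run1"/>
</mzML>
"""

if __name__ == "__main__":
    for entry in list_source_files(io.BytesIO(SAMPLE)):
        print(entry)
```

For a real file, pass a handle opened in binary mode, for example `with open(path, "rb") as fh: list_source_files(fh)`.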

If I replace the string with a file path (it doesn't have to be the actual file path), the file will load. [image]

This would work as a temporary fix, but you need a text editor that can open a huge file, such as Notepad++.
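If no suitable editor is at hand, the same temporary fix could be scripted. The sketch below streams the file line by line and rewrites only the location attribute of sourceFile tags, turning a bare relative string into a path by prepending "/". It rests on several assumptions: each sourceFile tag and its location attribute sit on a single line, attributes use double quotes, and a leading "/" is enough to make the string loadable. The helper names and the sample value are hypothetical.

```python
import io
import re

# Matches the location attribute inside a <sourceFile ...> tag on one line.
LOCATION_RE = re.compile(r'(<sourceFile\b[^>]*?\blocation=")([^"]*)(")')
# Windows-style absolute paths such as C:\data are left untouched.
DRIVE_RE = re.compile(r"^[A-Za-z]:[\\/]")


def fix_location(value):
    """Prepend '/' unless the value already looks like an absolute path or URI."""
    if value.startswith("/") or "://" in value or DRIVE_RE.match(value):
        return value
    return "/" + value


def patch_source_file_locations(src, dst):
    """Copy src to dst line by line, patching sourceFile locations.

    Returns the number of lines that were changed.
    """
    patched = 0
    for line in src:
        if "<sourceFile" in line:
            new_line = LOCATION_RE.sub(
                lambda m: m.group(1) + fix_location(m.group(2)) + m.group(3), line
            )
            if new_line != line:
                patched += 1
                line = new_line
        dst.write(line)
    return patched


if __name__ == "__main__":
    # Hypothetical input, only for illustration.
    demo_in = io.StringIO(
        '<sourceFile id="SF1" name="analysis.tdf" location="data/run.d">\n'
        '<spectrum index="0"/>\n'
    )
    demo_out = io.StringIO()
    print(patch_source_file_locations(demo_in, demo_out))
    print(demo_out.getvalue(), end="")
```

For a real file, open both sides in text mode with `encoding="utf-8"` and `newline=""` so line endings are preserved, and write to a new file rather than overwriting the original. One caveat: if the mzML is indexed, inserting characters in the header shifts the byte offsets recorded in its index, and this thread does not say whether FlashLFQ relies on that index.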

I will still need to dig into this to understand and fix it and then get the update into a new release.

trishorts commented 9 months ago

As an internal note: we should add MS:1002828 and MS:1002817 and their companion strings to SourceFile GetSourceFile().

[image]

trishorts commented 9 months ago

This is the error we get during file read: [image]

trishorts commented 9 months ago

I created a new issue on the tdf2mzml GitHub: https://github.com/mafreitas/tdf2mzml/issues/23

Cajac102 commented 9 months ago

Thanks for the extensive answer! Adding a simple "/" at the start of the sourceFile location did indeed fix the issue.