smith-chem-wisc / FlashLFQ

Ultra-fast label-free quantification algorithm for mass-spectrometry proteomics
GNU Lesser General Public License v3.0

Error while parsing mzMLs converted from tdf #141

Closed Cajac102 closed 5 months ago

Cajac102 commented 6 months ago

Hey,

When running FlashLFQ with mzMLs that were converted from tdf (with tdf2mzml), I run into errors while reading in the files:

Reading spectra file
Problem opening .mzML file data/dda/mzmls/TP17632AUH_Slot2-29_1_18113.mzML; Invalid URI: The format of the URI could not be determined.

The mzML can be parsed by other tools/parsers, and I don't get that error with other mzMLs converted from raw/mgf. Could you help me out here? I can also send you the file if necessary. I'm using FlashLFQ via docker (v1.2.4, but 1.2.5 and 1.2.6 give similar error messages) on Debian 12.

Cheers, Caro

trishorts commented 6 months ago

Can you share the file? I will see what the problem is.

Cajac102 commented 6 months ago

Here's the download link (3.2GB): https://drive.google.com/file/d/1ahK0zWZh7X8ABVzxRWmATaPgOJa0I6ho/view?usp=share_link

trishorts commented 5 months ago

OK, I found the problem and was able to make a workaround, but I don't yet understand the root cause. The problem is the source file location string, which appears in two places near the top of the mzML. [image]

If I replace the string with a file path (it doesn't have to be the actual file path), the file will load. [image]

This would work as a temporary fix, but you need a text editor that can open a huge file, such as Notepad++.
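If a large-file editor is not at hand, the same edit could be scripted. The following is only a minimal sketch, not part of FlashLFQ or tdf2mzml: it streams the file line by line and prefixes the `location` attribute of each `<sourceFile>` element with "/". It assumes each `<sourceFile ...>` opening tag sits on a single line, that the attribute uses double quotes, and that a leading "/" is an acceptable placeholder path. The function names `fix_location` and `patch_mzml` are made up for this example.

```python
import re

# Matches the <sourceFile ...> element itself, not <sourceFileList> or <sourceFileRef>.
SOURCE_FILE_TAG = re.compile(r"<sourceFile\s")
LOCATION_ATTR = re.compile(r'location="([^"]*)"')


def fix_location(line, prefix="/"):
    """Prefix the location attribute on a <sourceFile> line.

    Values that already start with "/" or contain a URI scheme ("://")
    are left unchanged. Lines without a <sourceFile> tag are returned as-is.
    """
    if not SOURCE_FILE_TAG.search(line):
        return line

    def _repl(match):
        value = match.group(1)
        if value.startswith("/") or "://" in value:
            return match.group(0)
        return f'location="{prefix}{value}"'

    return LOCATION_ATTR.sub(_repl, line)


def patch_mzml(src, dst):
    """Copy src to dst line by line, patching <sourceFile> locations.

    Streaming keeps memory use low for multi-gigabyte files.
    newline="" preserves the original line endings.
    """
    with open(src, "r", encoding="utf-8", newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        for line in fin:
            fout.write(fix_location(line))
```

For example, `patch_mzml("input.mzML", "input_patched.mzML")` (hypothetical file names) writes a patched copy and leaves the original untouched. One caveat: if the mzML is indexed, changing the header length shifts the byte offsets stored in the index, so tools that rely on the index or checksum may need the file re-indexed.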

I will still need to dig into this to understand and fix it and then get the update into a new release.

trishorts commented 5 months ago

As an internal note: we should add MS:1002828 and MS:1002817 and their companion strings to SourceFile GetSourceFile().

[image]

trishorts commented 5 months ago

This is the error we get during file read. [image]

trishorts commented 5 months ago

I created a new issue on the tdf2mzml GitHub: https://github.com/mafreitas/tdf2mzml/issues/23

Cajac102 commented 5 months ago

Thanks for the extensive answer! Adding a simple "/" at the start of the sourceFile location did indeed fix the issue.