levitsky / pyteomics

Pyteomics is a collection of lightweight and handy tools for Python that help to handle various sorts of proteomics data. Pyteomics provides a growing set of modules to facilitate the most common tasks in proteomics data analysis.
http://pyteomics.readthedocs.io
Apache License 2.0

Pyteomics fails to parse mzId files with <Fragmentation> information #33

Closed jgriss closed 3 years ago

jgriss commented 3 years ago

Hi,

First of all, thanks a lot for this great library!

I recently found that pyteomics can't read mzId files created with PeptideShaker. Iteration over the SpectrumIdentificationResult objects never starts, and the process shows no significant CPU usage while it hangs.

After I manually removed the <Fragmentation>[...]</Fragmentation> tags from the file, everything worked fine.

Kind regards, Johannes
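
For readers who hit the same problem, the manual workaround described above (removing the <Fragmentation> elements before parsing) could be scripted roughly as follows. This is only a sketch using the Python standard library, not part of pyteomics and not the fix from the linked PR. The sample document, the function names and the mzIdentML 1.1 namespace are assumptions made for illustration, and the whole file is loaded into memory, which may be impractical for very large files.

```python
import xml.etree.ElementTree as ET

# Assumed namespace (mzIdentML 1.1); adjust if the file declares another one.
MZID_NS = "http://psidev.info/psi/pi/mzIdentML/1.1"


def local_name(tag):
    """Return an element tag without its '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def strip_fragmentation(root):
    """Remove all <Fragmentation> elements in place; return the number removed."""
    removed = 0
    # Snapshot the elements first so the tree is not modified while walking it.
    for parent in list(root.iter()):
        for child in list(parent):
            if local_name(child.tag) == "Fragmentation":
                parent.remove(child)
                removed += 1
    return removed


def strip_file(src_path, dst_path):
    """Hypothetical helper: write a copy of src_path without <Fragmentation>."""
    ET.register_namespace("", MZID_NS)  # keep the default namespace on output
    tree = ET.parse(src_path)
    strip_fragmentation(tree.getroot())
    tree.write(dst_path, encoding="utf-8", xml_declaration=True)


# Tiny made-up document that only mimics the relevant structure.
SAMPLE = """<MzIdentML xmlns="http://psidev.info/psi/pi/mzIdentML/1.1">
  <SpectrumIdentificationItem id="SII_1">
    <Fragmentation><IonType charge="1" index="1 2"/></Fragmentation>
    <cvParam name="example score" value="1.0"/>
  </SpectrumIdentificationItem>
</MzIdentML>"""

root = ET.fromstring(SAMPLE)
removed = strip_fragmentation(root)
remaining = [e for e in root.iter() if local_name(e.tag) == "Fragmentation"]
```

Note that this discards the fragment ion annotations, so it is only useful when that information is not needed downstream.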

mobiusklein commented 3 years ago

Can you share the original mzIdentML file please? Or a reduced version if it is too large?

jgriss commented 3 years ago

Hi @mobiusklein

You can download the file here (it's too big for GitHub):

https://cloudius.meduniwien.ac.at/index.php/s/zt7lZBf4VMTMbwT

Creating the MzIdentML object works. But as soon as I try to iterate over it, it hangs and the first item is never returned.

Thanks a lot for the help!

Kind regards, Johannes

mobiusklein commented 3 years ago

@jgriss could you please try the branch in PR #34? It should fix your issue.