pymzml / pymzML

pymzML - an interface between Python and mzML Mass spectrometry Files
https://pymzml.readthedocs.io/en/latest/
MIT License

Non-Centroid Data Errors #332

Closed jpruyne-mw closed 9 months ago

jpruyne-mw commented 1 year ago

I'm trying to work with Orbitrap data collected in profile mode, but most functions I run raise errors suggesting the data is not formatted the way they expect.

In particular, calling has_peak on the Spectrum object fails at line 723 of spec.py: the statement for mz, i in self.peaks("centroided"): throws ValueError: too many values to unpack (expected 2).

Similarly, running extreme_values('i') (or 'mz') fails at line 1490, all_i_values = self.peaks("raw")[:, 1], with IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed.

Am I missing some function to artificially centroid the data?

Here is the example data: https://drive.google.com/file/d/1JW0lDVbLg6N94FO3_bWZCJfhHw0fCQrv/view?usp=sharing

And current code I'm trying to run:

import pymzml as pz

# Request 5 ppm precision for MS1 spectra
run = pz.run.Reader(
    "003_LabTest1-plate1_A1.mzML",
    MS_precisions={1: 5e-6},
)
for n, spec in enumerate(run):
    print(f"{n} {spec.has_peak(147.07642)}")

fu commented 1 year ago

We are looking into it.

jpruyne-mw commented 1 year ago

Not sure if this is helpful, but when I pull the mz or intensity data, it is still in its encoded byte form instead of being decoded into numbers.
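
For comparison, here is a minimal sketch of what a decoded array should look like. It assumes the simplest case only: the text of an mzML binary element is base64 wrapping zlib-compressed little-endian floats. The helper name decode_binary_array and the sample values are made up for illustration and are not part of pymzML; other encodings need their own decoders.

```python
import base64
import struct
import zlib


def decode_binary_array(text, bits=64, zlib_compressed=True):
    """Decode base64 text from an mzML <binary> element into a list of floats.

    Hypothetical helper: handles only plain or zlib-compressed
    little-endian 32-bit or 64-bit floats.
    """
    if bits not in (32, 64):
        raise ValueError("bits must be 32 or 64")
    raw = base64.b64decode(text)
    if zlib_compressed:
        raw = zlib.decompress(raw)
    fmt = "d" if bits == 64 else "f"
    count = len(raw) // struct.calcsize("<" + fmt)
    return list(struct.unpack(f"<{count}{fmt}", raw))


# Round trip on made-up m/z values: pack, compress, base64-encode, then decode.
mz_values = [147.07642, 200.5, 301.25]
packed = struct.pack("<3d", *mz_values)
encoded = base64.b64encode(zlib.compress(packed)).decode("ascii")
decoded = decode_binary_array(encoded, bits=64, zlib_compressed=True)
```

If the values coming out of the reader are still bytes instead of floats like decoded above, the decoding step did not run for that array.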

jpruyne-mw commented 1 year ago

Did some additional digging: it looks like we were encoding the files with "MS-Numpress linear prediction compression followed by zlib compression", and also with only 32-bit float precision. Switching to 64-bit float with zlib-only compression allowed me to parse the files as normal.
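
In case it helps anyone hitting the same errors, here is a rough sketch of how to check which encoding a file declares. It uses only the standard library and reads the name attribute of the cvParam entries inside each binaryDataArray. The sample fragment is hand-written and heavily trimmed (accessions and other required attributes are left out), and the helper names array_encodings and uses_numpress are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written, trimmed fragment for illustration; a real mzML file has far more structure.
SAMPLE = """<mzML xmlns="http://psi.hupo.org/ms/mzml">
  <binaryDataArray encodedLength="0">
    <cvParam cvRef="MS" name="32-bit float"/>
    <cvParam cvRef="MS" name="MS-Numpress linear prediction compression followed by zlib compression"/>
    <cvParam cvRef="MS" name="m/z array"/>
    <binary></binary>
  </binaryDataArray>
</mzML>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def array_encodings(xml_text):
    """Return one list of cvParam names per binaryDataArray element."""
    root = ET.fromstring(xml_text)
    result = []
    for elem in root.iter():
        if local_name(elem.tag) == "binaryDataArray":
            names = [
                child.get("name") or ""
                for child in elem
                if local_name(child.tag) == "cvParam"
            ]
            result.append(names)
    return result


def uses_numpress(names):
    """True if any declared term mentions Numpress."""
    return any("numpress" in name.lower() for name in names)


encodings = array_encodings(SAMPLE)
for names in encodings:
    print(names, "-> Numpress" if uses_numpress(names) else "-> no Numpress")
```

For a real file, which can be large, ET.iterparse on the file path would avoid loading the whole document into memory.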