mobiotwin closed this issue 11 months ago
This issue indicates that the mzML file may not be well-formed. The error message means that one of the binary arrays is encoded with one binary data type but labeled as another.
To diagnose this issue, we'd need to be able to access the mzML file. If that's not easily done, you could try initializing the reader with the following:
import numpy as np
from pyteomics import mzml

# Override the data type used to decode each named binary array
array_types = {
    "m/z array": np.float64,
    "intensity array": np.float64,  # or np.float32
    "raw ion mobility array": np.float32,
}

with mzml.MzML(
    "../../mzML/20230714_6Mix_0_NA_MS2_RP_HDMSe_POS_N02.mzML", dtype=array_types
) as reader:
    for spectrum in reader:
        print(spectrum)
From looking at the spectra you've printed, the intensity array looks like the one most likely to be labeled incorrectly, given how it swings wildly from very small to very large (negative) numbers.
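To illustrate the kind of mismatch described above, here is a minimal sketch using only NumPy and the standard library. It is not pyteomics' decoding code; `encode_array` and `decode_array` are hypothetical helpers, and the intensity values are made up. It encodes an array as 32-bit floats and then decodes the same bytes as 64-bit floats, which yields half as many values with nonsensical magnitudes.

```python
import base64
import zlib

import numpy as np


def encode_array(values, dtype):
    """Pack values as raw bytes of `dtype`, zlib-compress, then base64-encode."""
    raw = np.asarray(values, dtype=dtype).tobytes()
    return base64.b64encode(zlib.compress(raw))


def decode_array(payload, dtype):
    """Reverse encode_array, interpreting the bytes as `dtype`."""
    raw = zlib.decompress(base64.b64decode(payload))
    return np.frombuffer(raw, dtype=dtype)


# Made-up intensities, written as little-endian 32-bit floats
intensities = [10.0, 250.5, 3.0, 1200.0]
payload = encode_array(intensities, "<f4")

right = decode_array(payload, "<f4")  # matches the declared encoding
wrong = decode_array(payload, "<f8")  # mislabeled as 64-bit float

print("decoded as float32:", right)
print("decoded as float64:", wrong)
```

Decoding with the wrong width does not raise an error on its own here because the byte count happens to be a multiple of 8; the symptom is a shorter array with implausible values.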
How was this mzML file created?
Hi @mobiusklein
Thank you for your reply
Here is the link to download the file: 20230714_6Mix_0_NA_MS2_RP_HDMSe_POS_N02.mzML
The file was created with MSConvert (Docker image) using the following parameters:
lock_mass = LOCK_MASS_POSITIVE
arguments = ""
arguments += f"wine msconvert {file} "
arguments += "--outdir /out_data "
arguments += "--mzML " # write mzML format [default]
arguments += "--32 " # set default binary encoding to 32-bit precision
arguments += "--combineIonMobilitySpectra "
arguments += f"""--filter "lockmassRefiner mz={lock_mass} tol=0.5" """
arguments += """--filter "msLevel 1" """
FYI, I was able to read it with pyopenms
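As an aside, the MSConvert invocation quoted above could also be assembled as an argument list rather than by string concatenation, which avoids hand-written quoting around the filter expressions. This is only a sketch: `build_msconvert_command` is a hypothetical helper, the input path is a placeholder, and the lock mass value is a placeholder because the value of `LOCK_MASS_POSITIVE` is not given in the thread. The flags themselves are the ones from the post.

```python
import shlex


def build_msconvert_command(file, lock_mass, outdir="/out_data"):
    """Return the msconvert call from the post as a list of arguments."""
    return [
        "wine", "msconvert", file,
        "--outdir", outdir,
        "--mzML",                       # write mzML format
        "--32",                         # default binary encoding: 32-bit precision
        "--combineIonMobilitySpectra",
        "--filter", f"lockmassRefiner mz={lock_mass} tol=0.5",
        "--filter", "msLevel 1",
    ]


# Placeholder path and lock mass, for illustration only
cmd = build_msconvert_command("/data/example.raw", 556.2771)
print(shlex.join(cmd))  # shell-quoted form, e.g. for logging
```

A list like this can be passed directly to `subprocess.run` without going through a shell.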
Odd, I was able to read the file successfully beyond the third spectrum. I noticed that the intensity array I'm seeing is not oscillating between huge negative and positive numbers, and that it appears to be double-compressed:
<cvParam cvRef="MS" accession="MS:1000521" name="32-bit float" value=""/>
<cvParam cvRef="MS" accession="MS:1002748" name="MS-Numpress short logged float compression followed by zlib compression" value=""/>
<cvParam cvRef="MS" accession="MS:1000515" name="intensity array" value="" unitCvRef="MS" unitAccession="MS:1000131" unitName="number of detector counts"/>
which would also cause the problem. Could you please try upgrading to the latest version of pynumpress and see if that fixes the problem for you?
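A quick way to check for this situation is to look for Numpress-related `cvParam` entries in the file and confirm that `pynumpress` can be imported. The sketch below uses only the standard library; `numpress_encodings` and `pynumpress_available` are hypothetical helper names, and the sample lines are the ones quoted above. It matches on the `name` attribute rather than on a list of accessions, which is a simplification.

```python
import importlib.util
import re

# Capture the name="..." attribute of a cvParam element
CV_NAME = re.compile(r'<cvParam\b[^>]*\bname="([^"]*)"')


def numpress_encodings(lines):
    """Collect cvParam names that mention Numpress."""
    found = set()
    for line in lines:
        for name in CV_NAME.findall(line):
            if "numpress" in name.lower():
                found.add(name)
    return found


def pynumpress_available():
    """True if the pynumpress package can be located for import."""
    return importlib.util.find_spec("pynumpress") is not None


sample = [
    '<cvParam cvRef="MS" accession="MS:1000521" name="32-bit float" value=""/>',
    '<cvParam cvRef="MS" accession="MS:1002748" name="MS-Numpress short logged '
    'float compression followed by zlib compression" value=""/>',
]

used = numpress_encodings(sample)
if used and not pynumpress_available():
    print("File uses Numpress, but pynumpress is not importable:", sorted(used))
```

For a real file, the same function can be given an open file handle, for example `numpress_encodings(open(path, encoding="utf-8"))`.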
Hi @mobiusklein,
Yes, it seems to be working. It turns out that I had only installed
pip install pyteomics[xml]
which worked for data without ion mobility.
Thanks for your help
Hi,
We are planning to switch from pyopenms.
However, I get an error when loading an mzML file with ion mobility data.
My code:
It iterates over the first two indices and then crashes.
The output for the first two indices:
Then I got the following error:
pyteomics version