pymzml / pymzML

pymzML - an interface between Python and mzML Mass spectrometry Files
https://pymzml.readthedocs.io/en/latest/
MIT License

error when trying to get spectrum #303

Closed hanghu1024 closed 1 year ago

hanghu1024 commented 1 year ago

Describe the bug

I am working with an mzML file converted from a Sciex .wiff file. I was able to load the file with pymzml, get a spectrum object back, and read spec.TIC, ID, scan_time, etc. However, I got an error when I tried to get the spectrum data using spec.peaks("centroided"), spec.mz(), or spec.i(). I was able to get the spectrum data from the same file using pyopenms. I have uploaded the data and a Jupyter notebook; could you please have a look? Many thanks!

To Reproduce

The data file and Jupyter notebook are attached.

Expected behavior

### get spectrum peaks
spec.peaks("centroided")
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
~\Anaconda3\lib\site-packages\pymzml\spec.py in _get_encoding_parameters(self, array_type)
    200                             ns=self.ns,
--> 201                             Acc=self.calling_instance.OT["32-bit float"]["id"],
    202                         )

AttributeError: 'NoneType' object has no attribute 'OT'

During handling of the above exception, another exception occurred:

AttributeError                            Traceback (most recent call last)
<ipython-input-9-5d24ed73fd60> in <module>
      1 ### get spectrum peaks
----> 2 spec.peaks("centroided")

~\Anaconda3\lib\site-packages\pymzml\spec.py in peaks(self, peak_type)
   1046         if self._peak_dict[peak_type] is None:
   1047             if self._peak_dict["raw"] is None:
-> 1048                 mz_params = self._get_encoding_parameters("m/z array")
   1049                 i_params = self._get_encoding_parameters("intensity array")
   1050                 mz = self._decode(*mz_params)

~\Anaconda3\lib\site-packages\pymzml\spec.py in _get_encoding_parameters(self, array_type)
    207                         float_type_string.format(
    208                             ns=self.ns,
--> 209                             Acc=self.calling_instance.OT["64-bit float"]["id"],
    210                         )
    211                     ).get("name")

AttributeError: 'NoneType' object has no attribute 'OT'


Files mzml&ipynb.zip

StSchulze commented 1 year ago

Hi @hanghu1024 ,

Thanks for adding the Jupyter notebook and example mzml, that definitely helps a lot for solving the issue.

You are accessing the spectrum using

spec = run.info['file_object'][identifier]

Unfortunately, that does not access the spectrum class correctly, so the subsequent commands that access the spectrum attributes fail.

Instead, try accessing the spectrum like this, for example:

spec = run[identifier]

The following then works perfectly fine:

spec.peaks("centroided")
spec.peaks("raw")
spec.mz
spec.i

(note: spec.mz and spec.i should not have () )
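To put the pieces above together, here is a minimal sketch of the access pattern, assuming pymzML 2.x is installed. The helper names open_run and peak_summary and the file path in the usage comment are hypothetical and only for illustration; the pymzML import is done lazily so the second helper can be tried with any object that behaves like a Reader.

```python
def open_run(path):
    """Open an mzML file with pymzML (hypothetical helper).

    The import is local so peak_summary() below can be exercised
    without pymzML installed.
    """
    from pymzml.run import Reader
    return Reader(path)


def peak_summary(run, identifier):
    """Fetch one spectrum through the Reader and summarise its peaks."""
    # Index the Reader itself, not run.info['file_object'].
    spec = run[identifier]
    peaks = spec.peaks("centroided")  # sequence of (m/z, intensity) pairs
    return {
        "id": spec.ID,
        "n_peaks": len(peaks),
        "mz": spec.mz,  # property: no parentheses
        "i": spec.i,    # property: no parentheses
    }


# Usage (hypothetical path and identifier):
# run = open_run("example.mzML")
# print(peak_summary(run, 1))
```

Keeping the lookup in one helper means every spectrum goes through run[identifier], so the mistake from the original notebook cannot creep back in elsewhere.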

You can also access the specs directly while iterating through your run:

for n, spec in enumerate(run):
    print(n)
    print(
        "Spectrum {0}, MS level {ms_level} @ RT {scan_time:1.2f}".format(
            spec.ID, ms_level=spec.ms_level, scan_time=spec.scan_time_in_minutes()
        )
    )
    print(spec.peaks('centroided'))

Also note that with mzML files that come from sciex .wiff files, the indexing doesn't necessarily work properly. So make sure to use spec.id_dict, spec.index, or spec.ID depending on what information you're looking for.
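One way to see how these identifiers differ in a given file is to list them side by side for the first few spectra. This is only a sketch: collect_identifiers is a hypothetical helper, and it assumes each spectrum exposes the spec.ID, spec.index, and spec.id_dict attributes mentioned above.

```python
def collect_identifiers(run, limit=5):
    """Return the identifier fields of the first `limit` spectra in a run."""
    rows = []
    for position, spec in enumerate(run):
        if position >= limit:
            break
        rows.append(
            {
                "position": position,      # order of iteration
                "ID": spec.ID,             # ID as reported by pymzML
                "index": spec.index,       # index attribute of the spectrum
                "id_dict": spec.id_dict,   # parsed parts of the native ID
            }
        )
    return rows


# Usage (hypothetical path):
# from pymzml.run import Reader
# for row in collect_identifiers(Reader("example.mzML")):
#     print(row)
```

Comparing the columns shows which field matches the numbering you expect before you use it to look spectra up.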

I hope that helps, otherwise let us know and we're happy to try to solve it further.

hanghu1024 commented 1 year ago

Thank you so much @StSchulze for the response and corrections! I am closing this issue.