levitsky / pyteomics

Pyteomics is a collection of lightweight and handy tools for Python that help to handle various sorts of proteomics data. Pyteomics provides a growing set of modules to facilitate the most common tasks in proteomics data analysis.
http://pyteomics.readthedocs.io
Apache License 2.0

KeyError: 'mzTab-version' when loading PXD000001_mztab.txt #26

Closed by PDiracDelta 3 years ago

PDiracDelta commented 3 years ago

I suspect this is an old mzTab file that lacks the mzTab-version metadata field. Any chance this can be made backwards-compatible?

import ppx
import pandas as pd
from pyteomics import mztab

dat = ppx.PXDataset('PXD000001')
dat.download('PXD000001_mztab.txt', dest_dir='.')
mzt = mztab.MzTab('PXD000001_mztab.txt')

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/REDACTED/.conda/envs/qcquan/lib/python3.6/site-packages/pyteomics/mztab.py", line 192, in __init__
    self._determine_schema_version()
  File "/REDACTED/.conda/envs/qcquan/lib/python3.6/site-packages/pyteomics/mztab.py", line 381, in _determine_schema_version
    version_parsed, variant = re.search(r"(?P<schema_version>\d+\.\d+\.\d+)(?:-(?P<schema_variant>[MP]))?", self.version).groups()
  File "/REDACTED/.conda/envs/qcquan/lib/python3.6/site-packages/pyteomics/mztab.py", line 201, in version
    return self.metadata['mzTab-version']
KeyError: 'mzTab-version'
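To confirm that the file really lacks the version header, a quick check that does not depend on pyteomics can help. This is only a sketch: it assumes the usual mzTab layout where metadata lines are tab-separated and start with MTD, and find_mztab_version is a hypothetical helper name, not part of any library.

```python
def find_mztab_version(lines):
    """Return the declared mzTab-version string, or None if no such MTD line exists.

    `lines` is any iterable of text lines, e.g. an open file handle.
    Assumes metadata rows look like: MTD<TAB>mzTab-version<TAB>1.0.0
    """
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) >= 3 and fields[0] == "MTD" and fields[1] == "mzTab-version":
            return fields[2]
    return None


if __name__ == "__main__":
    # Made-up sample rows, not taken from PXD000001_mztab.txt.
    with_version = ["MTD\tmzTab-version\t1.0.0\n", "MTD\ttitle\texample\n"]
    without_version = ["MTD\ttitle\texample\n"]
    print(find_mztab_version(with_version))
    print(find_mztab_version(without_version))
```

On a real file this could be called as find_mztab_version(open('PXD000001_mztab.txt')); a None result would be consistent with the KeyError above.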

mobiusklein commented 3 years ago

I think #27 will address this. Could you give it a try?

PDiracDelta commented 3 years ago

Thanks for the quick response. Not entirely solved yet, I'm afraid. The same code now returns:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/REDACTED/.conda/envs/qcquan/lib/python3.6/site-packages/pyteomics/mztab.py", line 641, in __init__
    self._determine_schema_version()
  File "/REDACTED/.conda/envs/qcquan/lib/python3.6/site-packages/pyteomics/mztab.py", line 747, in _determine_schema_version
    version_parsed, variant = re.search(r"(?P<schema_version>\d+(?:\.\d+(?:\.\d+)?)?)(?:-(?P<schema_variant>[MP]))?", str(self.version)).groups()
  File "/REDACTED/.conda/envs/qcquan/lib/python3.6/site-packages/pyteomics/mztab.py", line 84, in __get__
    if value is None and self.variant_required and obj.variant in self.variant_required:
AttributeError: 'MzTab' object has no attribute 'variant'

mobiusklein commented 3 years ago

Ah right. variant must be set before it may be read by the validation logic. I've updated that PR with new code to ensure that, and added a test case that protects against this.
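For anyone reading along, the ordering problem can be reproduced in isolation. The sketch below is not the pyteomics code: MetadataField and Document are hypothetical stand-ins, the regular expression is copied from the traceback above, and falling back to "1.0.0" when no version is declared is purely an assumption for illustration.

```python
import re

# Version pattern as it appears in the traceback above.
VERSION_RE = re.compile(
    r"(?P<schema_version>\d+(?:\.\d+(?:\.\d+)?)?)(?:-(?P<schema_variant>[MP]))?"
)


class MetadataField:
    """Hypothetical descriptor that checks a metadata value against the file variant."""

    def __init__(self, name, variant_required=None):
        self.name = name
        self.variant_required = variant_required

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        value = obj.metadata.get(self.name)
        # Reads obj.variant, so that attribute must exist before the first access.
        if value is None and self.variant_required and obj.variant in self.variant_required:
            raise ValueError(f"{self.name!r} is required for variant {obj.variant!r}")
        return value


class Document:
    """Hypothetical stand-in for a parsed mzTab file."""

    version = MetadataField("mzTab-version", variant_required=("M",))

    def __init__(self, metadata):
        self.metadata = metadata
        # Placeholder set first so the descriptor can read it safely.
        self.variant = None
        self.schema_version, self.variant = self._determine_schema_version()

    def _determine_schema_version(self):
        declared = self.version
        # Assumption for this sketch: treat a missing version as "1.0.0".
        text = "1.0.0" if declared is None else str(declared)
        match = VERSION_RE.search(text)
        if match is None:
            raise ValueError(f"Unrecognized mzTab version: {text!r}")
        return match.group("schema_version"), match.group("schema_variant")


if __name__ == "__main__":
    old_file = Document({})  # no mzTab-version entry at all
    print(old_file.schema_version, old_file.variant)
    new_file = Document({"mzTab-version": "2.0.0-M"})
    print(new_file.schema_version, new_file.variant)
```

Removing the `self.variant = None` line in this sketch brings back an AttributeError of the same kind as in the traceback, because the descriptor reads `obj.variant` before anything has assigned it.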

PDiracDelta commented 3 years ago

Thanks, this seems to work! (FYI: the pull request hasn't been accepted yet, so I copied the code directly from the commit.)