saketkc / pysradb

Package for fetching metadata and downloading data from SRA/ENA/GEO
https://saketkc.github.io/pysradb
BSD 3-Clause "New" or "Revised" License
307 stars 50 forks source link

fix xmltodict parse #159

Closed zhoulytwinyu closed 2 years ago

zhoulytwinyu commented 2 years ago

Description: xmltodict 0.13.0 dropped OrderedDict as the default dict constructor for python>=3.7: https://github.com/martinblech/xmltodict/compare/v0.12.0...v0.13.0. python3.7 dicts are by specification, ordered. This breaks line 453 of sraweb.py, which tests isinstance of OrderedDict. To fix this, we can pass an explicit argument dict_constructor=OrderedDict to xmltodict.parse.

Error reproduction (without the fix):

python -m venv venv
pip install numpy Cython pysradb
pysradb gsm-to-srr GSM2177186
# Error Message:
# UnboundLocalError: local variable 'exp_platform_model' referenced before assignment

Env: python 3.7.6 (or python 3.8.10)

Other discussions: The way xmltodict breaks backward compatibility and the way we specify 'xmltodict>=0.12.0' both contributed to this unexpected error. We may want to fix the minor version of xmltodict and only allow auto-upgrade of patch version: e.g. 'xmltodict~=0.13.0'. Of course, the other way to fix this is to drop OrderedDict in pysradb for ordinary dict. More lines would be involved. I would opt to do that in a refactor PR.

codecov[bot] commented 2 years ago

Codecov Report

Merging #159 (e43cf40) into master (5e556d2) will increase coverage by 0.02%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #159      +/-   ##
==========================================
+ Coverage   58.00%   58.02%   +0.02%     
==========================================
  Files           8        8              
  Lines        1800     1801       +1     
==========================================
+ Hits         1044     1045       +1     
  Misses        756      756              
Impacted Files Coverage Δ
pysradb/sraweb.py 80.96% <100.00%> (+0.03%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 5e556d2...e43cf40. Read the comment docs.

saketkc commented 2 years ago

Thanks a lot @zhoulytwinyu - I totally missed this! Will make a release soon.

zhoulytwinyu commented 2 years ago

Thanks for the quick response. This is a very (more) useful and reliable library to complement entrez and sratoolkit. Creating a PR is the least I can do to support it.