Closed sheppard closed 2 years ago
Hi @sheppard thanks for adding this PR! I have a few requests to consider please. If you don't expect to have the bandwidth to address these please let us know, and they could be done in a follow-up PR.
xlrd==2.0.1
and openpyxl==3.0.9
.defusedxml==0.7.1
so it is importable by openpyxl.os.environ["OPENPYXL_DEFUSEDXML"] = "True"
in xls2json_backends.py, as per: openpyxl src and docs.Merging #575 (05d5ac7) into master (66c0d5b) will decrease coverage by
0.61%
. The diff coverage is83.87%
.
@@ Coverage Diff @@
## master #575 +/- ##
==========================================
- Coverage 84.20% 83.59% -0.61%
==========================================
Files 25 25
Lines 3672 3761 +89
Branches 865 930 +65
==========================================
+ Hits 3092 3144 +52
- Misses 440 474 +34
- Partials 140 143 +3
Impacted Files | Coverage Δ | |
---|---|---|
pyxform/utils.py | 78.08% <72.41%> (-13.25%) |
:arrow_down: |
pyxform/xls2json.py | 79.46% <75.00%> (+0.06%) |
:arrow_up: |
pyxform/xls2json_backends.py | 77.43% <89.47%> (+1.10%) |
:arrow_up: |
pyxform/constants.py | 100.00% <100.00%> (ø) |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 66c0d5b...05d5ac7. Read the comment docs.
I updated the setup.py dependencies, added equivalency tests, and moved the imports to the module level in utils.py.
It looks like openpyxl's defusedxml integration is active by default as long as:
OPENPYXL_DEFUSEDXML
is either True
or not set. (In spite of the name, defusedxml_env_set()
returns True
if the key does not exist in the dictionary.)I confirmed this by adding a test to assertTrue(openpyxl.DEFUSEDXML)
in test_xls2json_backends.py.
Setting os.environ
would only be necessary to override the preferences of a user that had explicitly defined OPENPYXL_DEFUSEDXML=False
in their environment. In that case, the user would still be able to disable defusedxml by importing openpyxl before pyxform had a chance to update os.environ
.
Closes #501
Why is this the best possible solution? Were any other approaches considered?
See discussion in #501.
What are the regression risks?
There should be no noticeable changes for most users of pyxform. I started from a line-for-line translation of the xlrd integration and optimized some of the logic. One potential change is that dictionary keys will not contain consecutive inner spaces, since they are now populated from the cleaned headers rather than a separate step.
Does this change require updates to documentation? If so, please file an issue here and include the link below.
There is a new dependency, but this should be transparent to most users.
Before submitting this PR, please make sure you have:
tests
nosetests
and verified all tests passblack pyxform tests
to format code