jazzband / tablib

Python Module for Tabular Datasets in XLS, CSV, JSON, YAML, &c.
https://tablib.readthedocs.io/
MIT License
4.59k stars, 590 forks

Optimize xlsx detection #448

Closed claudep closed 4 years ago

claudep commented 4 years ago

Reading the whole file just to detect whether it looks like an xlsx file is excessive.
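
As a rough illustration of the idea (not the change made in this PR): an xlsx file is a ZIP archive, so a detector can inspect only the first few bytes instead of reading the entire stream. The helper name `looks_like_xlsx` and the magic-number approach are assumptions made for this sketch, and the check is only a heuristic, since any non-empty ZIP archive passes it.

```python
import io
import zipfile

# Signature of a ZIP local file header; xlsx workbooks are ZIP archives.
ZIP_MAGIC = b"PK\x03\x04"


def looks_like_xlsx(stream):
    """Hypothetical helper: cheap check that reads only the first 4 bytes.

    Restores the stream position afterwards so a later full load is unaffected.
    """
    pos = stream.tell()
    try:
        header = stream.read(len(ZIP_MAGIC))
    finally:
        stream.seek(pos)
    return header == ZIP_MAGIC


# Usage: build a small ZIP archive in memory as a stand-in for a workbook.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("xl/workbook.xml", "<workbook/>")
buf.seek(0)

is_zip_like = looks_like_xlsx(buf)                        # ZIP signature found
is_csv_like = looks_like_xlsx(io.BytesIO(b"a,b\n1,2\n"))  # plain CSV bytes
```

A stricter detector would still need to open the archive and confirm it contains workbook parts; the point here is only that the first rejection step does not require the whole file.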

hugovk commented 4 years ago

Sounds reasonable.

I don't suppose this is testable directly? It should successfully load a file before the change, and it should load a file afterwards too... Would be good at least to check that the changed lines are covered by tests (and https://github.com/jazzband/tablib/pull/449 should fix coverage reports in PRs).

claudep commented 4 years ago

I think all detection code is tested in test_auto_format_detect (and there's even a specific test_xlsx_format_detect). I'll rebase and force push to see if there's now a coverage diff report.

codecov[bot] commented 4 years ago

Codecov Report

Merging #448 into master will increase coverage by <.01%. The diff coverage is 100%.

@@            Coverage Diff            @@
##           master    #448      +/-   ##
=========================================
+ Coverage   90.39%   90.4%   +<.01%     
=========================================
  Files          28      28              
  Lines        2583    2585       +2     
=========================================
+ Hits         2335    2337       +2     
  Misses        248     248

Impacted Files                 Coverage Δ
src/tablib/formats/_xlsx.py    96.73% <100%> (+0.07%)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 8d02934...64a43f6.