jsvine / pdfplumber

Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
MIT License
6.02k stars 619 forks source link

Accept Iterable for geometry utils (fixes #945) #946

Closed dhdaines closed 11 months ago

dhdaines commented 11 months ago

This also makes bbox manipulation slightly faster as it no longer has to iterate 4 times over the same list.

Note: this removes the fragile-looking type manipulation in intersects_bbox as making it work with iterables is very ugly and I am not convinced of its usefulness (none of the other similar functions have it... I will let more experienced users/maintainers review and see if it is necessary!)

jsvine commented 11 months ago

Thanks @dhdaines! This broadly sounds like a good change. Looks like the linter is angry, though. Could you run this locally, and then adjust until make format lint tests fully passes?:

pip install -r requirements-dev.txt
make format lint tests

Note: this removes the fragile-looking type manipulation in intersects_bbox as making it work with iterables is very ugly and I am not convinced of its usefulness (none of the other similar functions have it... I will let more experienced users/maintainers review and see if it is necessary!)

Ah, thanks for flagging! I think this might be a relic of some old code, when I was trying to let all the utils accept pandas DataFrames in addition to lists of dicts.

dhdaines commented 11 months ago

Hmm, I could have sworn that make lint and make tests passed, it could be that I have the wrong versions of isort and black...

There's a problem, at least for me, which is that mypy does not like the recent version of numpy that gets installed under Python 3.10:

venv/bin/python -m mypy --strict --implicit-reexport pdfplumber
venv/lib/python3.10/site-packages/numpy/__init__.pyi:638: error: Positional-only parameters are only supported in Python 3.8 and greater

This turns out to actually be a fixed bug in mypy (https://github.com/python/mypy/issues/13627), so I've updated requirements-dev.txt to have the version of mypy with the fix.

codecov[bot] commented 11 months ago

Codecov Report

Merging #946 (9014a29) into develop (28c0afc) will not change coverage. Report is 5 commits behind head on develop. The diff coverage is 100.00%.

@@            Coverage Diff            @@
##           develop      #946   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           18        18           
  Lines         1585      1584    -1     
=========================================
- Hits          1585      1584    -1     
Files Changed Coverage Δ
pdfplumber/utils/geometry.py 100.00% <100.00%> (ø)

... and 2 files with indirect coverage changes