jsvine / pdfplumber

Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.

MIT License

6.57k stars, 659 forks

issues

TypeError: unsupported operand type(s) for -: 'float' and 'NoneType'

#726 loganathanspr closed 1 year ago
4
Distinguish between bold and non-bold Fonts

#724 lycfight opened 2 years ago
6
The same table is distributed on two pages, and some data extraction fails

#720 AresElvis closed 2 years ago
0
Information about how to display without Jupyter

#716 josephernest closed 2 years ago
3
pdf plumber to_image( ) OSError: exception: access violation writing 0x0000000000000008

#713 jjjkuba closed 2 years ago
5
automatically make this space read as ""

#709 yihaoshumi closed 2 years ago
0
Mypy compatibility

#703 jhonatan-lopes closed 2 years ago
3
Extracting Z-Value of Rects/Items

#700 JosefJoubert closed 2 years ago
1
AttributeError: partially initialized module 'pdfplumber' has no attribute 'open' (most likely due to a circular import)

#699 lili1234567890 closed 2 years ago
6
pip whl missing `py.typed`

#698 jhonatan-lopes closed 2 years ago
6
`ValueError: bytes must be in range(0, 256)` in page.chars

#695 bpugnaire closed 2 years ago
1
AttributeError: 'LTChar' object has no attribute 'graphicstate' trying to use the table function

#692 Da-vid21 closed 2 years ago
2
_itemgetter function removed from utils.py without deprecationWarning

#691 jfuruness closed 2 years ago
6
Handle `ValueError` exception when searching for text using regex

#687 samkit-jain closed 2 years ago
2
How get merged cells

#685 hbh112233abc opened 2 years ago
9
Page.search results bbox position can be wrong

#684 bpugnaire closed 2 years ago
4
Page.search ValueError: min() arg is an empty sequence

#683 bpugnaire closed 2 years ago
5
Consider punctuation when extracting words

#682 lolipopshock closed 2 years ago
3
pdfplumber will be hung up when open pdf which is damaged

#681 Gadil-1987 closed 2 years ago
8
I have a new problem

#680 Godlikemandyy closed 1 year ago
5
Consider punctuation when extracting words

#678 lolipopshock closed 2 years ago
0
Can I submit a Chinese document translated from README.md

#674 hbh112233abc closed 2 years ago
4
How to extract pdf texts which contains text and tables

#672 Godlikemandyy closed 2 years ago
1
Cannot get metadata nor pages from a scanned pdf without text

#669 BryanKoo closed 2 years ago
2
Identify contiguous rectangles with different fill colors (e.g. formatted table) as one Rectangle object

#668 moreproblems closed 2 years ago
1
Re-process a page

#664 lifepillar closed 2 years ago
2
Size measures char width when upright = False

#663 suedunham closed 2 years ago
1
ValueError when using debug_tablefinder

#659 rneumann7 closed 2 years ago
1
`extract_text(layout=True)` fails if PDF page contains no text

#658 ethanscorey closed 2 years ago
1
Add py.typed marker for PEP 561 compatibility

#657 jhonatan-lopes closed 2 years ago
6
'LTChar' object has no attribute 'graphicstate' Error in a docker container

#655 has-abi closed 2 years ago
9
How to read paragraph?

#654 nianfouyi closed 2 years ago
1
I have table in which a single cell of each row have 3 different columns classified with white space how can I get that column different in each row list

#650 manish291740 closed 2 years ago
1
No spaces extracted found in first_page.chars

#649 sanchez5674 closed 2 years ago
1
Feature/fix

#646 KehaoWu closed 2 years ago
2
page to_image() get stuck in ProcessPoolExecutor

#643 qyhou closed 2 years ago
2
`RecursionError: maximum recursion depth exceeded` in `utils.resolve_all`

#638 jtschoonhoven closed 2 years ago
6
Error

#636 Puneet0353 closed 2 years ago
2
Read pdf error on linux

#635 FANGOD closed 2 years ago
1
Add documentation re. common table-extraction challenges

#634 jsvine opened 2 years ago
0
Character merged incorrectly when using extract_words()

#627 datdao1998 closed 2 years ago
2
Wrong parsing on two columns pdf file

#620 doleron closed 2 years ago
6
font weight

#619 tonystark7cris closed 2 years ago
2
PDF Plumber not extracting tables correctly (text is parsed line by line)

#618 arthurthlee closed 2 years ago
0
After the pdfplumber program is packaged into an exe(py2exe), some pdfs cannot recognize the content

#615 StruggleYang closed 2 years ago
5
How to detect irregular table

#612 Godlikemandyy closed 2 years ago
0
pdfminer extracting text incorrectly

#611 IrinaMax closed 2 years ago
1
Link to ghostscript installation

#608 vmgottin closed 2 years ago
1
Unable to recognize lines of PDF

#607 diorw closed 2 years ago
1
extract_text misses spaces between words

#606 jtjohnston closed 2 years ago
6
