jsvine/pdfplumber
Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
MIT License
6.57k stars
659 forks
issues
TypeError: unsupported operand type(s) for -: 'float' and 'NoneType'
#726
loganathanspr
closed
1 year ago
4
Distinguish between bold and non-bold Fonts
#724
lycfight
opened
2 years ago
6
The same table is distributed on two pages, and some data extraction fails
#720
AresElvis
closed
2 years ago
0
Information about how to display without Jupyter
#716
josephernest
closed
2 years ago
3
pdf plumber to_image( ) OSError: exception: access violation writing 0x0000000000000008
#713
jjjkuba
closed
2 years ago
5
automatically make this space read as ""
#709
yihaoshumi
closed
2 years ago
0
Mypy compatibility
#703
jhonatan-lopes
closed
2 years ago
3
Extracting Z-Value of Rects/Items
#700
JosefJoubert
closed
2 years ago
1
AttributeError: partially initialized module 'pdfplumber' has no attribute 'open' (most likely due to a circular import)
#699
lili1234567890
closed
2 years ago
6
pip whl missing `py.typed`
#698
jhonatan-lopes
closed
2 years ago
6
`ValueError: bytes must be in range(0, 256)` in page.chars
#695
bpugnaire
closed
2 years ago
1
AttributeError: 'LTChar' object has no attribute 'graphicstate' trying to use the table function
#692
Da-vid21
closed
2 years ago
2
_itemgetter function removed from utils.py without deprecationWarning
#691
jfuruness
closed
2 years ago
6
Handle `ValueError` exception when searching for text using regex
#687
samkit-jain
closed
2 years ago
2
How get merged cells
#685
hbh112233abc
opened
2 years ago
9
Page.search results bbox position can be wrong
#684
bpugnaire
closed
2 years ago
4
Page.search ValueError: min() arg is an empty sequence
#683
bpugnaire
closed
2 years ago
5
Consider punctuation when extracting words
#682
lolipopshock
closed
2 years ago
3
pdfplumber will be hung up when open pdf which is damaged
#681
Gadil-1987
closed
2 years ago
8
I have a new problem
#680
Godlikemandyy
closed
1 year ago
5
Consider punctuation when extracting words
#678
lolipopshock
closed
2 years ago
0
Can I submit a Chinese document translated from README.md
#674
hbh112233abc
closed
2 years ago
4
How to extract pdf texts which contains text and tables
#672
Godlikemandyy
closed
2 years ago
1
Cannot get metadata nor pages from a scanned pdf without text
#669
BryanKoo
closed
2 years ago
2
Identify contiguous rectangles with different fill colors (e.g. formatted table) as one Rectangle object
#668
moreproblems
closed
2 years ago
1
Re-process a page
#664
lifepillar
closed
2 years ago
2
Size measures char width when upright = False
#663
suedunham
closed
2 years ago
1
ValueError when using debug_tablefinder
#659
rneumann7
closed
2 years ago
1
`extract_text(layout=True)` fails if PDF page contains no text
#658
ethanscorey
closed
2 years ago
1
Add py.typed marker for PEP 561 compatibility
#657
jhonatan-lopes
closed
2 years ago
6
'LTChar' object has no attribute 'graphicstate' Error in a docker container
#655
has-abi
closed
2 years ago
9
How to read paragraph?
#654
nianfouyi
closed
2 years ago
1
I have table in which a single cell of each row have 3 different columns classified with white space how can I get that column different in each row list
#650
manish291740
closed
2 years ago
1
No spaces extracted found in first_page.chars
#649
sanchez5674
closed
2 years ago
1
Feature/fix
#646
KehaoWu
closed
2 years ago
2
page to_image() get stuck in ProcessPoolExecutor
#643
qyhou
closed
2 years ago
2
`RecursionError: maximum recursion depth exceeded` in `utils.resolve_all`
#638
jtschoonhoven
closed
2 years ago
6
Error
#636
Puneet0353
closed
2 years ago
2
Read pdf error on linux
#635
FANGOD
closed
2 years ago
1
Add documentation re. common table-extraction challenges
#634
jsvine
opened
2 years ago
0
Character merged incorrectly when using extract_words()
#627
datdao1998
closed
2 years ago
2
Wrong parsing on two columns pdf file
#620
doleron
closed
2 years ago
6
font weight
#619
tonystark7cris
closed
2 years ago
2
PDF Plumber not extracting tables correctly (text is parsed line by line)
#618
arthurthlee
closed
2 years ago
0
After the pdfplumber program is packaged into an exe(py2exe), some pdfs cannot recognize the content
#615
StruggleYang
closed
2 years ago
5
How to detect irregular table
#612
Godlikemandyy
closed
2 years ago
0
pdfminer extracting text incorrectly
#611
IrinaMax
closed
2 years ago
1
Link to ghostscript installation
#608
vmgottin
closed
2 years ago
1
Unable to recognize lines of PDF
#607
diorw
closed
2 years ago
1
extract_text misses spaces between words
#606
jtjohnston
closed
2 years ago
6