jsvine/pdfplumber
Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
MIT License, 6.57k stars, 659 forks
issues
Cannot close pdf file after calling page.extract_text() on it
#446
maciej44
closed
3 years ago
3
to_image of grey text results in a fully white image
#443
linuxsoftware
closed
2 years ago
3
How to extract tables in scientific articles
#438
Roy-Kid
closed
3 years ago
3
Unable to extract text from table
#431
xhdavid
closed
3 years ago
3
Add first draft of CONTRIBUTING.md
#428
jsvine
closed
3 years ago
2
to_image function does not include symbols like "+","=" etc
#427
safwanolaimat
closed
3 years ago
2
..
#426
kaitdsdev01
closed
3 years ago
0
Text extraction generated words have location that is completely off from where they actually are present
#425
sreeni5493
closed
3 years ago
11
Decode text issue - (cid:49)(cid:52)(cid:56)(cid:44)(cid:56) instead of text
#424
areqq
closed
3 years ago
1
Unable to extract more than 1 table from a page
#423
msakthiganesh
closed
3 years ago
2
Duplicate value for merged cell instead of `None`
#422
tungph
opened
3 years ago
4
text extraction fails when cropping page to page height and width for certain PDFs
#421
sreeni5493
closed
2 years ago
4
Duplicate value for merged cell instead of `None`
#420
tungph
opened
3 years ago
3
no root object error
#419
shm007g
closed
3 years ago
4
Pass kwds args to Table.extract from Page.extract_table
#415
trifling
closed
3 years ago
2
The visualization result is right, but the extraction result is wrong!
#414
OK-JH
closed
2 years ago
2
Visual debugging raise ValueError: Decompressed Data Too Large
#413
holytony
closed
2 years ago
6
conflict with pdfmine3k in installing with pip3
#409
aaronzhengwl
closed
3 years ago
1
Wrapped string in a cell cannot be read correctly
#408
aaronzhengwl
closed
3 years ago
3
Add --laparams to CLI (and make related tweaks)
#407
jsvine
closed
3 years ago
1
Handling curved characters and extracting words for curved characters
#404
sreeni5493
closed
3 years ago
2
Annotation coordinates mismatch on landscape-oriented pages
#403
joesmith0
closed
2 years ago
5
extract_words sometimes extracts characters from multiple lines and forms them as words
#400
sreeni5493
closed
2 years ago
4
Why is pdfplumber returning the table as a list [when extracting a table from the pdf file]
#399
bharath-kumarn
closed
3 years ago
1
PSKeyword' object has no attribute 'decode
#398
HTransistor
closed
3 years ago
6
How to extract objects in PDF?
#397
situchen
closed
3 years ago
1
readpdf table result is None and text is None
#396
laizezhong
closed
3 years ago
2
Page 1 text extracted in Page 2 too
#395
sreeni5493
closed
3 years ago
2
Word extraction for non 0 degree characters is extracting characters and not combining characters to word when "Size" parameter is used
#392
sreeni5493
closed
3 years ago
4
relative=True in page.extract_text() not working
#391
LiutongZhou
closed
3 years ago
9
crop + extract_text() raises KeyError when laparams is not set to None in pdfplumber.open
#390
LiutongZhou
closed
2 years ago
2
Explain `top`/`doctop`/`bottom` vs. `y0`/`y1` in README.md
#389
jsvine
opened
3 years ago
0
Even if laparams is set, don't extract `anno` objs
#388
jsvine
closed
3 years ago
3
Add guidelines for submitting pull requests
#387
jsvine
closed
3 years ago
3
Passing something other than list to `extract_text` results in error
#386
alexreg
closed
3 years ago
1
Fixed small issue with ordering of statements in `extract_text`
#385
alexreg
closed
3 years ago
5
`KeyError` raised when `laparams` set
#383
alexreg
closed
3 years ago
3
Can we get type of line,rect? (Dotted, Non Dotted, Empty box rects)
#382
sreeni5493
opened
3 years ago
9
Can we have exclude box in Page.Crop functionality
#369
sreeni5493
closed
2 years ago
2
Font properties for word and characters
#368
sreeni5493
closed
3 years ago
9
Update GA workflow to build python package
#365
samkit-jain
closed
3 years ago
1
Re-add textboxhorizontal/etc. when laparams (#359)
#364
jsvine
closed
3 years ago
2
Prevent pip from installing the project on Python versions 3.5 and lower
#363
samkit-jain
closed
3 years ago
4
Error importing in ubuntu
#362
ghost
closed
3 years ago
4
Missing pdfminer layout related objects "textboxhorizontal" and "textlinehorizontal"
#359
frascuchon
closed
3 years ago
2
Is it capable of extracting tables without explicit lines?
#358
sidaliu1014
closed
3 years ago
1
Is there a way to extract only text without tables ?
#357
MOHAMED-ENSA
closed
3 years ago
1
explicit_strategy issue found for few PDF files
#356
ibrahimshuail
closed
3 years ago
12
Rect edges can't be extracted correctly from some pdf
#343
jackstraws
closed
3 years ago
4
Rework the alpha conversion to remove jaggies
#340
arlyon
closed
3 years ago
4