jsvine / pdfplumber

Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.

MIT License

6.57k stars 659 forks

issues

Cannot close pdf file after calling page.extract_text() on it

#446 maciej44 closed 3 years ago
3
to_image of grey text results in a fully white image

#443 linuxsoftware closed 2 years ago
3
How to extract tables in scientific articles

#438 Roy-Kid closed 3 years ago
3
Unable to extract text from table

#431 xhdavid closed 3 years ago
3
Add first draft of CONTRIBUTING.md

#428 jsvine closed 3 years ago
2
to_image function does not include symbols like "+","=" etc

#427 safwanolaimat closed 3 years ago
2
..

#426 kaitdsdev01 closed 3 years ago
0
Text extraction generated words have location that is completely off from where they actually are present

#425 sreeni5493 closed 3 years ago
11
Decode text issue - (cid:49)(cid:52)(cid:56)(cid:44)(cid:56) instead of text

#424 areqq closed 3 years ago
1
Unable to extract more than 1 table from a page

#423 msakthiganesh closed 3 years ago
2
Duplicate value for merged cell instead of `None`

#422 tungph opened 3 years ago
4
text extraction fails when cropping page to page height and width for certain PDFs

#421 sreeni5493 closed 2 years ago
4
Duplicate value for merged cell instead of `None`

#420 tungph opened 3 years ago
3
no root object error

#419 shm007g closed 3 years ago
4
Pass kwds args to Table.extract from Page.extract_table

#415 trifling closed 3 years ago
2
The visualization result is right, but the extraction result is wrong!

#414 OK-JH closed 2 years ago
2
Visual debugging raise ValueError: Decompressed Data Too Large

#413 holytony closed 2 years ago
6
Conflict with pdfminer3k when installing with pip3

#409 aaronzhengwl closed 3 years ago
1
Wrapped string in a cell cannot be read correctly

#408 aaronzhengwl closed 3 years ago
3
Add --laparams to CLI (and make related tweaks)

#407 jsvine closed 3 years ago
1
Handling curved characters and extracting words for curved characters

#404 sreeni5493 closed 3 years ago
2
Annotation coordinates mismatch on landscape-oriented pages

#403 joesmith0 closed 2 years ago
5
extract_words sometimes extracts characters from multiple lines and forms them as words

#400 sreeni5493 closed 2 years ago
4
Why is pdfplumber giving the table as a list [when extracting a table from the PDF file]

#399 bharath-kumarn closed 3 years ago
1
'PSKeyword' object has no attribute 'decode'

#398 HTransistor closed 3 years ago
6
How to extract objects in PDF?

#397 situchen closed 3 years ago
1
readpdf table result is None and text is None

#396 laizezhong closed 3 years ago
2
Page 1 text extracted in Page 2 too

#395 sreeni5493 closed 3 years ago
2
Word extraction for non-0-degree characters extracts individual characters instead of combining them into words when the "Size" parameter is used

#392 sreeni5493 closed 3 years ago
4
relative=True in page.extract_text() not working

#391 LiutongZhou closed 3 years ago
9
crop + extract_text() raises KeyError when laparams is not set to None in pdfplumber.open

#390 LiutongZhou closed 2 years ago
2
Explain `top`/`doctop`/`bottom` vs. `y0`/`y1` in README.md

#389 jsvine opened 3 years ago
0
Even if laparams is set, don't extract `anno` objs

#388 jsvine closed 3 years ago
3
Add guidelines for submitting pull requests

#387 jsvine closed 3 years ago
3
Passing something other than list to `extract_text` results in error

#386 alexreg closed 3 years ago
1
Fixed small issue with ordering of statements in `extract_text`

#385 alexreg closed 3 years ago
5
`KeyError` raised when `laparams` set

#383 alexreg closed 3 years ago
3
Can we get type of line,rect? (Dotted, Non Dotted, Empty box rects)

#382 sreeni5493 opened 3 years ago
9
Can we have exclude box in Page.Crop functionality

#369 sreeni5493 closed 2 years ago
2
Font properties for word and characters

#368 sreeni5493 closed 3 years ago
9
Update GA workflow to build python package

#365 samkit-jain closed 3 years ago
1
Re-add textboxhorizontal/etc. when laparams (#359)

#364 jsvine closed 3 years ago
2
Prevent pip from installing the project on Python versions 3.5 and lower

#363 samkit-jain closed 3 years ago
4
Error importing in ubuntu

#362 ghost closed 3 years ago
4
Missing pdfminer layout related objects "textboxhorizontal" and "textlinehorizontal"

#359 frascuchon closed 3 years ago
2
Is it capable of extracting tables without explicit lines?

#358 sidaliu1014 closed 3 years ago
1
Is there a way to extract only text without tables?

#357 MOHAMED-ENSA closed 3 years ago
1
explicit_strategy issue found for few PDF files

#356 ibrahimshuail closed 3 years ago
12
Rect edges can't be extracted correctly from some pdf

#343 jackstraws closed 3 years ago
4
Rework the alpha conversion to remove jaggies

#340 arlyon closed 3 years ago
4
