jsvine/pdfplumber
Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
MIT License
6.57k stars, 659 forks
issues
problem when extracting table without horizontal line
#339
lijuanLin
closed
3 years ago
2
Sort tables extracted on a page by their `top` position
#338
samkit-jain
closed
3 years ago
4
tables in a page getting wrong order
#336
gqh1995
closed
3 years ago
6
Unrecognized Font
#335
tscrosb
closed
3 years ago
1
No space between words in extracted text
#334
sivakumar05
closed
3 years ago
5
How to extract data from rectangles?
#333
mugiwara85
closed
3 years ago
1
STSong-light typeface not recognized
#332
flyingpig-yf
closed
3 years ago
5
Extract paragraphs
#331
janandreschweiger
closed
3 years ago
1
Can't find the last column of table
#330
shizidushu
closed
3 years ago
1
I can't extract mathematical expression texts.
#329
ai-motive
closed
3 years ago
1
Format tests according to psf/black and flake8
#327
jsvine
closed
3 years ago
2
Table detected on page with no visible tables
#326
ivoytov
closed
3 years ago
2
Output table is null
#325
zhiminliu
closed
3 years ago
2
Checking for duplicated pages
#324
belisarenata
closed
3 years ago
1
How to extract a table from a picture
#323
a417886
closed
3 years ago
3
Between Two Words --bug
#322
12jakubpavel
closed
3 years ago
6
Doesn't seem to be finding a table
#321
MarynaLongnickel
closed
3 years ago
2
Handle invalid metadata values
#320
samkit-jain
closed
3 years ago
4
Specifying setup.py encoding formats
#319
akaiuun12
closed
3 years ago
2
Can internal links be extracted?
#318
markfirmware
closed
3 years ago
3
Multiple lines are merged into one line
#317
trrk
closed
3 years ago
2
TypeError: 'PDFObjRef' object is not iterable
#316
Abdur-rahmaanJ
closed
3 years ago
6
upgrade the version of pdfminer.six
#315
bimmlerd
closed
2 years ago
8
Extract text without tables
#314
Spiritus44
closed
3 years ago
1
extract_text() from pages should not extract data within tables
#313
lfcbenson
closed
3 years ago
3
File-object given to PDF() and open() should not be closed
#312
the-vindicar
closed
3 years ago
1
Unexpected/nonexistent vertical/horizontal lines generated in tables, which make the table extraction result wrong
#311
guo1017138
closed
3 years ago
7
Can the result of "page.to_image()" be converted to a numpy array or binary stream?
#310
lvbohui
closed
3 years ago
4
Option to allow tables which contain only one single cell
#309
Pique7
closed
3 years ago
2
cannot extract text properly from a cropped pdf
#308
leiyuwork
closed
3 years ago
2
How can I extract table without left and right vertical border correctly
#307
guo1017138
closed
3 years ago
2
Using extract_text() still keeps the blanks, while using .chars loses all the blanks
#306
BriskyGates
closed
3 years ago
2
An error was reported when using the "to_image" method.
#305
lvbohui
closed
3 years ago
2
pip install UnicodeDecodeError
#304
ShaneKao
closed
3 years ago
3
Performance issues when integrating pdfplumber in Scrapy
#303
niwreg-coder
closed
3 years ago
1
Add support for filling the value into all cells that belong to the same merged cell
#302
guo1017138
closed
3 years ago
5
Pdf.crop is not working in 0.5.24
#301
ibrahimshuail
closed
3 years ago
5
How to extract text and tables from pdf pages and delete duplicate text of tables from the result text?
#300
cjmqwerty
closed
3 years ago
2
Extract header and Subheader
#299
ibrahimshuail
closed
3 years ago
8
Handle integer/floating type metadata values
#298
samkit-jain
closed
3 years ago
3
Decode Integer Metadata
#297
prgx-csmith01
closed
3 years ago
6
Reading order of multi-column document.
#296
lvbohui
closed
3 years ago
6
Issue: Unable to extract line spacing between paragraphs
#295
kkarthikvk
closed
3 years ago
1
Fix bug in `dedup_chars()` in which `._objects` was accessed before assignment
#294
samkit-jain
closed
3 years ago
2
AttributeError: 'Page' object has no attribute '_objects'
#293
samkit-jain
closed
3 years ago
2
Troubles with subscripts
#292
Fleur09
closed
3 years ago
4
Can't extract tables with lines but without a bounding rect
#291
playgithub
closed
4 years ago
2
TypeError: unsupported operand type(s) for +: 'float' and 'decimal.Decimal'
#290
Petru-Tanas-ProcessITCS
closed
3 years ago
3
AttributeError: module 'pdfplumber' has no attribute 'open'
#289
lotfiabdelghafour
closed
3 years ago
8
Can't extract a table without lines
#288
playgithub
closed
4 years ago
2