jsvine/pdfplumber
Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
MIT License
6.57k stars
659 forks
issues
Add encoding feature to page parsing
#287
valentynbez
closed
4 years ago
0
Can not extract any tables from some pdf in Chinese
#286
playgithub
closed
4 years ago
5
In full-lined table, two lines in one cell is recognized as two items
#285
playgithub
closed
4 years ago
15
Raising KeyError when I tried to extract text from PDF file
#284
coderliuhao
closed
3 years ago
3
chars x1 > page.width
#283
fdq09eca
closed
4 years ago
4
Update GitHub Actions to run on pull request events
#282
samkit-jain
closed
4 years ago
2
TableFinder not finding table
#281
ugoleonard
closed
3 years ago
4
fix duplicates in extract_text/extract_words/extract_tables
#280
xv44586
closed
4 years ago
3
Allow GitHub Actions workflow to run on pull requests from forked repos
#279
samkit-jain
closed
4 years ago
3
feature: Option to not striping whitespaces from texts of cells.
#278
AysegulKarcili
opened
4 years ago
1
Option to extract text of cells with leading and trailing white spaces preserved
#277
AysegulKarcili
opened
4 years ago
0
Fetch non tabular data from PDF
#276
coder0028
closed
4 years ago
4
how to know the page bounding box parameters?
#275
fdq09eca
closed
4 years ago
1
Strange characters when extracting text
#274
dasapa
closed
4 years ago
6
Pdfplumber not reading last row of table
#273
ffreller
closed
4 years ago
2
command not found issue
#272
Arwa200
closed
4 years ago
28
spisific data in PDF
#271
Arwa200
closed
4 years ago
1
python pdfplumber error converting pdf to jpg FailedToExecuteCommand `“gswin64c.exe”
#270
shyamalaspure
closed
4 years ago
1
page.extract_words() and page.extract_text() output is empty
#269
NeoWang9999
closed
4 years ago
4
Use the extract_table() method to parse out such a table
#268
wuliKingQin
closed
4 years ago
2
[DISCUSSION] Handling out-of-page rect objects
#267
samkit-jain
closed
4 years ago
2
Not able to crop the page, ValueError: bounding box has an area of zero
#266
zeina99
closed
4 years ago
2
Fewer horizontal lines when using text strategy
#265
samkit-jain
closed
3 years ago
8
Basic example
#264
flaprocha
closed
4 years ago
1
Memory Leakage
#263
cabudies
closed
2 years ago
4
Edged character is lost after extracting the table
#262
fdq09eca
closed
3 years ago
4
Helvetica typeface not recognized
#261
SherlockHua1995
closed
3 years ago
2
Refactor several complex methods and add `extra_attrs` to `.extract_words(...)`
#260
jsvine
closed
4 years ago
1
Text parsing error for a column occupying more than one row
#259
wuliKingQin
closed
4 years ago
10
Add comparisons to other Python PDF libraries
#258
jsvine
closed
4 years ago
0
Comparision of pdfplumber with related libraries
#257
MartinThoma
closed
4 years ago
2
Merge v0.5.23 into stable branch
#256
jsvine
closed
4 years ago
1
[README] Remove email address, add maintainer list
#255
jsvine
closed
4 years ago
1
Fix: Raise ValueError on crop w/ zero-overlap bbox
#254
jsvine
closed
4 years ago
1
Update GitHub Actions workflow
#253
samkit-jain
closed
4 years ago
9
Is there a different language pack for special characters
#252
vesko-vujovic
closed
4 years ago
7
make pdfplumber less verbose by default
#251
johnmathews
closed
4 years ago
5
TXWylie01a-FIN.pdf
#250
shravspy
closed
4 years ago
5
Text from pdf is not readable.
#249
SteveSmirnoff
closed
4 years ago
1
how to ignore the text that is at the edge of the page?
#248
fdq09eca
closed
4 years ago
2
First and last row of table with only horizontal line cannot be extract
#247
amouro
closed
4 years ago
3
Can't install pdfplumber
#246
gui1herme
closed
4 years ago
1
Inconsistent results when cropping an already cropped page
#245
samkit-jain
closed
4 years ago
5
extracting text from a two columns page
#244
fdq09eca
closed
4 years ago
16
Can't extract all text from one file
#243
SteveSmirnoff
closed
4 years ago
4
Text capturing without tabular
#242
ibrahimshuail
closed
4 years ago
16
Add `.annots` and `.hyperlinks`, replacing .annos
#241
jsvine
closed
4 years ago
1
Save pdf file after cropping a page
#240
AntonioMarsella
closed
4 years ago
1
Discussion - Better Table Extraction on "text" Strategy #238
#239
eran-pinhas
opened
4 years ago
1
Discussion - Better Table Extraction on "text" Strategy
#238
BenJacobs1
closed
4 years ago
0