jsvine pdfplumber issues

jsvine / pdfplumber

Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.

MIT License

6.57k stars 659 forks

issues

Add encoding feature to page parsing

#287 valentynbez closed 4 years ago
0
Can not extract any tables from some pdf in Chinese

#286 playgithub closed 4 years ago
5
In full-lined table, two lines in one cell is recognized as two items

#285 playgithub closed 4 years ago
15
Raising KeyError when I tried to extract text from PDF file

#284 coderliuhao closed 3 years ago
3
chars x1 > page.width

#283 fdq09eca closed 4 years ago
4
Update GitHub Actions to run on pull request events

#282 samkit-jain closed 4 years ago
2
TableFinder not finding table

#281 ugoleonard closed 3 years ago
4
fix duplicates in extract_text/extract_words/extract_tables

#280 xv44586 closed 4 years ago
3
Allow GitHub Actions workflow to run on pull requests from forked repos

#279 samkit-jain closed 4 years ago
3
feature: Option to not strip whitespace from the text of cells.

#278 AysegulKarcili opened 4 years ago
1
Option to extract text of cells with leading and trailing white spaces preserved

#277 AysegulKarcili opened 4 years ago
0
Fetch non tabular data from PDF

#276 coder0028 closed 4 years ago
4
how to know the page bounding box parameters?

#275 fdq09eca closed 4 years ago
1
Strange characters when extracting text

#274 dasapa closed 4 years ago
6
Pdfplumber not reading last row of table

#273 ffreller closed 4 years ago
2
command not found issue

#272 Arwa200 closed 4 years ago
28
Specific data in PDF

#271 Arwa200 closed 4 years ago
1
python pdfplumber error converting pdf to jpg FailedToExecuteCommand `“gswin64c.exe”

#270 shyamalaspure closed 4 years ago
1
page.extract_words() and page.extract_text() output is empty

#269 NeoWang9999 closed 4 years ago
4
Use the extract_table() method to parse out such a table

#268 wuliKingQin closed 4 years ago
2
[DISCUSSION] Handling out-of-page rect objects

#267 samkit-jain closed 4 years ago
2
Not able to crop the page, ValueError: bounding box has an area of zero

#266 zeina99 closed 4 years ago
2
Fewer horizontal lines when using text strategy

#265 samkit-jain closed 3 years ago
8
Basic example

#264 flaprocha closed 4 years ago
1
Memory Leakage

#263 cabudies closed 2 years ago
4
Edged character is lost after extracting the table

#262 fdq09eca closed 3 years ago
4
Helvetica typeface not recognized

#261 SherlockHua1995 closed 3 years ago
2
Refactor several complex methods and add `extra_attrs` to `.extract_words(...)`

#260 jsvine closed 4 years ago
1
Text parsing error for a column occupying more than one row

#259 wuliKingQin closed 4 years ago
10
Add comparisons to other Python PDF libraries

#258 jsvine closed 4 years ago
0
Comparison of pdfplumber with related libraries

#257 MartinThoma closed 4 years ago
2
Merge v0.5.23 into stable branch

#256 jsvine closed 4 years ago
1
[README] Remove email address, add maintainer list

#255 jsvine closed 4 years ago
1
Fix: Raise ValueError on crop w/ zero-overlap bbox

#254 jsvine closed 4 years ago
1
Update GitHub Actions workflow

#253 samkit-jain closed 4 years ago
9
Is there a different language pack for special characters

#252 vesko-vujovic closed 4 years ago
7
make pdfplumber less verbose by default

#251 johnmathews closed 4 years ago
5
TXWylie01a-FIN.pdf

#250 shravspy closed 4 years ago
5
Text from pdf is not readable.

#249 SteveSmirnoff closed 4 years ago
1
how to ignore the text that is at the edge of the page?

#248 fdq09eca closed 4 years ago
2
First and last row of table with only horizontal lines cannot be extracted

#247 amouro closed 4 years ago
3
Can't install pdfplumber

#246 gui1herme closed 4 years ago
1
Inconsistent results when cropping an already cropped page

#245 samkit-jain closed 4 years ago
5
extracting text from a two columns page

#244 fdq09eca closed 4 years ago
16
Can't extract all text from one file

#243 SteveSmirnoff closed 4 years ago
4
Text capturing without tabular

#242 ibrahimshuail closed 4 years ago
16
Add `.annots` and `.hyperlinks`, replacing .annos

#241 jsvine closed 4 years ago
1
Save pdf file after cropping a page

#240 AntonioMarsella closed 4 years ago
1
Discussion - Better Table Extraction on "text" Strategy #238

#239 eran-pinhas opened 4 years ago
1
Discussion - Better Table Extraction on "text" Strategy

#238 BenJacobs1 closed 4 years ago
0
