-
Thanks for your work on this, it's amazing. I get the following error when opening some of our PDF files: 'UglyToad.PdfPig.Core.PdfDocumentFormatException: 'Could not read the first token in the documen…
-
I have installed xpdftotext like this:
http://support.sphiderpro.eu/knowledgebase.php?article=13
then I have installed invoice2pdf like this:
https://github.com/m3nu/invoice2data
then I have install…
-
```
(pdf2textt) C:\Users\korol\PycharmProjects\pdf2text>pip install pdftotext
Collecting pdftotext
Using cached pdftotext-2.1.6.tar.gz (99 kB)
Building wheels for collected packages: pdftotext…
```
Salz0 updated 2 years ago
-
It seems that setting "pdf_ocr" : true will send every PDF to tesseract, even if it's a text-based PDF where the text can be easily extracted without using tesseract. Isn't there a method in place wh…
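
One way to avoid running OCR on text-based PDFs is to attempt plain text extraction first and fall back to tesseract only when very little usable text comes back. Below is a minimal sketch of that decision, assuming the text has already been extracted by some other tool. The function `needs_ocr` and its thresholds are hypothetical and not part of any library mentioned here.

```python
def needs_ocr(extracted_text: str,
              min_chars: int = 50,
              min_alnum_ratio: float = 0.5) -> bool:
    """Heuristic (an assumption, not a library API): decide whether a PDF
    likely lacks a usable text layer, based on the text a plain
    extraction produced."""
    # Drop all whitespace, including form feeds between pages.
    stripped = "".join(extracted_text.split())

    # Almost no text at all: probably a scanned document.
    if len(stripped) < min_chars:
        return True

    # Mostly non-alphanumeric output often indicates a broken text layer.
    alnum = sum(ch.isalnum() for ch in stripped)
    return alnum / len(stripped) < min_alnum_ratio
```

The thresholds are guesses and would need tuning against real documents; a page-by-page check may work better for PDFs that mix scanned and text pages.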
-
When I tried installing `pdfminer3k==1.0.4`, it couldn't find the version.
Instead, it listed versions 1.3.2, 1.3.3, 1.3.4.
So I ran `convert_directory_parallel` with 1.3.4, and I'm getting an err…
-
Hi Sébastien
First of all, my compliments on your pdf2text library, very useful!
I am getting strange characters '??' in words like 'S??henkungen' - in this case the '??' stands for 'c' - for German…
-
Currently nb uses pdftotext, which is suboptimal for PDF files that contain anything other than simple text. Consider using https://github.com/dsanson/termpdf.py which displays rendered PDFs in the terminal…
-
The nodejs example already includes a conversion to SVG (which does not produce usable output for me, because of fonts) and a conversion to PNG via canvas.
It would be great to also have an exampl…
kno10 updated 3 years ago
-
![image](https://user-images.githubusercontent.com/13721550/62819250-200d8900-bb70-11e9-8469-c8290c5b7a54.png)
In the extraction of PDF files, an extra arrow character (↑) is present. Is it a symbo…
-
When trying to encode my PDF, I get the following error:
```
Object list not found. Possible secured file.
```
What does this mean?
If I use pdftotext from the command line, I'm able to output the t…
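
Since the command-line `pdftotext` works for this file, one possible workaround is to call it from Python and read its output. This is a rough sketch, assuming `pdftotext` is installed and on `PATH`. The helper names `pdftotext_command` and `extract_text` are made up for illustration and are not part of the library that raises the error.

```python
import shutil
import subprocess
from typing import List


def pdftotext_command(pdf_path: str, layout: bool = False) -> List[str]:
    """Build the argument list; '-' as the output file means stdout."""
    cmd = ["pdftotext"]
    if layout:
        # -layout tries to preserve the physical layout of the page.
        cmd.append("-layout")
    cmd += [pdf_path, "-"]
    return cmd


def extract_text(pdf_path: str, layout: bool = False) -> str:
    """Run the pdftotext CLI and return the extracted text."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext was not found on PATH")
    result = subprocess.run(
        pdftotext_command(pdf_path, layout),
        capture_output=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    # Replace undecodable bytes rather than failing on odd encodings.
    return result.stdout.decode("utf-8", errors="replace")
```

This does not explain the "Object list not found" message itself; it only sidesteps it by using the tool that is known to handle the file.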