-
I am running this command on a 25 MB input PDF file:
```
pdf2txt.py -S -t xml -o pdfMinerOutput.xml input.pdf
```
It crashes with this stack trace:
```
Traceback (most recent call last):
File "…
```
-
I removed pytorch==2.0.1 and changed codescikit-learn==1.3.0 to scikit-learn==1.3.0, because otherwise both of them fail to download. With those changes, pip install -r requirements.txt succeeds, but running python pdf2txt.py -i "input_path" -o "output_file" fails with the error: safetensors_r…
-
Issue #164 is still unresolved on Windows 10 with Python 3.7.1.
Has anyone found a solution yet?
-
I'm using pdf2txt.py -t xml to dump the coordinates of each character in a PDF.
Is there a way to get the coordinates of words and lines (instead of individual characters)?
I tried with -A and -M -L…
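
One possible approach, as a minimal sketch: post-process the XML that pdf2txt.py -t xml already writes. It assumes the output nests `<text>` character elements (each with a `bbox="x0,y0,x1,y1"` attribute) inside `<textline>` elements that carry their own `bbox`, and that whitespace appears as `<text>` elements without a `bbox`. Check this against your own output before relying on it. The function names below are hypothetical helpers, not part of pdfminer.

```python
import xml.etree.ElementTree as ET


def parse_bbox(value):
    """Turn an 'x0,y0,x1,y1' attribute string into a tuple of floats."""
    return tuple(float(v) for v in value.split(","))


def union_bbox(boxes):
    """Smallest box that encloses every box in the list."""
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


def words_in_line(textline):
    """Group the characters of one <textline> into (word, bbox) pairs.

    A word ends at any whitespace character or at any <text> element
    that has no bbox attribute (assumed to be inserted spacing).
    """
    words, chars, boxes = [], [], []
    for node in textline.iter("text"):
        char = node.text or ""
        bbox = node.get("bbox")
        if bbox is None or char.strip() == "":
            if chars:
                words.append(("".join(chars), union_bbox(boxes)))
                chars, boxes = [], []
            continue
        chars.append(char)
        boxes.append(parse_bbox(bbox))
    if chars:
        words.append(("".join(chars), union_bbox(boxes)))
    return words


def lines_and_words(xml_text):
    """Return a list of (line_text, line_bbox, words) for every <textline>."""
    root = ET.fromstring(xml_text)
    result = []
    for line in root.iter("textline"):
        words = words_in_line(line)
        line_text = " ".join(word for word, _ in words)
        result.append((line_text, parse_bbox(line.get("bbox")), words))
    return result
```

To try it, read the XML file produced by pdf2txt.py into a string and pass it to `lines_and_words`. Word boundaries here depend entirely on where the XML contains whitespace, so the result is only as good as the layout analysis that produced the file.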
-
I am using pdfminer's pdf2txt.py to extract text from various PDFs. The algorithm works very well in a lot of scenarios, but I am getting this error and I'm not sure what I can do to get pdfminer t…
-
- A description of the bug
While trying to extract images from a one-page PDF, I hit a KeyError. The file opens fine in PDF viewers such as Okular and Evince.
- Steps to reproduce the bug.
The command I …
-
Platform: Windows 10, Python 3.7.0.
I tried running
**pdf2txt.py samples/simple1.pdf**
but it just opens the .py file and produces no output.
-
pdf2txt prints -F boxes_flow as an available option, but it is not documented on the web page or in the manual.
-
When I type
pdf2txt.py C:\gropid\input\Attention.pdf -o output C:\gropid\output\
I get no result. Am I missing something?
-
Running the command:
python pdf2txt.py -t xml -A
produces a verifiable error in the bounding boxes.
(Please email me for the PDF.)