lebedov / python-pdfbox

Python interface to Apache PDFBox command-line tools.

extract_text returns None #13

ecatkins closed 4 years ago

ecatkins commented 4 years ago
OS: MacOS Mojave
Python: 3.7.4
PDFBox: 2.0.16
python-pdfbox: 0.1.7

After upgrading from 0.1.6 to 0.1.7, the extract_text method returns None when I run it as below, whereas it previously returned the correct text.

import pdfbox
p = pdfbox.PDFBox()
text = p.extract_text('/path/to/my_file.pdf')

jgiere commented 4 years ago

OS: Ubuntu 18.04
Python: 3.6.8
PDFBox: 2.0.16
python-pdfbox: 0.1.7

I can confirm this behaviour on my environment as well.

kuraga commented 4 years ago

@lebedov, confirmed.

lebedov commented 4 years ago

Confirmed. The wrappers now conform more closely to the wrapped tools' interface: they only write to output files and no longer return the extracted text.
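
For anyone who still wants the text as a Python string, one possible workaround is to let the wrapper write its output file and then read that file back. The sketch below is not part of python-pdfbox: `extract_text_to_string` is a hypothetical helper, and it assumes that the wrapper writes to a `.txt` file next to the input PDF and that the file is UTF-8 encoded. Check both assumptions against your installed version.

```python
from pathlib import Path


def extract_text_to_string(extract, pdf_path):
    """Call a file-writing extractor, then read back the text it wrote.

    `extract` is any callable that takes the PDF path as a string, for
    example the bound method `pdfbox.PDFBox().extract_text`.
    Assumption: the extractor writes its output next to the input file,
    with the extension replaced by `.txt`, encoded as UTF-8.
    """
    pdf_path = Path(pdf_path)
    extract(str(pdf_path))  # return value is ignored; it is None in 0.1.7

    txt_path = pdf_path.with_suffix('.txt')
    if not txt_path.exists():
        # The extractor did not produce the file we assumed it would.
        return None
    return txt_path.read_text(encoding='utf-8')


# Hypothetical usage with python-pdfbox (requires the package and Java):
#
#   import pdfbox
#   p = pdfbox.PDFBox()
#   text = extract_text_to_string(p.extract_text, '/path/to/my_file.pdf')
```

Passing the extractor in as a callable keeps the helper independent of the library, so it can be exercised with a stand-in function where PDFBox is not installed.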

ecatkins commented 4 years ago

@lebedov So the usage needs to be changed in the README, yes?

lebedov commented 4 years ago

@ecatkins Yes - fixed.