Closed FANGOD closed 2 years ago
Hi @FANGOD, and thanks for your interest in this library. I can confirm that the PDF also processes fine on Mac. Not sure what would be causing the issue w/ Linux but, in any case, the stacktrace seems to indicate that the error stems from pdfminer (the lower-level library we use to extract the PDF's structural information and objects) rather than pdfplumber. For that reason, I'm closing this issue. If you'd like, however, you can open an issue in the pdfminer repository; I would recommend pasting or attaching a minimal Python script that fully reproduces the problem. Something like:
import pdfminer
from pdfminer.high_level import extract_text
print(f"pdfminer version {pdfminer.__version__}")
extract_text('small-Bitdefender-Whitepaper-Virt-CIO-A4-en-EN-screen-compressed.pdf')
... assuming that this does reproduce the problem on Linux.
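Since the behavior differs between Linux, Mac, and Windows, it may also help to include the platform details and the full traceback in the report. Here is a rough sketch of one way to collect them; `build_report` is a hypothetical helper written for this illustration (not part of pdfminer or pdfplumber), and it assumes the PDF has been downloaded into the current directory:

```python
import platform
import sys
import traceback


def build_report(pdf_path, extract, library_version="unknown"):
    """Run extract(pdf_path) and return a plain-text report for an issue.

    `extract` is any callable that takes a path and returns the extracted
    text, e.g. pdfminer.high_level.extract_text.
    """
    lines = [
        f"Platform: {platform.platform()}",
        f"Python: {sys.version.split()[0]}",
        f"pdfminer version: {library_version}",
        f"File: {pdf_path}",
    ]
    try:
        text = extract(pdf_path)
    except Exception:
        # Keep the full traceback so the maintainers can see where it fails.
        lines.append("Result: FAILED")
        lines.append(traceback.format_exc())
    else:
        lines.append(f"Result: OK ({len(text)} characters extracted)")
    return "\n".join(lines)


if __name__ == "__main__":
    try:
        import pdfminer
        from pdfminer.high_level import extract_text
    except ImportError:
        print("pdfminer is not installed in this environment")
    else:
        print(
            build_report(
                "small-Bitdefender-Whitepaper-Virt-CIO-A4-en-EN-screen-compressed.pdf",
                extract_text,
                pdfminer.__version__,
            )
        )
```

Running the same script on each operating system and pasting both outputs side by side should make the difference easier to spot.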
pdfplumber 0.6.0
PDF: https://www.bitdefender.com/content/dam/bitdefender/business/whitepapers/pdf/small-Bitdefender-Whitepaper-Virt-CIO-A4-en-EN-screen-compressed.pdf
Works fine on Windows.