RealDataLLC closed this issue 4 years ago
Hate to be that guy but any update on this? Totally at a loss here. If it helps I'm running on Linux.
> pdftk "that.pdf" dump_data
WARNING: The creator of the input PDF:
that.pdf
has set an owner password (which is not required to handle this PDF).
You did not supply this password. Please respect any copyright.
InfoBegin
InfoKey: Creator
InfoValue: IDM
InfoBegin
InfoKey: CreationDate
InfoValue: D:20180607145130+02'00'
InfoBegin
InfoKey: Producer
InfoValue: PDFlib+PDI 7.0.2 (COM/Win32)
InfoBegin
InfoKey: Author
InfoValue: IntegraDM
PdfID0: 939f2420294646f31f041d74020f2c30
PdfID1: 939f2420294646f31f041d74020f2c30
NumberOfPages: 10
PageMediaBegin
PageMediaNumber: 1
PageMediaRotation: 0
PageMediaRect: 0 0 595.2 841.92
PageMediaDimensions: 595.2 841.92
PageMediaBegin
PageMediaNumber: 2
PageMediaRotation: 0
PageMediaRect: 0 0 595.2 841.92
PageMediaDimensions: 595.2 841.92
PageMediaBegin
PageMediaNumber: 3
PageMediaRotation: 0
PageMediaRect: 0 0 595.2 841.92
PageMediaDimensions: 595.2 841.92
PageMediaBegin
PageMediaNumber: 4
PageMediaRotation: 0
PageMediaRect: 0 0 595.2 841.92
PageMediaDimensions: 595.2 841.92
PageMediaBegin
PageMediaNumber: 5
PageMediaRotation: 0
PageMediaRect: 0 0 595.2 841.92
PageMediaDimensions: 595.2 841.92
PageMediaBegin
PageMediaNumber: 6
PageMediaRotation: 0
PageMediaRect: 0 0 595.2 841.92
PageMediaDimensions: 595.2 841.92
PageMediaBegin
PageMediaNumber: 7
PageMediaRotation: 0
PageMediaRect: 0 0 595.2 841.92
PageMediaDimensions: 595.2 841.92
PageMediaBegin
PageMediaNumber: 8
PageMediaRotation: 0
PageMediaRect: 0 0 595.2 841.92
PageMediaDimensions: 595.2 841.92
PageMediaBegin
PageMediaNumber: 9
PageMediaRotation: 0
PageMediaRect: 0 0 595.2 841.92
PageMediaDimensions: 595.2 841.92
PageMediaBegin
PageMediaNumber: 10
PageMediaRotation: 0
PageMediaRect: 0 0 595.2 841.92
PageMediaDimensions: 595.2 841.92
and
> file that.pdf
that.pdf: PDF document, version 1.7
Unfortunately I cannot share the original PDF, as it contains sensitive data, but reading it works fine, and https://github.com/jcushman/pdfquery handles it without problems.
Same here. Please fix.
Hi, here is a file that gives the same error, "only algorithm code 1 and 2 are supported": MGROS-2017Y.pdf
I realized that this is an issue in the dependency PyPDF2, dating from 2015.
Thanks for your feedback, which prompted me to retry. You are right, @myleshk.
Is the PDF encrypted? Can you try decrypting it using qpdf and then try again?
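For an automated run, the qpdf step suggested here could be scripted roughly as follows. This is only a minimal sketch, assuming the `qpdf` binary is on the PATH; the function names and file paths are placeholders made up for illustration and are not part of Camelot.

```python
import shutil
import subprocess
from pathlib import Path


def build_qpdf_decrypt_command(src: Path, dst: Path) -> list[str]:
    """Return the qpdf invocation that writes a decrypted copy of src to dst."""
    return ["qpdf", "--decrypt", str(src), str(dst)]


def decrypt_with_qpdf(src: Path, dst: Path) -> Path:
    """Run qpdf on src and return the path of the decrypted copy."""
    if shutil.which("qpdf") is None:
        raise RuntimeError("qpdf was not found on PATH")
    result = subprocess.run(build_qpdf_decrypt_command(src, dst))
    # qpdf uses exit status 3 for "succeeded with warnings", so accept 0 and 3.
    if result.returncode not in (0, 3):
        raise RuntimeError(f"qpdf failed with exit status {result.returncode}")
    return dst
```

No `--password` argument is passed here, so this only works when the file can be opened without a user password. The returned path would then be handed to `camelot.read_pdf` in place of the original file.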
@vinayak-mehta mine is not encrypted. And as I said, pdfquery, another Python library, can read it just fine.
I understand. Looked at pdfquery, it looks nice! Interestingly, it also uses pdfminer under the hood. I'll look into this over the weekend.
Sorry for the late responses to issues.
Camelot does not support Acrobat files of version 6 or higher. Convert your PDF file to a lower version (I used Acrobat 4.0, PDF 1.3) with any online converter. That should solve the problem!
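To see which PDF version a file declares before deciding whether to convert it, the `%PDF-x.y` header can be read directly. The snippet below is a rough sketch using only the standard library; `pdf_header_version` is a made-up helper name, and whether the version is really what trips Camelot is this comment's claim, not something the code verifies.

```python
import re
from pathlib import Path
from typing import Optional

# Matches the version marker at the start of a PDF file, e.g. b"%PDF-1.7".
_HEADER = re.compile(rb"%PDF-(\d+)\.(\d+)")


def pdf_header_version(path: Path) -> Optional[tuple[int, int]]:
    """Return (major, minor) from the %PDF-x.y header, or None if absent."""
    with open(path, "rb") as fh:
        head = fh.read(1024)  # the header is expected near the start of the file
    match = _HEADER.search(head)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

Acrobat 6 corresponds to PDF 1.5, so a result of `(1, 5)` or higher would fall under the versions mentioned here. Note that a PDF's document catalog may carry a `/Version` entry that overrides the header, which this sketch ignores.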
@alexxxkorolev thanks for the tip! Any suggestion for a command-line tool, preferably on Linux, that can downgrade PDFs? The problem is that I use camelot in an automated pipeline and cannot convert PDFs manually.
Using pikepdf, as described in https://github.com/mstamy2/PyPDF2/issues/378#issuecomment-689585779, solved it for me.
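For anyone wanting to try the pikepdf route in a pipeline, something along these lines might work. It is a sketch under assumptions, not the exact code from the linked comment: the helper names and the `.decrypted` naming scheme are invented here, and it assumes the PDF opens without a user password.

```python
from pathlib import Path


def decrypted_copy_path(src: Path) -> Path:
    """Place the rewritten copy next to the original: that.pdf -> that.decrypted.pdf."""
    return src.with_name(f"{src.stem}.decrypted{src.suffix}")


def rewrite_with_pikepdf(src: Path) -> Path:
    """Open src with pikepdf and save an unencrypted copy; return the new path."""
    # Third-party dependency, imported lazily so the path helper above
    # remains usable where pikepdf is not installed.
    import pikepdf

    dst = decrypted_copy_path(src)
    with pikepdf.open(src) as pdf:
        # Saving without an `encryption` argument drops any existing encryption.
        pdf.save(dst)
    return dst
```

The returned path can then be passed on, for example `camelot.read_pdf(str(rewrite_with_pikepdf(Path("that.pdf"))))`. Writing a copy instead of overwriting the input keeps the original file untouched if the rewrite turns out not to help.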
Having trouble running this code on my Mac. Using a Conda virtual env, installed using conda. The PDF is not password protected.
import camelot
import pandas as pd
import re
import numpy as np

table1 = camelot.read_pdf('IEEJ - 2019 - Outlook.pdf')
NotImplementedError Traceback (most recent call last)