atlanhq / camelot

Camelot: PDF Table Extraction for Humans
https://camelot-py.readthedocs.io

pdfminer.psparser.PSSyntaxError #341

Closed: linchart closed this issue 5 years ago

linchart commented 5 years ago

A bug appears when I use camelot to read this PDF file. Error message: pdfminer.psparser.PSSyntaxError: Invalid dictionary construct: [/'Type', /'Font', /'Subtype', /'Type0', /'BaseFont', /b"b'", /"ABCDEE+\xcb\xce\xcc\xe5'", /'Encoding', /'Identity-H', /'DescendantFonts', , /'ToUnicode', ] Does anyone know how to fix it?

anakin87 commented 5 years ago

It seems that there is an error in the PDF structure. You can try to repair the PDF using qpdf (http://qpdf.sourceforge.net/)

linchart commented 5 years ago

Thanks for the reply, I will try it. However, the same PDF opened without errors when I used pdfplumber.

vinayak-mehta commented 5 years ago

qpdf should be able to fix your problem.
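
For anyone who wants to script the suggestion above, here is a minimal sketch, not a confirmed fix for this particular file. It assumes the `qpdf` command-line tool is installed and on PATH, and that a plain rewrite (`qpdf infile outfile`) is enough to normalize the broken font dictionary. The helper names (`build_qpdf_command`, `repair_pdf`, `read_tables_after_repair`) and the `repaired.pdf` filename are hypothetical and only used for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_qpdf_command(src, dst):
    """Return the argv for a plain qpdf rewrite: qpdf <infile> <outfile>."""
    return ["qpdf", str(src), str(dst)]


def repair_pdf(src, dst):
    """Rewrite src to dst with qpdf and return dst as a Path."""
    if shutil.which("qpdf") is None:
        raise RuntimeError("qpdf was not found on PATH")
    result = subprocess.run(
        build_qpdf_command(src, dst), capture_output=True, text=True
    )
    # qpdf exits with 0 on success and 3 when it wrote the output but
    # emitted warnings, which is common for damaged files it could recover.
    if result.returncode not in (0, 3):
        raise RuntimeError(f"qpdf failed: {result.stderr.strip()}")
    return Path(dst)


def read_tables_after_repair(src, dst="repaired.pdf"):
    """Repair the PDF first, then hand the rewritten copy to camelot."""
    import camelot  # imported lazily so the helpers above work without it

    return camelot.read_pdf(str(repair_pdf(src, dst)))
```

Calling `read_tables_after_repair("broken.pdf")` (with your own file path) would then run camelot on the rewritten copy. If pdfminer still raises PSSyntaxError afterwards, the rewrite did not change the offending object and the file likely needs a different repair step.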