pts / pdfsizeopt

PDF file size optimizer
GNU General Public License v2.0

Invalid PDF token: '\x0b' #153

Open Cubba2412 opened 2 years ago

Cubba2412 commented 2 years ago

Hello

I have a simple scanned PDF receipt from an HP DeskJet 3632, created via the HP Smart app. Whenever I try to run pdfsizeopt on anything scanned from this machine, I get the following error message:

warning: cannot parse obj 2: pdfsizeopt.main.PdfTokenParseError: In obj data between ofs 9 and 3986441: Invalid PDF token: '\x0b'
warning: cannot parse obj 3: pdfsizeopt.main.PdfTokenParseError: syntax error in endobj/endstream

Furthermore, the output PDF is corrupted and produces the following error when I try to open it in Adobe Acrobat:

There was an error opening this document. There was a problem reading this document (14)

When I open it in a browser, it is completely empty.

What does the error stem from, and how can I make my scanned PDFs work with pdfsizeopt?

zvezdochiot commented 2 years ago

> What does the error stem from, and how can I make my scanned PDFs work with pdfsizeopt?

Use pdfsizeopt with cpdf (https://github.com/johnwhitington/cpdf-source) or qpdf (https://github.com/qpdf/qpdf).

Keks-Dose commented 2 years ago

@Cubba2412 Comments by zvezdochiot were not helpful at all in my case. He / she has no clue.

pts commented 1 year ago

It is very unusual for a PDF to contain the character \x0b (ASCII 11, vertical tab) outside strings and stream data, but it is allowed: according to section 3.1.1 of https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/pdfreference1.7old.pdf, \x0b is neither a white-space character nor a delimiter, so it is a regular character and can be part of a token. So pdfsizeopt rejecting it is probably a bug in pdfsizeopt.
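
To make the classification concrete, here is a minimal Python 3 sketch of the three character classes from section 3.1.1 of the PDF Reference. The names `PDF_WHITESPACE`, `PDF_DELIMITERS` and `classify_pdf_byte` are made up for this illustration. This is not how pdfsizeopt's tokenizer is written.

```python
# Character classes as listed in section 3.1.1 of the PDF Reference 1.7.
# All names here are hypothetical; this is not code from pdfsizeopt.

# White space: NUL, HT, LF, FF, CR, SP. Note that VT (0x0b) is NOT included.
PDF_WHITESPACE = frozenset(b"\x00\t\n\x0c\r ")

# Delimiters: ( ) < > [ ] { } / %
PDF_DELIMITERS = frozenset(b"()<>[]{}/%")


def classify_pdf_byte(value):
    """Return 'whitespace', 'delimiter' or 'regular' for a byte value 0..255."""
    if not 0 <= value <= 255:
        raise ValueError("not a byte value: %r" % (value,))
    if value in PDF_WHITESPACE:
        return "whitespace"
    if value in PDF_DELIMITERS:
        return "delimiter"
    # Everything else, including 0x0b, is a regular character.
    return "regular"


if __name__ == "__main__":
    for byte in (0x0B, 0x0C, 0x20, 0x2F, 0x41):
        print("0x%02x -> %s" % (byte, classify_pdf_byte(byte)))
```

Under this reading, a tokenizer that treats \x0b as invalid is stricter than the specification requires.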

Could you please upload one of your input PDFs?

pts commented 1 year ago

As suggested by @zvezdochiot, it is also my gut feeling that running cpdf or qpdf as a workaround before pdfsizeopt may fix this problem. However, we can't possibly know for sure until @Cubba2412 shares one of the input PDFs.
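
A rough sketch of that workaround in Python 3, assuming both `qpdf` and `pdfsizeopt` are on the PATH. The helper names and file names are placeholders, and whether this actually fixes the \x0b error is unverified without a sample file. It relies only on the basic `qpdf infile outfile` and `pdfsizeopt input.pdf output.pdf` invocations.

```python
import shutil
import subprocess


def build_steps(input_pdf, normalized_pdf, output_pdf):
    """Return (command, accepted_exit_codes) pairs for the two-step workaround.

    Hypothetical helper: first rewrite the PDF with qpdf, then optimize the
    rewritten file with pdfsizeopt.
    """
    return [
        # qpdf exits with 3 when it only issued warnings but still wrote
        # the output file, so 3 is accepted here as well as 0.
        (["qpdf", input_pdf, normalized_pdf], (0, 3)),
        (["pdfsizeopt", normalized_pdf, output_pdf], (0,)),
    ]


def run_workaround(input_pdf, normalized_pdf, output_pdf):
    """Run both steps in order, stopping at the first failure."""
    for command, accepted in build_steps(input_pdf, normalized_pdf, output_pdf):
        if shutil.which(command[0]) is None:
            raise RuntimeError("%s not found on PATH" % command[0])
        result = subprocess.run(command)
        if result.returncode not in accepted:
            raise RuntimeError(
                "%s failed with exit code %d" % (command[0], result.returncode)
            )
```

For example, `run_workaround("scan.pdf", "scan.qpdf.pdf", "scan.opt.pdf")` would leave the optimized result in `scan.opt.pdf` if both tools succeed.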

pts commented 1 year ago

@Cubba2412: Can you please upload a sample PDF which demonstrates this bug?