pts / pdfsizeopt

PDF file size optimizer
GNU General Public License v2.0

Prince 13-produced PDF has all links removed with obj missing warning #145

Closed domenic closed 1 year ago

domenic commented 3 years ago

When I run print-prince13-unoptimized.pdf through pdfsizeopt (with default options), I get print-prince13-optimized-broken.pdf as output, plus hundreds of warnings of the form:

warning: obj 83823 missing, referenced by objs [968]...
warning: obj 73413 missing, referenced by objs [854]...
warning: obj 19294 missing, referenced by objs [274]...

The broken PDF has all of its internal hyperlinks missing. Possibly other things are broken, but that's the most noticeable.

Also interesting: this PDF is produced using Prince from https://html.spec.whatwg.org/. When I use Prince 11 on that same site, I get print-prince11-unoptimized.pdf. pdfsizeopt has no problems optimizing that version; the result is print-prince11-optimized.pdf.

pts commented 1 year ago

Thank you for reporting this! It is caused by a PDF parsing bug in pdfsizeopt. The input file print-prince13-unoptimized.pdf is a hybrid-reference file, meaning that its trailer contains an /XRefStm entry. pdfsizeopt currently doesn't support such PDF files. I'm keeping this issue open to track adding support.
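
For anyone who wants to check whether their own file is affected, here is a rough sketch of how a hybrid-reference file can be spotted. In such a file the classic xref table is followed by a trailer dictionary whose /XRefStm value is the byte offset of an additional cross-reference stream; objects listed only in that stream are invisible to a parser that ignores the entry, which would be consistent with the "obj ... missing" warnings above. This is not pdfsizeopt's actual code: the function name `find_xrefstm_offsets` is made up for illustration, and the regex scan is a heuristic that assumes an uncompressed, conventionally laid out `trailer << ... >> startxref` section rather than doing a real PDF parse.

```python
import re

# Heuristic patterns (assumptions, not a full PDF tokenizer):
# a classic trailer looks like "trailer << ... >> startxref".
_TRAILER_RE = re.compile(rb"trailer\s*<<(.*?)>>\s*startxref", re.DOTALL)
# /XRefStm is followed by an integer byte offset.
_XREFSTM_RE = re.compile(rb"/XRefStm\s+(\d+)")


def find_xrefstm_offsets(pdf_data):
    """Return the /XRefStm offsets found in classic trailer dictionaries.

    A non-empty result suggests a hybrid-reference file.
    `pdf_data` is the raw file content as bytes.
    """
    offsets = []
    for trailer_match in _TRAILER_RE.finditer(pdf_data):
        entry = _XREFSTM_RE.search(trailer_match.group(1))
        if entry:
            offsets.append(int(entry.group(1)))
    return offsets


if __name__ == "__main__":
    import sys

    # Usage: python check_hybrid.py some-file.pdf
    with open(sys.argv[1], "rb") as f:
        found = find_xrefstm_offsets(f.read())
    if found:
        print("hybrid-reference file, /XRefStm offsets:", found)
    else:
        print("no /XRefStm entry found in a classic trailer")
```

Running it on print-prince13-unoptimized.pdf should report at least one offset if the diagnosis above is right. A file that uses only cross-reference streams (no classic trailer at all) will report nothing, since it is not a hybrid-reference file.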

pts commented 1 year ago

Fixed in 7cbb4643b91682bbad2f70eac9220ae1d279285d.