cannot read object with index 3 1 obj

DenKn commented 2 years ago

Hi. After a seemingly successful compression run, I got a corrupted PDF file, even though the input file is very simple. The input.pdf and output.pdf files are attached.

host:~/pdfsizeopt# ./pdfsizeopt input.pdf output.pdf
info: This is pdfsizeopt ZIP rUNKNOWN size=69734.
info: prepending to PATH: /root/pdfsizeopt/pdfsizeopt_libexec
info: loading PDF from: input.pdf
info: loaded PDF of 64094 bytes
info: separated to 43 objs + xref + trailer
info: parsed 43 objs
info: eliminated 33 unused objs, depth=4
info: found 0 Type1 fonts loaded
info: found 0 Type1C fonts loaded
info: optimized 2 streams, kept 2 zip
warning: obj 3 missing, referenced by objs [7]...
info: compressed 0 streams, kept 0 of them uncompressed
info: saving PDF with 10 objs to: output.pdf
info: trying 3 jobs and using the smallest
info: generated object stream of 455 bytes in 8 objects (27%)
info: job original generated 1580 bytes (2%)
info: job xrefstm generated 1878 bytes (3%)
info: job nostm generated 2031 bytes (3%)
info: jobs result: original=1580 xrefstm=1878 nostm=2031
info: generated 1580 bytes (2%)

Any ideas? Thanks.

Attachments: output.pdf, input.pdf

zvezdochiot commented 2 years ago

Hi @DenKn.

Use cpdf or qpdf before and after using pdfsizeopt.

See also #119

DenKn commented 2 years ago

Thanks. qpdf solved my problem.

pts commented 1 year ago

Thank you for reporting this bug! The attached input.pdf file contains an object with the header 3 1 obj (object number 3, generation number 1), and pdfsizeopt is unable to read it, because it supports only N 0 obj (with any positive N). This is indeed a bug in pdfsizeopt. Until the bug is fixed, you should preprocess the input file with qpdf.
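
For anyone wanting to check whether a file is affected before running pdfsizeopt, here is a minimal sketch in Python. It is not pdfsizeopt code: the function name `find_nonzero_generation_objs` and the regular expression are assumptions made for illustration. It does a raw byte scan for `N G obj` headers with G > 0, so it can report false positives from binary stream data and it cannot see objects stored inside compressed object streams.

```python
import re

# Matches an indirect-object header such as b"3 1 obj": object number,
# generation number, then the keyword "obj". Heuristic only, not a PDF parser.
OBJ_HEADER_RE = re.compile(rb"(?<![0-9])(\d+)\s+(\d+)\s+obj\b")


def find_nonzero_generation_objs(pdf_data):
    """Return sorted (object_number, generation) pairs with generation > 0."""
    found = set()
    for match in OBJ_HEADER_RE.finditer(pdf_data):
        obj_num = int(match.group(1))
        gen = int(match.group(2))
        if gen > 0:
            found.add((obj_num, gen))
    return sorted(found)


# Tiny hand-written sample, not the attached input.pdf.
sample = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<< /Type /Catalog >>\nendobj\n"
    b"3 1 obj\n<< >>\nendobj\n"
)
print(find_nonzero_generation_objs(sample))  # prints [(3, 1)]
```

To try it on a real file, read the file in binary mode (`open(path, "rb").read()`) and pass the bytes to the function. A non-empty result suggests the file has objects like 3 1 obj and may need the qpdf preprocessing step on pdfsizeopt versions without the fix.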

pts commented 1 year ago

Fixed in 2b006992319a6a451438f03bc35a334033a3b1e2. It works now:

info: This is pdfsizeopt rUNKNOWN size=402206.
info: prepending to PATH: ./pdfsizeopt_libexec
info: loading PDF from: input156.pdf
info: loaded PDF of 64094 bytes
info: separated to 43 objs + xref + trailer
info: parsed 43 objs
info: eliminated 3 unused objs, depth=6
info: found 0 Type1 fonts loaded
info: found 0 Type1C fonts loaded
info: optimized 17 streams, kept 3 #orig, 14 zip
info: eliminated 14 duplicate objs
info: compressed 0 streams, kept 0 of them uncompressed
info: saving PDF with 26 objs to: input156.pso.pdf
info: generated object stream of 1133 bytes in 16 objects (17%)
info: generated 33600 bytes (52%)

pts / pdfsizeopt

cannot read object with index 3 1 obj #156