hefronmedia / pdfsizeopt

Automatically exported from code.google.com/p/pdfsizeopt

PdfTokenParseError when trying to optimize pdf #39

Closed: GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
The PDF is the book "Analytic Combinatorics", available from 
http://algo.inria.fr/flajolet/Publications/books.html

I used "./pdfsizeopt.py --use-pngout=true --use-jbig2=true --use-multivalent=true --do-unify-fonts=false" to optimize.

What does pdfsizeopt display when running the command above?

I get:

info: This is pdfsizeopt.py rUNKNOWN.
info: loading PDF from: book.pdf
info: loaded PDF of 12141468 bytes
info: separated to 3673 objs
Traceback (most recent call last):
  File "./pdfsizeopt.py", line 6157, in <module>
    main(sys.argv)
  File "./pdfsizeopt.py", line 6133, in main
    ).Load(file_name)
  File "./pdfsizeopt.py", line 3037, in Load
    do_ignore_generation_numbers=self.do_ignore_generation_numbers)
  File "./pdfsizeopt.py", line 344, in __init__
    (other[start : start + 32], file_ofs))
__main__.PdfTokenParseError: X Y obj expected, got '&nJ\xd2\xde\x12w\xfeFX?T\xd9\x06\x13\xd4\xdbf\xbe\xca\x80\x18\xe7\xb8k\xf7\\\xb87\xda\xa7\x8c' at ofs=688

...and no output pdf is produced.

I checked out the sources, and my copy of pdfsizeopt is identical to the 
current checkout: $Id: pdfsizeopt.py 134 2009-11-29 11:48:12Z ptspts $

Original issue reported on code.google.com by dr.dan.drake@gmail.com on 30 Nov 2010 at 2:08

GoogleCodeExporter commented 9 years ago
Sorry for the long time it took for me to respond.

Thank you for reporting this bug. It's fixed in r143.

Please note that book.pdf is invalid: its xref table contains the offset 688, 
but there is no valid object at that offset. I made pdfsizeopt.py ignore such 
entries (printing a warning instead).
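
To illustrate the kind of check described above, here is a minimal sketch that tests whether each xref offset really points at an "X Y obj" header. It is not the code from r143: the function name `find_bad_xref_offsets`, the regular expression, and the `{object number: offset}` dictionary are hypothetical and chosen only for this example, and it assumes the xref table has already been parsed.

```python
import re

# Matches an indirect object header such as b"12 0 obj".
# Note: \s is only an approximation of PDF whitespace (PDF also allows \0).
OBJ_HEADER_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b")


def find_bad_xref_offsets(data, xref_offsets):
    """Return a sorted list of (obj_num, offset) pairs whose offset does not
    point at an "X Y obj" header for that object number.

    data: the whole PDF file as bytes.
    xref_offsets: dict mapping object number to byte offset (hypothetical
      input format; a real xref table has to be parsed into this form first).
    """
    bad = []
    for obj_num, offset in sorted(xref_offsets.items()):
        match = None
        if 0 <= offset < len(data):
            # Pattern.match(data, pos) anchors the match at byte `offset`.
            match = OBJ_HEADER_RE.match(data, offset)
        if match is None or int(match.group(1)) != obj_num:
            bad.append((obj_num, offset))
    return bad


# Tiny synthetic example: object 1 is valid, object 2 points at binary junk.
sample = b"%PDF-1.4\n1 0 obj\n<<>>\nendobj\n&nJ\xd2\xde\x12w\n"
offsets = {1: sample.index(b"1 0 obj"), 2: sample.index(b"&nJ")}
for obj_num, offset in find_bad_xref_offsets(sample, offsets):
    print("warning: xref entry for obj %d has no object at ofs=%d"
          % (obj_num, offset))
```

Reporting the bad entries instead of raising lets a caller skip them and continue with the remaining objects, which is the behavior the comment above describes.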

Original comment by pts...@gmail.com on 10 Feb 2011 at 8:11