hefronmedia / pdfsizeopt

Automatically exported from code.google.com/p/pdfsizeopt

Missing trailer after optimization #42

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. python pdfsizeopt.py --use-pngout=false --use-jbig2=false soubor.pdf

What is the expected output? What do you see instead?

The PDF should end with a trailer that indicates the starting point of the 
document, but the trailer is missing. When I open the file with a program that 
uses iText as its library for manipulating PDF documents, I get an exception 
reporting the missing trailer (I confirmed that it is missing using the vim editor).

The exception from the pdfsign tool, which uses iText, is: 
com.itextpdf.text.exceptions.InvalidPdfException: trailer not found.
    at com.itextpdf.text.pdf.PdfReader.rebuildXref(PdfReader.java:1566)
    at com.itextpdf.text.pdf.PdfReader.readPdf(PdfReader.java:521)
    at com.itextpdf.text.pdf.PdfReader.<init>(PdfReader.java:172)
    at com.itextpdf.text.pdf.PdfReader.<init>(PdfReader.java:161)
    at cz.dml.pdfsign.PDFSigner.sign(PDFSigner.java:65)
    at cz.dml.pdfsign.Main.main(Main.java:51)

What version of the product are you using? On what operating system?

I am using the newest version in the repository, on Ubuntu 10.10.

Please provide any additional information below.

item.pdf is the file before running pdfsizeopt (it still has a trailer).
item.pso.pdf is the PDF file after running pdfsizeopt (the trailer is missing).

Original issue reported on code.google.com by hata.ra...@gmail.com on 21 May 2011 at 7:53

GoogleCodeExporter commented 9 years ago
Thank you for composing this bug report.

The item.pso.pdf file you attached is a correct PDF-1.5 file. It also works with 
evince, xpdf and gs. A separate trailer is not required in PDF-1.5 when the file 
uses a cross-reference stream. If iText cannot read it, please report that as an 
iText bug: "iText cannot read a PDF which contains a cross-reference stream".

You might want to use the following command to generate a PDF which iText can 
read:

  pdfsizeopt --do-generate-xref-stream=false --use-pngout=false --use-jbig2=false item.pdf
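
To check by hand which of the two layouts a given file uses, a rough Python sketch along the following lines may help. It is not part of pdfsizeopt; the function name `describe_xref` and the 2048-byte look-ahead are assumptions made for illustration, and the byte-level scan is only a heuristic (it does not parse the PDF).

```python
import re


def describe_xref(pdf_bytes):
    """Guess how a PDF stores its cross-reference data (heuristic byte scan)."""
    # Use the last "startxref" keyword; it is followed by the byte offset
    # of the cross-reference section.
    last = None
    for last in re.finditer(rb"startxref\s+(\d+)", pdf_bytes):
        pass
    if last is None:
        return "no startxref"

    offset = int(last.group(1))
    section = pdf_bytes[offset:offset + 2048]  # look-ahead size is arbitrary

    # A classic cross-reference table starts with the keyword "xref" and is
    # followed by a "trailer" dictionary.
    if section.startswith(b"xref"):
        return "xref table with trailer"

    # A cross-reference stream is an ordinary object whose dictionary
    # has /Type /XRef; there is no "trailer" keyword in that case.
    header = section.split(b"stream", 1)[0]
    if re.match(rb"\d+\s+\d+\s+obj", section) and b"/XRef" in header:
        return "xref stream"

    return "unknown"
```

Calling `describe_xref(open("item.pso.pdf", "rb").read())` should then tell the two cases apart for files like the ones attached here, assuming the file is otherwise well formed.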

Original comment by pts...@gmail.com on 23 May 2011 at 8:41

GoogleCodeExporter commented 9 years ago
iText developers are not completely sure the pdfsizeopt-processed PDF file is 
valid: 
http://sourceforge.net/tracker/?func=detail&aid=3306273&group_id=15255&atid=115255

Original comment by ruzicka....@gmail.com on 23 May 2011 at 1:11

GoogleCodeExporter commented 9 years ago
This is a bug in whatever software created item.pso.pdf

Read section 7.3.8 of ISO-32000-1 on the subject of stream objects: There 
should be an end-of-line marker after the data and before endstream; this 
marker shall not be included in the stream length.
An end-of-line marker consists of either a CARRIAGE RETURN and a LINE FEED or 
just a LINE FEED, and not by a CARRIAGE RETURN alone.
If you look at the xref stream of item.pso.pdf (and the other streams), you see 
that the end-of-line marker is missing: ÷+QN^²xendstream endobj
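
A quick way to see how many streams in a file are affected is a small Python sketch like the one below. It is only a heuristic written for illustration (the function name `count_endstream_without_eol` is made up): it looks at the single byte before each `endstream` keyword, so it cannot tell a real end-of-line marker from stream data that happens to end in that byte, and it would also count an `endstream` that appears inside binary data. Whether a bare CARRIAGE RETURN counts is left as a parameter, since the text above rules it out while PDF in general treats CR, LF and CR LF as end-of-line markers.

```python
import re


def count_endstream_without_eol(pdf_bytes, allow_bare_cr=True):
    """Return (total, missing): endstream keywords seen, and those with no EOL before them."""
    # CR LF and LF both end in LF, so checking one byte covers both;
    # a bare CR is accepted only when allow_bare_cr is True.
    accepted = (b"\n", b"\r") if allow_bare_cr else (b"\n",)
    total = 0
    missing = 0
    for match in re.finditer(rb"endstream", pdf_bytes):
        total += 1
        preceding = pdf_bytes[max(0, match.start() - 1):match.start()]
        if preceding not in accepted:
            missing += 1
    return total, missing
```

For a file written like item.pso.pdf, where the data runs straight into `endstream`, `missing` would be expected to be close to `total`. A reliable check would instead parse each stream and compare its `/Length` with the actual data.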

Original comment by bruno.lo...@gmail.com on 23 May 2011 at 1:17

GoogleCodeExporter commented 9 years ago
Thank you very much for the help. I will try to automatically repair the 
end-of-line marker, which I expect will solve this problem.

Original comment by hata.ra...@gmail.com on 23 May 2011 at 3:32

GoogleCodeExporter commented 9 years ago
pdfsizeopt.py generates PDFs which conform to 
http://wwwimages.adobe.com/www.adobe.com/content/dam/Adobe/en/devnet/pdf/pdfs/pdf_reference_1-7.pdf , 
which says in 3.2.7: It is recommended that there be an 
end-of-line marker after the data and before endstream; this marker is not 
included in the stream length.

It would be a nice new optional feature to make pdfsizeopt.py conform to 
ISO-32000-1 though.

Original comment by pts...@gmail.com on 23 May 2011 at 5:23