Open GoogleCodeExporter opened 8 years ago
Patch:
Index: pdfsizeopt.py
===================================================================
--- pdfsizeopt.py (revision 102)
+++ pdfsizeopt.py (working copy)
@@ -3284,7 +3284,7 @@
trailer_obj.Set('Compress', None) # emitted by Multivalent.jar
# Emitted by Multivalent.jar etc., see section 10.3 in
# pdf_reference_1-7.pdf .
- trailer_obj.Set('ID', None)
+ # trailer_obj.Set('ID', None)
assert trailer_obj.head.startswith('<<')
assert trailer_obj.head.endswith('>>')
output.append('trailer\n%s\n' % trailer_obj.head)
@@ -5777,7 +5777,7 @@
# Please note that we save the space of the removed /ID and /Compress
# below, because /Type/XRef is usually the last object, so we don't
# need to add padding.
- pdf_obj.Set('ID', None)
+ # pdf_obj.Set('ID', None)
pdf_obj.Set('Compress', None)
if pdf_obj.Get('Index') != None:
raise NotImplementedError('unexpected /Index in xref object')
Original comment by lev.bishop
on 1 Nov 2009 at 5:18
Thank you for the bug report and the patch.
pdfsizeopt.py doesn't strive for PDF/A compliance. But if all you need is the
/ID, please add a command-line flag that enables keeping the ID, turned off by
default.
Original comment by pts...@gmail.com
on 15 Nov 2009 at 9:03
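For illustration, the requested opt-in flag could look roughly like the sketch below. The flag name `--keep-id` and both helper functions are hypothetical; they are not part of pdfsizeopt.py, and the trailer is modeled as a plain dict rather than pdfsizeopt's own object type.

```python
def parse_keep_id_flag(argv, default=False):
    """Return (keep_id, remaining_args) for a hypothetical --keep-id flag.

    The flag is off by default, matching the request in the comment above.
    """
    keep_id = default
    remaining = []
    for arg in argv:
        if arg in ('--keep-id', '--keep-id=true'):
            keep_id = True
        elif arg == '--keep-id=false':
            keep_id = False
        else:
            remaining.append(arg)
    return keep_id, remaining


def strip_trailer_keys(trailer, keep_id):
    """Drop /Compress always, and /ID unless the caller asked to keep it.

    `trailer` is a plain dict standing in for the trailer dictionary.
    """
    result = dict(trailer)
    result.pop('Compress', None)
    if not keep_id:
        result.pop('ID', None)
    return result
```

With this shape, the two `Set('ID', None)` calls touched by the patch would become conditional on the flag instead of being commented out.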
In addition to the /ID, PDF/A requires PDF version 1.4 or lower. Therefore, the
-old option should be passed to tool.pdf.Compress. However, this causes problems
that I don't yet understand, so I am still investigating this.
Original comment by lev.bishop
on 25 Jan 2010 at 4:57
It would be nice to add PDF/A compatibility to pdfsizeopt's output -- provided
that its input PDF is also compliant with PDF/A, and the user explicitly asks for
PDF/A output by specifying a command-line flag. However, I definitely don't
want it enabled by default, because it increases the file size.
I'm not going to start adding this feature on my own. If you'd like to contribute, please
attach some (preferably tiny) example PDFs to this bug, for which pdfsizeopt.py
currently doesn't produce PDF/A. I'm closing this bug until you reply.
Do you have software that checks for PDF/A compatibility? Is there free
software for that?
Original comment by pts...@gmail.com
on 11 Feb 2011 at 2:05
Thanks for considering this. I would be glad to work with you on getting it
working. I attach a small file that verifies as PDF/A-1b (using the Acrobat
9.4.1 preflight tool), the result of running pdfsizeopt --use-multivalent=false
on this, and the resulting PDF/A-1b conformance failure report from Acrobat.
The problems are:
1) ID in file trailer missing or incomplete
2) Syntax problem: Stream dictionary improperly formatted
3) Syntax problem: Stream dictionary has improper length entry
4) Syntax problem: Indirect object “endobj” keyword not preceded by an EOL
marker
5) Indirect object “endobj” keyword not followed by an EOL marker
As I said in the previous comment, with --use-multivalent=true it would be
necessary to give the -old option to multivalent, but that breaks other parts
of pdfsizeopt.py. Perhaps in the first place it would be enough to support only
--use-multivalent=false for PDF/A.
I have Acrobat Pro 9.4.1 so I can certainly verify any fixes you implement. I'm
not aware of any free conformance tools, but I can't say that I've looked very
hard.
Original comment by lev.bishop
on 11 Feb 2011 at 3:26
Attachments:
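As a rough sketch of what complaints 3) to 5) ask for, the function below serializes an indirect stream object with an EOL around each keyword and a /Length that equals the exact byte count of the stream data. The function name and signature are hypothetical, and whether this layout alone satisfies Acrobat's check for complaint 2) is an assumption that has not been verified here.

```python
def serialize_stream_obj(obj_num, gen_num, dict_entries, data):
    """Serialize an indirect stream object with EOLs around the keywords.

    dict_entries: bytes holding the dictionary entries other than /Length,
        e.g. b'/Filter/FlateDecode'.
    data: the raw (already encoded) stream bytes.
    """
    # /Length counts only the bytes between 'stream\n' and the EOL that
    # precedes 'endstream'.
    head = b'<<' + dict_entries + b'/Length %d>>' % len(data)
    return b''.join([
        b'%d %d obj\n' % (obj_num, gen_num),
        head, b'\n',
        b'stream\n',
        data,
        b'\nendstream\n',  # this leading EOL is not part of /Length
        b'endobj\n',       # endobj is preceded and followed by an EOL
    ])
```

The trade-off is a few extra newline bytes per object, which is one reason such output would be opt-in rather than the default.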
Sorry, here's the pdfsizeopt output that I forgot to attach
Original comment by lev.bishop
on 11 Feb 2011 at 3:28
Attachments:
Cool, thanks for the details.
I'm happy to make changes to pdfsizeopt.py so that Acrobat preflight won't
complain. But since I don't have that software, the most straightforward way is
that we prepare test input and output files.
I'll implement solutions to complaints 1) ... 5). Stay tuned for an update to
this bug.
I'll add support to pdfsizeopt.py for generating xref streams, no matter if
Multivalent is used.
I'll make sure that pdfsizeopt won't use %PDF-1.5 features, and it would fail
if the input is newer than %PDF-1.4.
I'll try to figure out what kind of /ID should be added if there was none.
I'll also patch pdfsizeopt.py so that it accepts the output of Multivalent
tool.pdf.Compress -old.
Original comment by pts...@gmail.com
on 11 Feb 2011 at 4:41
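The version check mentioned above could be sketched as follows. The function name is hypothetical, and the sketch reads only the `%PDF-` header line; it assumes the document catalog does not override the version with a /Version entry, which a real check would also have to inspect.

```python
import re


def check_pdf_version(data, max_version=(1, 4)):
    """Return the (major, minor) header version, or raise if it is too new.

    data: the beginning of the PDF file as bytes.
    """
    match = re.match(rb'%PDF-(\d)\.(\d+)', data)
    if not match:
        raise ValueError('missing %PDF- header')
    version = (int(match.group(1)), int(match.group(2)))
    if version > max_version:
        raise ValueError(
            'input is PDF %d.%d, newer than %d.%d' % (version + max_version))
    return version
```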
It's probably not necessary to add an /ID if there was none, since this would
mean that the input already did not conform to PDF/A.
Original comment by lev.bishop
on 11 Feb 2011 at 4:51
> It's probably not necessary to add an /ID if there was none, since this would
> mean that the input already did not conform to PDF/A.
You are correct that it's not necessary. But I'd do so anyway, because it's
just a simple modification to pdfsizeopt.py, and can be helpful just in case.
Original comment by pts...@gmail.com
on 11 Feb 2011 at 4:53
Could you please check whether Acrobat preflight accepts /ID[()()] in the trailer
without complaining? What about /ID[(A)(A)]?
Original comment by pts...@gmail.com
on 11 Feb 2011 at 5:14
Sorry it took me a while to figure out how to do this.
/ID[()()] : not accepted
/ID[(A)(A)] : accepted
Original comment by lev.bishop
on 16 Feb 2011 at 8:04
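Given that the empty strings were rejected and non-empty ones accepted, one possible way to produce an /ID when the input has none is sketched below. The function name is hypothetical, and hashing the file bytes with MD5 is just one choice; the thread does not say which scheme pdfsizeopt ended up using, and whether Acrobat preflight accepts the hex-string form has not been tested here.

```python
import hashlib


def make_file_id(pdf_bytes):
    """Build a trailer /ID entry from two identical, non-empty hex strings.

    Both strings are the MD5 digest of pdf_bytes, written as PDF hex
    strings (<...>), so the result is never the rejected /ID[()()].
    """
    digest = hashlib.md5(pdf_bytes).hexdigest().upper()
    return '/ID[<%s><%s>]' % (digest, digest)
```

Deriving the value from the input bytes keeps the output deterministic, so running the tool twice on the same file yields the same /ID.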
Issue 38 has been merged into this issue.
Original comment by pts...@gmail.com
on 4 Mar 2011 at 1:43
Original issue reported on code.google.com by
lev.bishop
on 1 Nov 2009 at 3:32
Attachments: