Optimize JPEG images within the PDF

GoogleCodeExporter commented 9 years ago

** These produce pixel-identical (lossless), but smaller files:
   * jpegoptim -t --strip-all *.jpg
   * imgopt *.jpg
   * for F in *.jpg; do
     jpegtran -copy none -optimize -outfile jo.bin "$F" &&
     mv -f jo.bin "$F"; done
   ... but make sure to run jfifremove (part of imgopt, simple code, unused)
   etc. first to remove JFIF, EXIF etc. metadata; `-copy none' also
   removes comments and other extra markers
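
One way to check that a rewritten file really decodes to the same image is to decode both versions and compare the pixel data. This is only a minimal sketch: it assumes a POSIX shell and that djpeg (shipped with libjpeg alongside jpegtran) is on the PATH, and same_pixels is a hypothetical helper name, not something from imgopt or pdfsizeopt.

```shell
#!/bin/sh
# Sketch only: same_pixels is a hypothetical helper, not part of pdfsizeopt.
# Returns 0 if both JPEGs decode to identical pixel data,
# 1 if they differ or cannot be decoded,
# 2 if a temporary file could not be created.
same_pixels() {
  a=$(mktemp) || return 2
  b=$(mktemp) || { rm -f "$a"; return 2; }
  # djpeg writes the decoded image (PPM/PGM by default) to stdout.
  if djpeg "$1" > "$a" && djpeg "$2" > "$b" && cmp -s "$a" "$b"; then
    status=0
  else
    status=1
  fi
  rm -f "$a" "$b"
  return "$status"
}
```

Possible use, with a backup copy kept before optimizing: same_pixels backup.jpg optimized.jpg && echo "pixels unchanged"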

See more about this in trunk/info.txt .

Original issue reported on code.google.com by pts...@gmail.com on 7 Aug 2012 at 4:00

GoogleCodeExporter commented 9 years ago

jhead -purejpg *.jpg (after jpegtran) leaves the file as pure jpeg, AFAIK. (If 
I am incorrect, please correct me).

And both of these are already packaged in Debian and Ubuntu.

Since you know the inner workings of PDF files: if one chooses a high enough 
PDF version, would JPEG2000 be a simple drop-in replacement for JPEG, or are 
there other parts of the document structure that would need to be changed?

Original comment by rbr...@gmail.com on 27 Feb 2013 at 11:22

GoogleCodeExporter commented 9 years ago

Thank you for posting these commands for JPEG optimization. It would be a 
useful feature to add to pdfsizeopt. I also have some similar commands somewhere 
to try.

About JPEG2000: it's a completely different lossy image compression algorithm. 
It's a design principle of pdfsizeopt that it can't cause visual quality loss. 
So converting JPEG images to JPEG2000 won't be added. (We can relax this 
requirement in the future if there are volunteers for adding support: they can 
add a command-line flag disabled by default.)

Original comment by pts...@gmail.com on 27 Feb 2013 at 1:41

GoogleCodeExporter commented 9 years ago

Thanks for the comment.

Regarding JPEG2000, I think that it is, indeed, better to stick with the current 
philosophy of being "as lossless as we can" and just scrap that idea of mine.

Original comment by rbr...@gmail.com on 28 Feb 2013 at 3:27

GoogleCodeExporter commented 9 years ago

I suggest running jpegtran (at least) twice to create two optimized jpeg files, 
one with "-progressive", one without, and choosing the smaller one. 

Usually, especially for photos, the progressive version is smaller. For simpler 
and smaller images, sometimes the non-progressive version is smaller.

Jpegtran is superfast, so multiple runs will hardly affect compression time.
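
The two-pass idea above could be sketched roughly as follows. This assumes a POSIX shell with jpegtran and mktemp on the PATH; the function names smallest_file and optimize_jpeg are made up for illustration. The original file is listed first, so it is kept whenever neither candidate is strictly smaller.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of pdfsizeopt.

# Print the path of the smallest existing file among the arguments.
# The first file wins ties.
smallest_file() {
  best=""
  best_size=""
  for f in "$@"; do
    [ -f "$f" ] || continue
    size=$(($(wc -c < "$f")))
    if [ -z "$best" ] || [ "$size" -lt "$best_size" ]; then
      best=$f
      best_size=$size
    fi
  done
  if [ -n "$best" ]; then
    printf '%s\n' "$best"
  fi
}

# Losslessly recompress one JPEG in place, keeping whichever of
# original / baseline / progressive is smallest.
optimize_jpeg() {
  src=$1
  tmpdir=$(mktemp -d) || return 1
  base=$tmpdir/baseline.jpg
  prog=$tmpdir/progressive.jpg
  # Drop a candidate if jpegtran fails, so a truncated file can never win.
  jpegtran -copy none -optimize -outfile "$base" "$src" || rm -f "$base"
  jpegtran -copy none -optimize -progressive -outfile "$prog" "$src" || rm -f "$prog"
  winner=$(smallest_file "$src" "$base" "$prog")
  if [ -n "$winner" ] && [ "$winner" != "$src" ]; then
    mv -f "$winner" "$src"
  fi
  rm -rf "$tmpdir"
}
```

Possible use: for F in *.jpg; do optimize_jpeg "$F"; done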

To optimize file size further, the order of the progressive scans can be 
changed with the jpegtran parameter "-scans". The default scan script built into 
jpegtran is optimized for progressive display on the web, not for file size.

To keep the speed high, a number of fixed scanfiles, optimized for filesize, 
can be used to create different jpeg-files. Then just take the smallest one. I 
suggest using the scanfiles from this file: 
http://akuvian.org/src/jpgcrush.tar.gz

If speed is not the first priority, a scanfile can be created for each image by 
finding a good combination using a tool called jpegrescan.

Typical example:
original jpeg: 770138 bytes (not optimized, 61 bytes of metadata)
jpegtran -copy none -optimize -outfile out.jpg in.jpg : 623913 bytes
jpegtran -copy none -optimize -progressive -outfile out.jpg in.jpg : 618403 bytes
jpegtran -copy none -optimize -progressive -scans jpeg_scan_rgb.txt -outfile out.jpg in.jpg : 606477 bytes
(the third example needs "jpeg_scan_rgb.txt" from 
http://akuvian.org/src/jpgcrush.tar.gz)
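
Trying a fixed set of scanfiles and keeping the smallest result might look something like this. It is a rough sketch, assuming a POSIX shell, jpegtran on the PATH, and a directory of scan scripts named *.txt (for example, unpacked from the archive above; the directory layout is an assumption). The names file_size and crush_with_scans are hypothetical.

```shell
#!/bin/sh
# Sketch only: helper names and the *.txt layout are assumptions.

# Print the size of a file in bytes.
file_size() {
  echo $(($(wc -c < "$1")))
}

# Try every scan script in scan_dir on src and overwrite src with the
# smallest output, but only if it is smaller than the original.
crush_with_scans() {
  src=$1
  scan_dir=$2
  best=$(mktemp) || return 1
  cand=$(mktemp) || { rm -f "$best"; return 1; }
  cp "$src" "$best"
  for scans in "$scan_dir"/*.txt; do
    [ -f "$scans" ] || continue
    # Ignore a scan script if jpegtran rejects it for this image.
    if jpegtran -copy none -optimize -progressive -scans "$scans" \
         -outfile "$cand" "$src" &&
       [ "$(file_size "$cand")" -lt "$(file_size "$best")" ]; then
      cp "$cand" "$best"
    fi
  done
  if [ "$(file_size "$best")" -lt "$(file_size "$src")" ]; then
    cp "$best" "$src"
  fi
  rm -f "$best" "$cand"
}
```

Possible use: crush_with_scans in.jpg ./scanfiles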

Original comment by Sebastia...@googlemail.com on 12 Sep 2013 at 2:24

GoogleCodeExporter commented 9 years ago

@SebastianWilke78: Thank you for the insights, links and great ideas!

Now I only need some free time to implement this. I also accept patches.

Original comment by pts...@gmail.com on 12 Sep 2013 at 5:24
