Perfectly reconstructing a jpeg from a huge lossless re-encoding of it #327

Closed chipweinberger closed 5 years ago

chipweinberger commented 5 years ago

Hi,

I spent 80+ hours digitizing all my family photos. Unfortunately, the scanning software I use converted a bunch of the JPEGs to lossless TIFFs and then deleted the original JPEGs. Now the library takes up 10x more space.

Is there any way to perfectly reconstruct the original jpegs?

dcommander commented 5 years ago

Unfortunately not. JPEG is a lossy algorithm, and the loss is incurred during compression (specifically during quantization of the DCT coefficients, which effectively throws away some of the highest-frequency components in the image-- the assumption being that photographic images don't generally have many of those components, so they won't generally be missed.) The only way to transcode a JPEG image into another JPEG image without generational loss is by performing entropy decompression (the lossless part of JPEG compression) on the original image and copying the raw DCT coefficients. (That's what the jpegtran program in libjpeg-turbo does, incidentally.) Once a JPEG image has been decompressed, there is no way to transcode the decompressed image into another JPEG image without incurring generational loss.

You can minimize that generational loss by re-compressing with quality 100 and no subsampling (4:4:4), which means that the only loss in the compression algorithm will be due to round-off error. That is as close as you will be able to get to producing JPEG images with the original amount of image loss, but quality=100 with no subsampling will produce files that are a lot larger than the original JPEG files.

The next best thing is to re-compress the images using the exact same quality and subsampling level as in the original images. If you don't know what that is, then scan another picture as JPEG using the same settings you originally used, and use the identify program in ImageMagick to detect the quality and subsampling that your scanner program is using. In most cases, one generation of JPEG loss in photographic images should be below the threshold of human perception, as long as you use the same or better quality/subsampling settings.

You might also look into PNG, since it has more advanced lossless compression algorithms than TIFF and (depending on the PNG settings) may be able to significantly reduce the disk footprint of your library without resorting to lossy re-compression.

To summarize, these are your options (in order of increasing generational loss):

  1. Re-compress using PNG
  2. Re-compress using JPEG with quality=100 and no subsampling (4:4:4)
  3. Re-compress using JPEG with the same quality/subsampling that were used to generate the original images
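
For anyone in the same situation, here is a hedged command-line sketch of those options, using cjpeg from this repository plus ImageMagick's identify/convert. All file names and the quality/subsampling values in option 3 are placeholders, and the identify format specifiers should be verified against your ImageMagick version:

```sh
# Detect the quality and chroma subsampling the scanner uses, from a JPEG
# produced with the same settings (%Q and %[jpeg:sampling-factor] are
# ImageMagick properties; double-check them on your installation).
identify -format "%Q %[jpeg:sampling-factor]\n" sample_from_scanner.jpg

# Option 1: lossless re-compression to PNG.
convert scan.tif scan.png

# Option 2: JPEG at quality 100 with no chroma subsampling (4:4:4).
convert scan.tif ppm:- | cjpeg -quality 100 -sample 1x1 > scan_q100.jpg

# Option 3: JPEG with the same quality/subsampling as the originals
# (here assuming identify reported quality 92 and 2x2, i.e. 4:2:0).
convert scan.tif ppm:- | cjpeg -quality 92 -sample 2x2 > scan_match.jpg
```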

chipweinberger commented 5 years ago

Amazing answer. I really appreciate that. I'll go down the re-compression route with the same or better quality and subsampling level, as you suggest =)

chipweinberger commented 5 years ago

I am curious, does round-off error come from using floats in the compression software? I wonder if it would be possible to use rational numbers only (theoretically).

Also curious what sort of theoretical "brute force" methods there are - at the block compression level - to try all the (reasonable) JPEG encodings until you compute a JPEG input that decodes to exactly the correct output, and then continue until you find the smallest JPEG that satisfies that condition.

It's an interesting problem.

dcommander commented 5 years ago

> I am curious, does round-off error come from using floats in the compression software? I wonder if it would be possible to use rational numbers only (theoretically).

No, floats actually improve the situation very slightly, but at the expense of performance.

To put numbers on this, the following research paper: https://www.researchgate.net/publication/262897371_Image_Quality_Assessment_Using_the_SSIM_and_the_Just_Noticeable_Difference_Paradigm concluded that an SSIM (structural similarity) of about 95 is the threshold of perceptual loss, i.e. the point at which most human observers will start to notice compression artifacts. That's equivalent to a DSSIM of about 0.01, which can be measured using this software: https://github.com/kornelski/dssim. Thus, I tend to want to see DSSIM values << 0.01 in order to trust that the compression was perceptually lossless.

The difference between the floating-point DCT/IDCT and the "slow integer" DCT/IDCT in libjpeg-turbo is a DSSIM of 0.0005, which is measurable but so far below the threshold of perceptual loss as to be insignificant. (NOTE: the name "slow integer" is a relic of the early 90s, during which that algorithm was significantly slower than the "fast integer" DCT/IDCT on the 286 and 386 stone knives and bear skins available at the time. In libjpeg-turbo, both algorithms are SIMD-accelerated and generally perform within 10% of each other.)
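
As a concrete illustration of that kind of measurement, here is a hedged sketch (placeholder file names; assumes ImageMagick and the dssim tool linked above are installed) that re-compresses a lossless scan with the slow-integer and floating-point DCTs and compares each result against the original:

```sh
# dssim reads PNG images, so create a PNG reference from the lossless scan.
convert scan.tif reference.png

# Compress with the "slow integer" DCT and with the floating-point DCT.
convert scan.tif ppm:- | cjpeg -dct int   -quality 95 -sample 1x1 > dct_int.jpg
convert scan.tif ppm:- | cjpeg -dct float -quality 95 -sample 1x1 > dct_float.jpg

# Decompress back to PNG and measure the structural dissimilarity.
djpeg -pnm dct_int.jpg   | convert ppm:- dct_int.png
djpeg -pnm dct_float.jpg | convert ppm:- dct_float.png
dssim reference.png dct_int.png    # scores well below 0.01 suggest the result
dssim reference.png dct_float.png  # is perceptually lossless, per the paper above
```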

> Also curious what sort of theoretical "brute force" methods there are - at the block compression level - to try all the (reasonable) JPEG encodings until you compute a JPEG input that decodes to exactly the correct output, and then continue until you find the smallest JPEG that satisfies that condition.
>
> It's an interesting problem.

If I understand the question, I think this is sort of what NASA's old DCTune project did. It iterated on a specific image and computed a set of custom quantization tables for that image that would maximize the compression ratio for a desired perceptual quality metric. The quantization tables used in libjpeg-turbo (those tables form the basis for the JPEG quality levels) are just examples, and the libjpeg API and the cjpeg program allow you to specify your own.

Also look at mozjpeg, which applies other techniques for increasing compression ratio at the expense of performance, although it's more about improving compression at a given JPEG quality level, whereas DCTune, by using its own quantization tables, completely bypasses the JPEG quality levels. In both cases, the result is extremely slow compression-- like 50x slower than libjpeg-turbo, in the case of mozjpeg.

The other cool thing you can do with the libjpeg API and cjpeg is to specify a different JPEG quality level for each component, so you could achieve a subsampling-like effect without actually subsampling, by simply assigning a lower quality to chrominance and a higher quality to luminance. I haven't done any experiments with this to see how it compares to subsampling in terms of compression ratio and DSSIM.
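
A sketch of those two cjpeg features, with placeholder file names (`my_tables.txt` is a hypothetical quantization-table file in the text format described in cjpeg's "wizard" documentation):

```sh
# Custom quantization tables: supply one or two 64-entry tables in a text
# file; cjpeg may still scale them by the -quality setting, so check the
# wizard documentation for the exact interaction.
cjpeg -qtables my_tables.txt -sample 1x1 scan.ppm > custom_tables.jpg

# Per-component quality: by default the first value applies to the luminance
# quantization table and the second to chrominance, so this compresses luma
# at quality 95 and chroma at quality 70 without any chroma subsampling.
cjpeg -quality 95,70 -sample 1x1 scan.ppm > mixed_quality.jpg
```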

dcommander commented 5 years ago

I think at least some (most?) of the round-off error occurs in color conversion, as a result of using 8-bit lookup tables to convert from RGB to YCbCr and range-limiting the result. At the expense of performance, it should theoretically be possible to convert directly to/from 16-bit or float YCbCr, which is what the DCT algorithm takes as input, but the libjpeg architecture isn't really designed to do that. However, when built with 12-bit JPEG support, it will do 16-bit-to-16-bit RGB-to-YCbCr color conversion and 32-bit DCT/IDCT (at the cost of losing SIMD acceleration-- at least at the moment.) So compressing an 8-bit-per-component image using 12-bit JPEG is a somewhat extreme way of really limiting the round-off error, but not really worth it because of how big the files would be.
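
For completeness, a build sketch for that 12-bit path, assuming a libjpeg-turbo release whose CMake build exposes the WITH_12BIT option (see BUILDING.md for the authoritative instructions):

```sh
# Configure and build libjpeg-turbo with 12-bit samples. The resulting
# library and cjpeg/djpeg handle only 12-bit-per-sample JPEG data and,
# as noted above, are not SIMD-accelerated.
cd libjpeg-turbo
mkdir build && cd build
cmake -DWITH_12BIT=1 ..
make
```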

dcommander commented 5 years ago

Also, using RGB as the JPEG color space will further reduce round-off error relative to YCbCr, but since the default quantization tables are designed for luminance and chrominance, RGB JPEGs tend to have much worse compression ratios than their YCbCr counterparts, all else being equal.
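
cjpeg exposes this through its -rgb switch; a minimal sketch (placeholder file name):

```sh
# Store the JPEG in the RGB color space, skipping RGB-to-YCbCr conversion
# and its round-off error, at the cost of a much worse compression ratio.
cjpeg -rgb -quality 100 scan.ppm > scan_rgb.jpg
```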

MarcusJohnson91 commented 4 years ago

JPEG is not solely a lossy algorithm; there is a lossless mode in the original spec, which I'm implementing myself.