schnaader / precomp-cpp

Precomp, C++ version - further compress already compressed files
http://schnaader.info/precomp.php
Apache License 2.0

Using FLIF for image compression #44

Open schnaader opened 8 years ago

schnaader commented 8 years ago

Image data is present in many of the streams Precomp supports, e.g. PDF, PNG and GIF. On this data, specialized image compression algorithms can be used to get better compression ratio than with the general purpose compression methods the -c option offers (bzip/lzma).

At the moment, the -pdfbmp option does something similar by converting image data detected in PDF documents to BMP format so higher compression ratios can be achieved when processing the resulting files with compressors from the PAQ family later.

Compressing the image data using the FLIF library would offer a way to get higher compression ratios on image data using only Precomp. This would also be useful for people that want to convert existing image files to FLIF without losing the original file.

M-Gonzalo commented 4 years ago

FLIF development has stopped, since FLIF has been superseded by FUIF and then again by JPEG XL (https://jpeg.org/jpegxl/index.html), which is based on a combination of Pik and FUIF. A royalty-free and open-source reference implementation of JPEG XL is available on GitLab.

JPEG XL also includes a complete instance of brunsli to do bit-lossless compression of JPEGs...

M-Gonzalo commented 4 years ago

Another thing: correct me if I'm wrong, but wouldn't this issue require some kind of bit-lossless conversion from the filtered data, as used by each format, to some intermediate bitmap type like TIF or PNM? AFAIK there is no such thing yet. There are GIF and PNG pixel-lossless compressors, but that's it. I don't even know if it is possible to guess and reproduce the exact way any given GIF or PNG was filtered before LZW or deflate compression...

Even if it were possible, there is an alternative to FLIF or JPEG XL that would make more sense in some cases, like files with lots of tiny bitmaps that are very similar to each other: use a simple filter to improve further compression with whatever comes after precomp (or the internal -cb/-cl). It's faster and doesn't force individual compression of each stream.

See this example on a folder with PNMs:

| COMPRESSOR | RATIO | GAIN | TRANSFORM |
|---|---|---|---|
| bcm | 40,55 % | 10,84 % | paq8pxd89 -s0 |
| bcm | 43,79 % | 7,60 % | mfilter |
| bcm | 44,16 % | 7,23 % | fazip mm |
| bcm | 51,39 % | - | - |
| 7z | 50,99 % | 12,37 % | paq8pxd89 -s0 |
| 7z | 51,09 % | 12,27 % | mfilter |
| 7z | 52,12 % | 11,24 % | fazip mm |
| 7z | 63,36 % | - | - |
| precomp | 50,99 % | 12,46 % | paq8pxd89 -s0 |
| precomp | 51,14 % | 12,31 % | mfilter |
| precomp | 52,17 % | 11,28 % | fazip mm |
| precomp | 63,45 % | - | - |

schnaader commented 3 years ago

> Another thing: correct me if I'm wrong, but wouldn't this issue require some kind of bit-lossless conversion from the filtered data, as used by each format, to some intermediate bitmap type like TIF or PNM? AFAIK there is no such thing yet. There are GIF and PNG pixel-lossless compressors, but that's it. I don't even know if it is possible to guess and reproduce the exact way any given GIF or PNG was filtered before LZW or deflate compression...

Yes, but luckily defiltering PNG is much easier than brute-forcing the filters: the filter type applied to each scanline is stored in the compressed PNG, so the filtering can be undone and redone easily. Both paq8px and paq8pxd do this.
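
For illustration, here is a minimal sketch (not Precomp's or paq8px's actual code) of what undoing the filtering looks like: the filter-type byte stored in front of each scanline tells the decoder which predictor to invert, and keeping those bytes around makes the step fully reversible.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Paeth predictor as defined in the PNG specification.
static uint8_t paeth(uint8_t a, uint8_t b, uint8_t c) {
    int p = a + b - c;
    int pa = std::abs(p - a), pb = std::abs(p - b), pc = std::abs(p - c);
    if (pa <= pb && pa <= pc) return a;
    if (pb <= pc) return b;
    return c;
}

// Undo the per-scanline PNG filters in place. `data` holds the scanlines as
// they come out of the deflate stream: each row starts with one filter-type
// byte (0..4) followed by `rowBytes` filtered bytes; `bpp` is bytes per pixel.
// All additions wrap modulo 256, exactly as the PNG spec requires.
void unfilterScanlines(std::vector<uint8_t>& data, size_t rows,
                       size_t rowBytes, size_t bpp) {
    std::vector<uint8_t> prev(rowBytes, 0);             // previous, already unfiltered row
    for (size_t y = 0; y < rows; ++y) {
        uint8_t* row = &data[y * (rowBytes + 1)];
        const uint8_t filter = row[0];                  // kept, so the step can be redone later
        uint8_t* cur = row + 1;
        for (size_t x = 0; x < rowBytes; ++x) {
            uint8_t a = (x >= bpp) ? cur[x - bpp] : 0;  // left neighbor
            uint8_t b = prev[x];                        // neighbor above
            uint8_t c = (x >= bpp) ? prev[x - bpp] : 0; // upper-left neighbor
            switch (filter) {
                case 1: cur[x] += a; break;               // Sub
                case 2: cur[x] += b; break;               // Up
                case 3: cur[x] += (a + b) / 2; break;     // Average
                case 4: cur[x] += paeth(a, b, c); break;  // Paeth
                default: break;                           // None
            }
        }
        std::copy(cur, cur + rowBytes, prev.begin());
    }
}
```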

> Even if it were possible, there is an alternative to FLIF or JPEG XL that would make more sense in some cases, like files with lots of tiny bitmaps that are very similar to each other: use a simple filter to improve further compression with whatever comes after precomp (or the internal -cb/-cl). It's faster and doesn't force individual compression of each stream.

> See this example on a folder with PNMs:

Interesting, I'll have a look at which transforms paq8pxd does there. I know of at least one that is used for 24- and 32-bit image data: Transforming (R,G,B) to (G, G-R, G-B). Might give a quick first step on that road, thanks!
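
For reference, a minimal sketch of that color transform on interleaved 24-bit RGB data; the function names are made up and this is not paq8pxd's implementation, but since all arithmetic is modulo 256 the transform is exactly reversible:

```cpp
#include <cstddef>
#include <cstdint>

// Forward transform: (R, G, B) -> (G, G-R, G-B), in place.
void rgbToGGrGb(uint8_t* pixels, size_t pixelCount) {
    for (size_t i = 0; i < pixelCount; ++i) {
        uint8_t r = pixels[3 * i], g = pixels[3 * i + 1], b = pixels[3 * i + 2];
        pixels[3 * i]     = g;                 // G
        pixels[3 * i + 1] = (uint8_t)(g - r);  // G - R (mod 256)
        pixels[3 * i + 2] = (uint8_t)(g - b);  // G - B (mod 256)
    }
}

// Inverse transform: recover (R, G, B) from (G, G-R, G-B), in place.
void gGrGbToRgb(uint8_t* pixels, size_t pixelCount) {
    for (size_t i = 0; i < pixelCount; ++i) {
        uint8_t g  = pixels[3 * i];
        uint8_t gr = pixels[3 * i + 1], gb = pixels[3 * i + 2];
        pixels[3 * i]     = (uint8_t)(g - gr); // R = G - (G - R)
        pixels[3 * i + 1] = g;                 // G
        pixels[3 * i + 2] = (uint8_t)(g - gb); // B = G - (G - B)
    }
}
```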

M-Gonzalo commented 3 years ago

Didn't know that. Thanks for the info! Gotta love recompression-friendly formats

Another thing that might help, then, is to apply those reversible PNG transforms to other uncompressed bitmap formats. After all, they are designed to improve compression with an LZ77 algorithm such as deflate. I mean creating a PNG sans zlib instead of whatever paq8pxd or mfilter do. Might be even easier using something like libpng... Am I making sense here?

M-Gonzalo commented 3 years ago

I made a few more tests on real-world images, extracted from PDF files. Some considerations:

| Size | Ratio | Method |
|---|---|---|
| 56526836 | 6,63 % | *.flif |
| 134735072 | 15,81 % | PAMs.paq8pxd89.srep.fxz |
| 852046537 | 100,00 % | Original |

Old JPEGs are encoded bit-lossless ("mathematically-lossless") by using brunsli, but I couldn't find out if the same is true for other image formats, or if "lossless" here means pixel-identical.

schnaader commented 3 years ago

> Another thing that might help, then, is to apply those reversible PNG transforms to other uncompressed bitmap formats. After all, they are designed to improve compression with an LZ77 algorithm such as deflate. I mean creating a PNG sans zlib instead of whatever paq8pxd or mfilter do. Might be even easier using something like libpng... Am I making sense here?

I'll definitely have a look at existing PNG libraries to avoid reinventing the wheel, yes. Applying the PNG filtering sometimes helps compression and sometimes hurts, depending on the compression algorithm and the image data, so one possible way would be to simply compare the compression results for the unfiltered image data and for the original PNG-filtered image data and keep the one with the smaller result.
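
As a hedged sketch of that idea (none of these names are existing Precomp APIs), the selection could be as simple as compressing both candidates and storing a one-bit flag with the smaller result; the flag is all the reconstruction side needs in order to know whether to re-apply the PNG filtering before rebuilding the stream:

```cpp
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical selection logic: compress both the PNG-filtered and the
// unfiltered image data with whatever backend is in use, keep whichever
// result is smaller, and remember the choice in a flag.
using Compressor = std::function<std::vector<uint8_t>(const std::vector<uint8_t>&)>;

struct PackedImage {
    bool keptPngFiltering;          // true = payload holds the compressed *filtered* data
    std::vector<uint8_t> payload;   // compressed image data
};

PackedImage packSmaller(const std::vector<uint8_t>& filtered,
                        const std::vector<uint8_t>& unfiltered,
                        const Compressor& compress) {
    std::vector<uint8_t> a = compress(filtered);
    std::vector<uint8_t> b = compress(unfiltered);
    if (a.size() <= b.size()) return { true, std::move(a) };
    return { false, std::move(b) };
}
```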

> • FLIF is not bit-lossless. Out of a total of 3283 images I processed, 100% of them were different after extraction.
> • It may be possible to recreate the originals using some kind of diff. Using a good compressor as a poor man's bsdiff shows that there is a lot of shared data between original and extracted files, so maybe a content-aware diff could theoretically help produce a reversible compression with a very small overhead.

This is most likely caused by the PAM header at the start of the file; in my tests, there were two possible headers ("P6" and "P7"), but the image data was the same. All pixmap (PGM/PPM/PAM/...) headers will have to be parsed and at least partially stored anyway, because they can also contain comments, for example.
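
To make the "parse and store the header" part concrete, here is a rough sketch (again, not Precomp code) that splits a binary PPM into its verbatim header and its raw pixel data; the same idea extends to PGM/PAM, whose headers have more fields:

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// The header is kept verbatim - including '#' comment lines and the exact
// whitespace - because that is what makes bit-exact reconstruction possible;
// only the pixel data would be handed to an image codec.
struct PnmParts {
    std::string header;             // "P6", width, height, maxval, comments, whitespace
    std::vector<uint8_t> pixels;    // raw interleaved RGB samples
};

bool splitPpm(const std::vector<uint8_t>& file, PnmParts& out) {
    if (file.size() < 2 || file[0] != 'P' || file[1] != '6') return false;
    size_t pos = 2;
    int fieldsSeen = 0;                                    // width, height, maxval
    while (pos < file.size() && fieldsSeen < 3) {
        if (file[pos] == '#') {                            // comment: skip to end of line
            while (pos < file.size() && file[pos] != '\n') ++pos;
        } else if (std::isdigit(file[pos])) {              // numeric field
            while (pos < file.size() && std::isdigit(file[pos])) ++pos;
            ++fieldsSeen;
        }
        ++pos;                                             // whitespace separator
    }
    if (fieldsSeen < 3 || pos > file.size()) return false;
    out.header.assign(file.begin(), file.begin() + pos);
    out.pixels.assign(file.begin() + pos, file.end());
    return true;
}
```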

Regarding JPEG XL, I made some tests with it and it seems like a good option, both for very fast (de)compression and for best ratio on photographic images. It also makes good use of multithreading. On the other hand, it consumes a lot of memory even on faster levels (2 to 8 GB for the bigger images in the list below). Precomp will most likely integrate multiple image compression formats anyway (at least FLIF, webp and JPEG XL) because each one performs differently on different image types - and for some image types, like screenshots with text or images with a lot of repetition, even pure Precomp (LZMA2) will be better than most of them:

Codecs compared (values are given in this order where present): Precomp, FLIF -e, FLIF -e -N, cwebp -z 8, cwebp -z 9 -mt, cjpegxl -d 0 -s 7, cjpegxl -d 0 -s 9

**big_building** (photo)

- size: 78,406,988 / 50,061,146 / 50,513,586 / 53,920,370 / 51,774,478 / 48,001,876 / Out of memory
- encode time: 14 s / 2 min 39 s / 2 min 20 s / 44 s / 1 min 42 s / 6 min 48 s
- encode mt factor: 1.75 / 3.95
- decode time: 17 s / 17 s / 800 ms / 900 ms / 1500 ms
- decode mt factor: 8.79

**artificial** (rendering)

- size: 1,140,810 / 1,089,892 / 941,769 / 1,017,688 / 1,014,952 / 927,746 / 808,609
- encode time: 3 s / 12 s / 9 s / 4 s / 9 s / 35 s / 10 min 52 s
- encode mt factor: 1.96 / 3.82 / 5.60
- decode time: 1600 ms / 1600 ms / 70 ms / 70 ms / 200 ms / 200 ms
- decode mt factor: 6.65 / 6.18

**flower_foveon** (photo)

- size: 2,025,125 / 2,072,475 / 2,449,486 / 2,440,190 / 2,175,285 / 1,993,550
- encode time: 7 s / 6 s / 3 s / 7 s / 31 s / 9 min 4 s
- encode mt factor: 1.93 / 4.28 / 5.39
- decode time: 90 ms / 200 ms
- decode mt factor: 6.70

**Ubuntu Software Center** (screenshot)

- size: 171,230 / 222,750 / 193,334 / 208,868 / 136,528 / 224,605 / 193,642
- encode time: 700 ms / 3 s / 2 s / 1 s / 3 s / 10 s / 3 min 33 s
- encode mt factor: 1.95 / 5.19 / 6.96
- decode time: 300 ms / 400 ms / 30 ms / 20 ms / 100 ms / 100 ms
- decode mt factor: 4.43 / 5.23

Note: "mt factor" gives a multiplicator on how much time the process would take when using only one CPU core. For all times without a given mt factor, the process was single-threaded and didn't make use of the other cores.

Images:

- big_building.ppm, 117,158,993 bytes, 7216 x 5412 px
- artificial.ppm, 18,874,385 bytes, 3072 x 2048 px
- flower_foveon.ppm, 10,287,665 bytes, 2268 x 1512 px
- Ubuntu-Software_Center.png, 224,470 bytes, 1202 x 829 px

Sources: http://imagecompression.info/, https://commons.wikimedia.org/wiki/File:Ubuntu-Software_Center.png

Test machine: AMD Ryzen 5 4600H (6 x 3.00 GHz), 16 GB RAM, Ubuntu 20.04

M-Gonzalo commented 3 years ago

That's very interesting... especially the speeds. In my opinion, precomp shouldn't include codecs that are much slower than pure lzma2, even if the gains are good - preprocessing is a very slow task already. Maybe something like cwebp -z 8 would be a good sweet spot, or perhaps wait until the JPEG XL reference encoder is complete and optimized.

About the PAM images, I believe the difference is more than just the headers. It's difficult to measure exactly, but if the difference were only in the headers, I'd expect 7z to compress the originals and the recreated images to about the same ratio; instead, they are very different.

| Size | Ratio | Method | cspeed |
|---|---|---|---|
| 915834740 | 100,00 % | Originals (6713 files) | |
| 911266294 | 99,50 % | Reconstructed by flif | |
| 113685340 | 12,41 % | Originals.rep:1g+fxz | 10.9 Mbytes/sec |
| 73729664 | 8,05 % | Reconstructed.rep:1g+fxz | 18.1 Mbytes/sec |

Note the difference in speed, too. EDIT: I found the difference: it seems that mutool extracted all PDF images as a CMYK TUPLTYPE, and that is somehow less compressible than the RGB_ALPHA mode flif uses when decompressing.

It makes me wonder whether the representation flif uses to output the images couldn't be used as a cheap preprocessor in and of itself, provided it's reversible...

Question: are webp and jpeg-xl bit-lossless, or just pixel-identical like flif?

schnaader commented 3 years ago

> That's very interesting... especially the speeds. In my opinion, precomp shouldn't include codecs that are much slower than pure lzma2, even if the gains are good - preprocessing is a very slow task already. Maybe something like cwebp -z 8 would be a good sweet spot, or perhaps wait until the JPEG XL reference encoder is complete and optimized.

Yes, as a default I'd also tend to choose one of the faster codecs (webp and jpegxl), even on lower (faster) levels - for most images, they will give better ratios than lzma2. But there will also be a mode for maximum compression that tries different codecs with slower settings. Also note that reconstructing is very fast for webp and jpegxl even if the compression took ages.

> EDIT: I found the difference: it seems that mutool extracted all PDF images as a CMYK TUPLTYPE, and that is somehow less compressible than the RGB_ALPHA mode flif uses when decompressing.

Ah, good work, didn't know about that variant.

> Question: are webp and jpeg-xl bit-lossless, or just pixel-identical like flif?

All of them are just pixel-identical; their lossless modes translate to "restore the image data losslessly". The image header/metadata after a roundtrip can be identical when there's not much ambiguity in the format (e.g. PGM doesn't offer much variety, and uncompressed BMP or TIFF also has a high chance of coming out identical), but for input/output formats like PNG it's very unlikely that the files would be identical after a roundtrip.

M-Gonzalo commented 3 years ago

Cool! Personally, I wouldn't mind a visually lossless mode in precomp. Granted, it shouldn't be the default because it can break lots of things, but for stuff like personal documents it's great.

Even PDFs can be treated like this, for example by re-saving them uncompressed instead of passing them through preflate. This has the advantage of undoing all the other compressions too, for example RLE or LZW.

The same applies to zip, 7z, xz, etc. As long as nobody expects them to have a very specific size and hash, I don't mind just reconstructing them without caring about bit-losslessness.

This wouldn't only be a giant speed-up, it would also allow including a lot of previously incompatible stream types - basically anything that can be reconstructed, including but not limited to zstd, lz4, lzx, lzma, lzma2, ppmd, lzw, etc. And of course everything on the lower recursion levels.

Look at this little experiment to see just how much a single extra algorithm (SZDD in this case) can improve overall compression when it allows the final compressor to see through the lower levels:

| Size | Ratio | To baseline | Method |
|---|---|---|---|
| 158.633.023 | 9,99 % | 50,62 % | Paq8pxd + razor |
| 313.357.824 | 19,73 % | 100,00 % | Razor |
| 313.390.310 | 19,73 % | 100,01 % | Precomp + razor |
| 1.588.301.514 | 100,00 % | | Original |

Paq8pxd = v89 -s0; Precomp = -e -cn -intense -brute; Original = "Drivers" folder, an old fix-it-all for previous versions of Windows.