pnggroup / libpng

LIBPNG: Portable Network Graphics support, official libpng repository
http://libpng.sf.net

Questions about choosing zlib over other compression algorithms and other issues #555

Open IlluminatiWave opened 6 months ago

IlluminatiWave commented 6 months ago

I know the title is a bit dense, but I wanted to ask: what is the reason other compression algorithms (like LZMA2) have not been considered yet?

A few days ago I "discovered" that this algorithm compresses better than zlib. I found this by recompressing the RGB matrix of an image (converting the pixels of a PNG into their byte representation).

Thanks to that I got surprising results (my actual goal was something else). However, I'm currently struggling to read the chunks: the documentation on the page is very technical. That isn't really a problem in itself, but it would be nice if real examples were added, specifically in the chunk section, since even after reading the documentation I can't generate a PLTE chunk correctly.
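For reference, every PNG chunk has the same layout: a 4-byte big-endian data length, the 4-byte chunk type, the data, and a CRC-32 computed over the type and data fields (not the length). A minimal Python sketch; the `make_chunk` helper and the three-colour palette are illustrative, not from libpng:

```python
import struct
import zlib

def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    # The CRC-32 covers the chunk type and data, but not the length field.
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)

# PLTE data is simply 1 to 256 three-byte RGB entries, back to back.
palette = bytes([255, 0, 0,   0, 255, 0,   0, 0, 255])  # red, green, blue
plte_chunk = make_chunk(b"PLTE", palette)
```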

At first my goal was to create a variational image format, i.e. one image with chunks describing specific changes. For example, take two images of the same beach, one with an umbrella and one without: my idea was to save just a chunk for the umbrella, so that a special viewer would present the file as one image containing multiple images (multipage-TIFF style, but combining APNG's optimization paradigm and its chunk-saving system). The problem was that at first this only optimized a little, so I came up with a simpler idea: just save the RGB byte array.

So I did a quick test: I converted the byte array into a sort of raw file, with no metadata or odd headers. Just an RGBA array.

The compression was interesting. Using 7z with LZMA2 gave results that surprised me: I converted many images into their raw representations (which grew from 200 MB to 600 MB in total) and compressed them into a single 7z file of just under 40 MB. That is without using any chunk system (i.e. I took the images at their actual size).
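A minimal sketch of that kind of test, assuming Pillow is available ("beach.png" is a placeholder filename):

```python
import lzma
import zlib

from PIL import Image

# Dump the raw RGBA bytes of a PNG and compare deflate (zlib) with LZMA.
img = Image.open("beach.png").convert("RGBA")
raw = img.tobytes()  # width * height * 4 bytes, no headers or metadata

print("raw: ", len(raw))
print("zlib:", len(zlib.compress(raw, 9)))
print("lzma:", len(lzma.compress(raw, preset=9)))
```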

I also discovered that there is a format that does more or less what I was looking for: MNG. The problem is that it has sat in oblivion for 22 years, its documentation is scarce, there are not many examples on the internet, and the ones that exist focus on animation (APNG). So I discarded it (MNG could be the alternative to multipage TIFF, but I would have to give up animation support and delegate it to APNG).

I am currently optimizing the container to work with PNG metadata, but as I said at the beginning, the documentation could be better, with real examples/chunks, since I am having a hard time understanding the structures of some less-used chunks. It would also be good to have the possibility of using other compression algorithms in PNG itself.

jbowler commented 5 months ago

PNG is designed to accommodate changes to the "compression" algorithm, along with the "filtering" algorithm (which, in reality, is the same thing made more complex). Just Do It!

jbowler commented 1 month ago

@IlluminatiWave how did your experiments go? I'd like you to document them here.

IlluminatiWave commented 1 month ago

I compressed the raw pixels with 7z (creating a binary file from the raw pixels).

The compression was higher with 7z (the raw binary itself was much heavier, since it doesn't have the zlib compression, but 7z still compressed it to a smaller size).

Some time after experimenting, I tried creating a PNG without any compression. Its file size was bigger, but compressing it with 7z gave better compression than oxipng at its maximum level (i.e. zlib).

I also understood that APNG uses LZMA compression (7z): using apngasm with a (non-animated) PNG image compresses slightly better than zlib, but not as much as the 7z program (I guess apngasm does something of its own).

Compression is much higher if you use a set of uncompressed PNG images.
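Roughly, that experiment can be sketched with Pillow and Python's lzma module standing in for the 7z program ("input.png" is a placeholder; compress_level=0 makes Pillow write stored, i.e. uncompressed, deflate blocks):

```python
import lzma

from PIL import Image

# Re-save the PNG with stored deflate blocks, then recompress the whole file.
Image.open("input.png").save("uncompressed.png", compress_level=0)

with open("uncompressed.png", "rb") as f:
    data = f.read()

print("uncompressed png:", len(data))
print("after lzma:      ", len(lzma.compress(data, preset=9)))
```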

jbowler commented 1 month ago

APNG uses deflate, same as PNG. deflate is an implementation of LZ77. 7z isn't a compression algorithm, it's a file format (like PNG) and, like PNG, it can accommodate different formats. LZMA2 just seems to be a particular implementation of LZMA, just like deflate is a particular implementation of LZ77.

The elephant in the shadows here, of course, is LZW; its dictionary is built adaptively on both sides rather than transmitted, so it has much lower overheads for compression, and the classic LZW implementation is not tied to any particular bit size (though the table gets a bit big above 12 bits or so). The weakness of PNG is that it compresses up to four separate uncorrelated streams (RGBA) encoding 1-, 2-, 4-, 8- and 16-bit data items. I don't believe LZ77 or LZMA are tied to a particular bit size any more than LZW is, but the deflate implementation of LZ77 certainly is. Oops.
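To make the low overhead concrete, here is a textbook sketch of classic byte-oriented LZW encoding (not libpng code): the dictionary grows as the input is scanned, and the decoder rebuilds it identically, so nothing has to be transmitted up front:

```python
def lzw_encode(data: bytes) -> list[int]:
    # Start with all single-byte strings; codes 256+ are learned from the input.
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    out, s = [], b""
    for byte in data:
        candidate = s + bytes([byte])
        if candidate in table:
            s = candidate
        else:
            out.append(table[s])
            table[candidate] = next_code  # the decoder makes the same entry
            next_code += 1
            s = bytes([byte])
    if s:
        out.append(table[s])
    return out
```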

That should be food for thought!

IlluminatiWave commented 1 month ago

I had read something about that, and you are right: it actually uses deflate. zlib is the format, while the compression algorithm used is deflate. Regarding 7z, you are also right; I was referring to the 7z PROGRAM with the LZMA2 algorithm.

The apngasm GUI (at least the one I found) for some reason labels LZMA compression as "7z", although in this case we always mean the LZMA or deflate algorithms. I made a small script that extracted the RGBA pixels without compression (without deflate) and then recompressed the resulting binary with the 7z program, producing a 7z file with a raw/bin file inside. Later I decided to create a copy of the PNG without deflate instead, which gives better results (it avoids strange handling of the chunks).

https://apngasm.sourceforge.net/

jbowler commented 1 month ago

PNG (and APNG) modify the input to deflate by the filtering algorithms. Unfortunately the filtering works on bytes, and this has various strange effects when the RGBA components are not exactly 8 bits. Generally I've turned it off for 1-, 2-, 4- and 16-bit components.
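For example, the Sub filter is defined purely on bytes: each byte has the byte `bpp` positions earlier subtracted from it, where bpp is the byte count of a complete pixel rounded up to at least 1, so packed sub-byte pixels and the separate high/low bytes of 16-bit samples are filtered as raw bytes rather than as samples. A quick sketch (not the libpng implementation):

```python
def sub_filter(scanline: bytes, bpp: int) -> bytes:
    # bpp = bytes per complete pixel, rounded up to at least 1.
    out = bytearray(len(scanline))
    for i, byte in enumerate(scanline):
        prev = scanline[i - bpp] if i >= bpp else 0
        out[i] = (byte - prev) & 0xFF  # modulo-256 byte arithmetic
    return bytes(out)
```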

My first approach is the same as yours: dump the uncompressed image data as a byte stream and see how that can be compressed. However, compressing RGB data that way is fundamentally flawed, because the R, G and B channels duplicate (triplicate :-) the luminance information. The standard CIE approach is to use what is effectively:

(R/(R+G+B), G/(R+G+B), G)

This is what CIE xyY is; CIE Yuv is better (u and v are perceptually linear measures of trichromat colour). The result is that more bits can be used to store "G" than the other two ("r" and "g") components. Conventional wisdom is that no more than 6 bits are required for u and v (in the CIE Yuv format) and no more than 9 for Y in a standard image (so 24-bit RGB is inadequate for standard images). 12-bit Y covers the full range of human perception (the colour components do not change); that is more than what the industry is now calling "HDR" images.

The trick to compressing this is to compress the u and v channels independently of the Y channel, and the bottom line is to get the data into the right format before even trying to compress it :-)
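A rough numpy sketch of that split, using CIE xyY (the formula above) rather than Yuv, with the 6/6/9-bit quantisation just mentioned; the sRGB-to-XYZ matrix is the standard D65 one, and gamma handling is omitted for brevity:

```python
import lzma

import numpy as np

def rgb_to_xyY(rgb):
    # Standard linear sRGB -> CIE XYZ (D65) matrix.
    m = np.array([[0.4124, 0.3576, 0.1805],
                  [0.2126, 0.7152, 0.0722],
                  [0.0193, 0.1192, 0.9505]])
    xyz = rgb.reshape(-1, 3) @ m.T
    s = xyz.sum(axis=1)
    s[s == 0] = 1.0  # avoid dividing by zero on black pixels
    return xyz[:, 0] / s, xyz[:, 1] / s, xyz[:, 1]  # (x, y, Y)

rgb = np.random.rand(64, 64, 3)  # placeholder for linear-light image data
x, y, Y = rgb_to_xyY(rgb)

# Quantise to ~6 bits per chromaticity channel and ~9 bits of luminance,
# then compress each plane on its own instead of interleaved RGB.
planes = [np.round(x * 63), np.round(y * 63), np.round(Y * 511)]
print([len(lzma.compress(p.astype(np.uint16).tobytes())) for p in planes])
```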