Add modified zeng palette sorting method

andrews05 commented 8 months ago

This PR adds the modified zeng ("mzeng") palette sorting method, in addition to the existing luma and battiato methods. Speed is very similar to the battiato method with slightly better results on average.

Resulting sizes from two different image sets (all indexed or able to be indexed):

Image set	master	PR
Set 1	29,647,156	29,555,697
Set 2	23,732,133	23,570,862

Additionally, I've added a new "first colour" heuristic for both the mzeng and battiato methods: We use the most popular colour overall, but only if it covers at least 15% of the image. This provided 13k savings on Set 2 vs the edge colour heuristic (which is still used in the luma sort).
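
A minimal sketch of the "first colour" rule described above, written in Python for illustration rather than taken from the PR. The function name `pick_first_colour` and the flat list of palette indices are assumptions; what happens when no colour reaches the threshold is left to the caller here.

```python
from collections import Counter

def pick_first_colour(pixels, threshold=0.15):
    """Return the palette index to place first, or None if no colour is dominant.

    pixels: flat iterable of palette indices covering the whole image.
    threshold: minimum share of the image (0.15 mirrors the 15% in the description).
    """
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        return None
    index, count = counts.most_common(1)[0]
    # Only trust the most popular colour when it covers enough of the image.
    return index if count / total >= threshold else None
```

For example, an image that is 80% colour 1 yields `1`, while an image where every colour covers 10% yields `None`.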

andrews05 commented 8 months ago

I actually will tidy this up a bit (it's based on C code from libwebp), so will convert to draft while I do that...

ace-dent commented 8 months ago

@andrews05 - nice! I'm working on an image set, where I have been brute-forcing the palette order. I'd been keen to test this method too!

One thing I've noticed is that occasionally the existing palette is already optimal. Ideally this should be evaluated too (i.e. a 'no-sort' if the starting image is indexed). This will also help if oxipng is used in conjunction with other optimizers. Or perhaps preserving the palette should be behind one of the no-reduction switches?

andrews05 commented 8 months ago

Brute-forcing, like all possible permutations? Wow, that's dedication 😅. How are you doing that?

Oxipng does evaluate the existing palette too. You can disable all palette sorting if you need to with --np.
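
As a rough illustration of comparing several orderings, with the existing order as one of the candidates, here is a sketch in Python. It is not oxipng's evaluator: it compresses the raw indices with `zlib` as a stand-in and ignores scanline filters, and the names `remap` and `best_order` are made up for this example.

```python
import zlib

def remap(pixels, order):
    """Re-index pixels so that old index order[i] becomes new index i."""
    new_index = {old: new for new, old in enumerate(order)}
    return bytes(new_index[p] for p in pixels)

def best_order(pixels, candidates):
    """Return (label, size) for the candidate order that compresses smallest.

    candidates: dict mapping a label to a permutation of the palette indices.
    """
    results = {}
    for name, order in candidates.items():
        results[name] = len(zlib.compress(remap(pixels, order), 9))
    # min() keeps the first smallest entry, so listing "original" first
    # means a tie leaves the existing palette untouched.
    name = min(results, key=results.get)
    return name, results[name]
```

Calling `best_order(pixels, {"original": [0, 1, 2, 3], "reversed": [3, 2, 1, 0]})` returns whichever label gave the smaller compressed size.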

Winterhuman commented 7 months ago

I have also been brute-forcing a bunch of OxiPNG permutations (including two --np variants) to find the best results, and can agree, it does take a lot of dedication: https://gist.github.com/Winterhuman/21d7b148db40ff041f397b07a7aafb83

andrews05 commented 7 months ago

@Winterhuman nice!

I still think it's odd that you're using pngquant in a lossless mode though, which is not what it was built for. You'd be better off swapping it for one or more other lossless optimisers, such as: PNGOUT, AdvPNG, PNGCrush, ECT, OptiPNG, PngOptimizer, pngwolf etc. Piping oxipng output into ECT, for instance, could be quite effective as it has a more advanced deflate compressor.

Existing png multitools include ImageOptim, FileOptimizer, and @ace-dent's pngslim.

andrews05 commented 7 months ago

Right, I've updated with much cleaner, less C-looking code. I've also made a change to the first colour to improve savings even more (see updated description at top).

ace-dent commented 7 months ago

@andrews05 - quick reply while our time zones are in sync (!) ~

My working theory was that the most popular colour made sense at index 0, because in most images the most popular colour is the edge colour. This then aligns with the start of each row when the row filter is 0. Could you share some of your test images where this theory fails? Do you have some insights into why most popular colour succeeds, and why luma is different?
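
To make the row-start idea concrete, here is a small Python sketch. It assumes 8-bit indexed rows, where each scanline is stored as a filter-type byte followed by one byte per pixel; the helper names are hypothetical. With filter 0 (None) and the edge colour at index 0, the filter byte and the neighbouring edge pixels form runs of zeros across row boundaries.

```python
def unfiltered_stream(rows):
    """Serialise rows of 8-bit palette indices with filter type 0 (None)."""
    out = bytearray()
    for row in rows:
        out.append(0)    # filter-type byte for "None"
        out.extend(row)  # indices are stored as-is
    return bytes(out)

def move_to_front(rows, colour):
    """Swap palette indices so `colour` becomes index 0 (and 0 takes its place)."""
    swap = {colour: 0, 0: colour}
    return [[swap.get(p, p) for p in row] for row in rows]

rows = [[5, 5, 1, 5], [5, 2, 2, 5]]
before = unfiltered_stream(rows)                   # 0 5 5 1 5 0 5 2 2 5
after = unfiltered_stream(move_to_front(rows, 5))  # 0 0 0 1 0 0 0 2 2 0
```

In the second stream the end of one row, the next filter byte and the start of the next row are all zero, which is the alignment the theory relies on.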

Edit: Ahh... I think I have some idea. But would love to hear your theories / insight. :-)

Edit2: The 20% (or some other threshold), to then put your faith completely in battiato / mzeng makes sense. Did you try the 20% cut-off criteria with edge colour and those algos?

andrews05 commented 7 months ago

Yeah you're right, the most popular colour and most popular edge colour are usually the same, and even when they aren't the choice between the two often doesn't make much difference.
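
For anyone wanting to check this on their own images, a quick Python sketch of the two candidates. It assumes "edge colour" means the most frequent index among the first and last pixel of each row, which is an interpretation rather than something spelled out in this thread; the function names are hypothetical.

```python
from collections import Counter

def most_popular(rows):
    """Most frequent palette index over the whole image."""
    counts = Counter(p for row in rows for p in row)
    return counts.most_common(1)[0][0]

def most_popular_edge(rows):
    """Most frequent palette index among the first and last pixel of each row."""
    counts = Counter()
    for row in rows:
        if row:
            counts[row[0]] += 1
            counts[row[-1]] += 1
    return counts.most_common(1)[0][0]
```

On `[[7, 7, 7], [7, 3, 7]]` both return 7; on `[[0, 1, 1, 0], [0, 1, 1, 0], [0, 1, 1, 1]]` the overall winner is 1 but the edge winner is 0.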

The difference is likely more around situations where the "most popular" isn't actually that popular and putting it first may actually be detrimental. This could happen e.g. in high-detail or photographic material where there are many colours but none that are particularly prominent. These would also likely be image types where the luma sort would not be effective, which is why it may be better to retain the edge colour there. The increased diversity from doing different things in different sort methods may also be beneficial.

I haven't tried the 20% threshold with the edge colour yet. I don't anticipate that would be effective, but can certainly try. Of course, I've spent much time already tweaking and experimenting with various things and I have to stop at some point 😄

I don't have the image set with me right now but I can post some samples later.

ace-dent commented 7 months ago

@andrews05 I've spent a few hours thinking about this... My suspicion is the 20% threshold is really a proxy for whether row filter 0 is effective (and Luma sort is as good as any)... or where there is no single dominant 'background' colour, more complex filters (3 / 4) are effective... in which case we shouldn't be tinkering with the algorithms' palette order.... perhaps?

Corpus is going to have a big effect here. Small pixel art / icons nearly always end up with filter 0. Larger screenshots often favour more complex filters (mix of all).

> I haven't tried the 20% threshold with the edge colour yet.

I do think this is worth testing 🙏🏻 ... BTW, how did you arrive at 20%? Thanks for all your hard work and effort here. It's really exciting and appreciated. 😄

andrews05 commented 7 months ago

I've made some more tweaks: Lowered the threshold to 15%, and removed the final rotation of the array in the mzeng method. Saved around 2k total across both sets.

@ace-dent I tried the 20% with the edge colour - it made negligible difference (regressed a tiny amount). The 20% was just the result of a handful of trials in 10% increments. Nothing super scientific. Here's the image that saw the biggest improvement: BeigePyraBack

ace-dent commented 7 months ago

@andrews05 - Thanks! Will look over the image with a coffee :-) [Update: Wow! That's a lotta pixels to optimize!]

> Lowered the threshold to 15% ...

I'm still thinking about this heuristic... I assume we cannot know the row filters, at the stage we are doing the palette sort? (I appreciate they may strongly interrelate, so sanely we might pick which to optimise for first).

ace-dent commented 7 months ago

Feature request: is it possible with -v -v to log some information about the palette sorting? Ideally in the 'Eval:' line or a summary line? e.g. Eval: 8-bit Indexed (256 colors - M-Zeng ordered)... or ...(256 colors - original order)..., etc.

andrews05 commented 7 months ago

@ace-dent Correct, we don't know the filters. The filter most commonly ends up being either None or one of the heuristic ones. Rarely is it exclusively one of the other delta filters, so trying to optimise for anything other than 0 is probably not helpful.
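
For context on what a per-row heuristic can look like, here is a Python sketch of one well-known approach: score each candidate filter by the sum of its output bytes read as signed magnitudes and keep the lowest. Reading "heuristic" this way is an assumption, only None and Sub are compared, and the function names are made up for illustration.

```python
def sub_filter(row):
    """PNG Sub filter at 1 byte per pixel: each byte minus its left neighbour, mod 256."""
    return bytes((row[i] - (row[i - 1] if i else 0)) % 256 for i in range(len(row)))

def min_sum_score(data):
    """Sum of the bytes interpreted as signed magnitudes (0..128)."""
    return sum(b if b < 128 else 256 - b for b in data)

def choose_filter(row):
    """Return 0 (None) or 1 (Sub), whichever scores lower; ties keep None."""
    none_score = min_sum_score(bytes(row))
    sub_score = min_sum_score(sub_filter(row))
    return 1 if sub_score < none_score else 0
```

A smooth ramp of indices such as `[10, 11, 12, 13]` picks Sub, while a row like `[0, 0, 200, 0]` stays on None, which shows how the palette order feeds into whichever filter gets chosen later.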

And yeah, it would be nice to have more info on those eval lines which are currently identical. I'll see if I can do something about that after this PR.

ace-dent commented 6 months ago

Hi @andrews05 - sorry to necropost... I have done some extensive trials with oxipng's new palette sorting. In most cases it performs very well, compared to a more brute force approach (testing against ~20 sorting variations per image). Typically it performs equally to my brute method, occasionally performs noticeably better, occasionally worse.

I have attached a small corpus of indexed images (GPL licensed), for future algo tuning. I can provide examples of winning and losing paletted images too, if it helps?

Many thanks. KolibriOS-indexed.zip

andrews05 commented 6 months ago

@ace-dent Nice. Yeah, some examples of losing ones could be interesting, along with what algorithm it lost to and by how much.

ace-dent commented 6 months ago

@andrews05 - not sure when I will be able to do forensics on the images, so have attached them as is. Will try to review in future.

OXIPNG-pal-fail.zip

shssoichiro / oxipng

Add modified zeng palette sorting method #602