google / guetzli

Perceptual JPEG encoder
Apache License 2.0
12.9k stars 976 forks

consider filesize when selecting "best" output image #207

Open DeeDeeG opened 7 years ago

DeeDeeG commented 7 years ago

I have been using Guetzli, and I like how it avoids many of the artifacts common to other encoders. It would be nice to know that I'm getting the best file size for a given visual quality, though, since Guetzli isn't as aggressive on file size as other encoders. This would be a step in that direction.

As far as I can understand, after reading the research paper, Guetzli currently:

Specifically it does this:

However I think it should instead do:

My reason for suggesting this: right now Guetzli might be generating outputs with aggressive compression, low file size, and the highest psycho-visual quality, and then skipping them in favor of outputs that sit closer to the arbitrary Butteraugli scores defined in the index, even when those outputs are lower quality. (Guetzli currently sets both a lower and an upper bound on output visual quality, irrespective of file size.) Wouldn't it be better to take a minimum quality from the index and then optimize for file size from there, rather than constraining the output so tightly to the exact, completely arbitrary index value?
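To make the proposed rule concrete, here is a minimal sketch in Python. It is not Guetzli's code (Guetzli is written in C++), and `Candidate`, `pick_smallest_within_target`, and the idea that each intermediate encode can be reduced to a (distance, size) pair are all assumptions made for illustration. The sketch treats the index value only as an upper bound on Butteraugli distance and then picks the smallest file that satisfies it.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    """One intermediate encode produced during a single run (hypothetical type)."""
    label: str
    distance: float   # Butteraugli distance to the input image; lower is better
    size_bytes: int   # encoded file size


def pick_smallest_within_target(
    candidates: Iterable[Candidate], max_distance: float
) -> Optional[Candidate]:
    """Return the smallest candidate whose distance does not exceed max_distance.

    max_distance plays the role of the value looked up in the index, used
    only as a minimum-quality requirement. Ties on size are broken in favor
    of the lower (better) distance. Returns None if nothing qualifies.
    """
    acceptable = [c for c in candidates if c.distance <= max_distance]
    if not acceptable:
        return None
    return min(acceptable, key=lambda c: (c.size_bytes, c.distance))
```

With this rule, a candidate that is both smaller and better than the target is never skipped just because it is further from the index value.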

I see output filesizes during the various steps, when using --verbose, so I think grabbing the output filesizes shouldn't be hard. Am I misunderstanding? Is this a good idea? Any thoughts appreciated.

(Glossary: when I say "visual distance" or "distance," I mean the visual difference between a given output image and the input image, as determined by Butteraugli. When I say "index," I mean the "Q value to Butteraugli score index" from here: link. I am using these words the way they are used in the paper.)

DeeDeeG commented 7 years ago

Example of the problem I want to fix:

Assuming:

Currently, Guetzli will choose output A, even though it has a larger file size and worse visual quality than output B.

I suggest that in this scenario, Guetzli should determine that output B is better in both ways, and select output B.

(note: I made up a quality scale where higher quality is better, but in Butteraugli, a lower score is better. I just made something up for the sake of example.)
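The A-versus-B scenario amounts to a dominance check: one output is at least as good as another on both axes and strictly better on at least one. A minimal sketch, using Butteraugli's convention (lower distance is better) rather than the made-up scale; the `Output` type, the `dominates` helper, and any numbers fed to it are hypothetical and not taken from Guetzli:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Output:
    """A candidate output image (hypothetical type for illustration)."""
    distance: float   # Butteraugli distance to the input; lower is better
    size_bytes: int   # encoded file size; lower is better


def dominates(b: Output, a: Output) -> bool:
    """True if b is no worse than a on both axes and strictly better on one."""
    no_worse = b.size_bytes <= a.size_bytes and b.distance <= a.distance
    strictly_better = b.size_bytes < a.size_bytes or b.distance < a.distance
    return no_worse and strictly_better
```

If `dominates(b, a)` holds, there is no trade-off to weigh: `b` is smaller or equal in size and at least as close to the input, so selecting `a` cannot be justified on either criterion.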

szabadka commented 7 years ago

Could you provide concrete examples, in the form of an original input, the guetzli output, and something that has better quality (according to butteraugli) and a smaller file size?

DeeDeeG commented 7 years ago

Hi @szabadka. I am not looking for an alternative encoder that is better than Guetzli. I just want to improve Guetzli, if possible. I hope Guetzli will use a more file-size-oriented approach than it does now.

The "outputs" I am talking about are all made by Guetzli during a single run. (To see info about the multiple "outputs" Guetzli produces during a single run, try running Guetzli with --verbose. I think there are something like 40 encodes or tweaks made to the input image, in a couple of stages.) My suggestion has to do with how Guetzli picks the best one of these; in other words, how it picks which version should be the final output image.

(Guetzli just saves the "best" one to disk and gets rid of the rest. That means I don't actually have different outputs saved to my hard disk that I could compare and contrast.)

In any case, the "concrete" example would have to be made by tweaking the code, and comparing the outputs of GitHub/master Guetzli vs the output of tweaked Guetzli.