The entropy strategy implemented with integer gives worse compression than float version

lvandeve / lodepng

PNG encoder and decoder in C and C++.

zlib License

2.04k stars 420 forks source link

The entropy strategy implemented with integer gives worse compression than float version #116

Closed JayXon closed 4 years ago

JayXon commented 4 years ago

One of the regressions after updating lodepng in zopfli is that it choses entrop strategy less often than before. 20-c6-shouldbe-c4-rgbdata For example this file, zopflipng --lossy_transparent now chooses zero strategy and optimize it to 5811 bytes while previously it chooses entropy and optimize it to 5642 bytes. Even if I force entropy strategy with --filters=e, 5719 bytes is still not as good as before.

I nailed it down to this commit https://github.com/lvandeve/lodepng/commit/3d639635bb1b00ef21d11ce478373991c3eca1d5 and verified that after reverting it can now optimize to 5642 bytes like before.

What's the rationale behind removing usage of float? Is it for speed? I did not notice any speed difference in my testing.

lvandeve commented 4 years ago

I could consider implementing this strategy with floating point in zopfli itself. The rationale for removing float is to keep the base library entirely integer, to be compatible with very small platforms

However, floating point will likely appear again as an option in the future for color profiles (gamma correction, ...)

zvezdochiot commented 4 years ago

https://github.com/lvandeve/lodepng/blob/c84cff3a5411bd15c70a18c900852b20f32077cc/lodepng.cpp#L5461-L5475

Maybe?:

 size_t sumt = 0;
...
      bestSum = 0;
      bestType = 0;
      for(type = 0; type < 5; ++type) {
        filterScanline(attempt[type], &in[y * linebytes], prevline, linebytes, bytewidth, type);
        for(x = 0; x < 256; ++x) count[x] = 0;
        for(x = 0; x < linebytes; ++x) ++count[attempt[type][x]];
        ++count[type]; /*the filter type itself is part of the scanline*/ 
        sumt = 0;
        for(x = 0; x < 256; ++x) {
          sumt += (count[x] == 0) ? 0 : ilog2(count[x] * count[x]) * count[x] / 2;
        }
        if(sumt > bestSum) {
          bestType = type;
          bestSum = sumt;
        }
      }

lvandeve commented 4 years ago

Another formula to try for an improved integer approximation:

ilog2(x) * x + x - (1 << ilog2(x))

zvezdochiot commented 4 years ago

@lvandeve say:

ilog2(x) * x + x - (1 << ilog2(x))

Diff:

f0 = flog2(x)*x
f1 = ilog2(x)*x 
f2 = ilog2(x*x)*x/2
f3 = ilog2(x)*x+x-(1<<ilog2(x))
f4 = ilog2(x)*x+(x-(1<<ilog2(x)))<<1
----
x   1   2   3   4   5    6    7    8    9    ... 15   16   ... 255    ... 1023
f1  0   2   3   8   10   12   14   24   27   ... 45   64   ... 1785   ... 9207
f2  0   2   4   8   10   15   17   24   27   ... 52   64   ... 1912   ... 9718
f3  0   2   4   8   11   14   17   24   28   ... 52   64   ... 1912   ... 9718
f4  0   2   5   8   12   16   20   24   29   ... 59   64   ... 2039   ... 10229
f0  0.0 2.0 4.8 8.0 11.6 15.5 19.7 24.0 28.5 ... 58.6 64.0 ... 2038.6 ... 10228.6

Good, but f4 - best:

 size_t cnt = 0, lncnt = 0, sumt = 0;
...
        for(x = 0; x < 256; ++x) {
          cnt = count[x];
          lncnt = (cnt == 0) ? 0 : ilog2(cnt);
          sumt += (cnt == 0) ? 0 : (lncnt * cnt  + (cnt - (1 << lncnt)) << 1);
        }

zvezdochiot commented 4 years ago

But should not there be a harmonic mean, and not an arithmetic mean?

size_t lnbt = 0, cnt = 0, lncnt = 0, sumt = 0;
...
        lnbt = linebytes + 1;
        for(x = 0; x < 256; ++x) {
          cnt = count[x];
          lncnt = (cnt == 0) ? 0 : ilog2(cnt);
          sumt += (cnt == 0) ? 0 : (lnbt << 8) / (lncnt * cnt  + (cnt - (1 << lncnt)) << 1);
        }
        sumt = (sumt == 0) ? 0 : (lnbt << 16) / sumt;

! linebytes < 65535. Otherwise w=(lnbt<<4), 256w=(lnbt<<12)

lvandeve commented 4 years ago

Yes indeed, f4 is best, I forgot a factor of 2 above.

Implemented in https://github.com/lvandeve/lodepng/commit/727ea3f0eb245a06e2de35bdfc62dc760a34f62d

It works well for zopfli on the attached image file with the --lossy_transparent setting and without giving --filters=e

I think this solves the issue nicely without needing to add it with float in zopfli, leaving open for now for further feedback

Thanks for the help zvezdochiot!

zvezdochiot commented 4 years ago

@lvandeve say:

leaving open for now for further feedback

Not only. It is worthwhile to conduct an additional set of tests on sub-issues. The topic is not exhausted.

JayXon commented 4 years ago

The new algorithm is actually better than the previous float version! Thanks!