phoboslab / qoi

The “Quite OK Image Format” for fast, lossless image compression
MIT License
6.85k stars 327 forks source link

The hash function could be improved #211

Closed MarkJeronimus closed 1 year ago

MarkJeronimus commented 2 years ago

The hash function is pretty bad.

Try it on this image: BA1959A1 Six of the seven colors have the same hash value. The colors are all cyan, and from a 12BPP RGB cube.

For R=0, G=B=16*N (N integer), and A=255, the hash is always 53. The reason for this is that the sum of the hash multipliers for G and B (5 and 7 respectively), multiplied by 16 is 192, which is an integer multiple of the table size (64).

I reduced the table size to 63 (which is relatively prime to 256 but not the prime multipliers), and the compression ratio immediately improved by 6.95%.

I changed multipliers to [5, 11, 13, 17] which are relatively prime to 63 (also taking into account their sums), and the ratio improvement is now 6.36%

🤔

I think it's random noise at this point because just reordering the multipliers changes the ratio by even larger amounts.

Anyway, my point stands, which is that a hash table of 64 is bad.

[edit] curiously this big improvement only happens when using signed rgba values. When using unsigned, the ratios are 3.96% and 4.78%, respectively (new multipliers being better). Note that with a hash table size of 64, signed and unsigned calculations both generated identical hash values. The signed version is trickier however, because in most languages, % (remainder) gives a negative result when the dividend is negative (i.e. truncMod). What's needed here is a true modulo operator (floorMod).

MarkJeronimus commented 2 years ago

Moving to qoi-bikeshed. you're free to delete this.