Open chocolate42 opened 3 months ago
Do the size gains persist with compression applied on top (e.g. lz4/zstd)?
Presumably it's beneficial overall. QOI is beneficial as a preprocessor (qoi.lz4 tends to be smaller than raw.lz4), and this change should extend that benefit to alpha changes. Worth testing if you want; I will eventually, but my goals lie elsewhere (I want to make a qoi-like that is SIMD-friendly and see how fast I can make it, and this change is part of that).
BTW, if your goal is a smaller file size after compression, you might want to disable the index op. It often gets in the way of the compressor, resulting in a larger file.
I got less lazy and did some tests. OP_RGBA remains beneficial with compression, and disabling the index op improves compression when the compression is not very light. Both together are even better.
images-lance, MB
568.16 qoi.qoi
487.41 qoi.1.lz4
404.86 qoi.6.lz4
403.40 qoi.12.lz4
652.34 qoi.noindex.qoi
506.92 qoi.noindex.1.lz4
389.93 qoi.noindex.6.lz4
386.07 qoi.noindex.12.lz4
493.11 qoia.qoi
449.20 qoia.1.lz4
383.04 qoia.6.lz4
381.89 qoia.12.lz4
539.64 qoia.noindex.qoi
452.05 qoia.noindex.1.lz4
362.89 qoia.noindex.6.lz4
359.62 qoia.noindex.12.lz4
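To put numbers on that, the relative savings can be worked out directly from the listing above. This is only arithmetic on the posted sizes: the `sizes` table and the `saving` helper are names made up for this sketch, and the number in each file name is assumed to be the lz4 level (0 stands for no lz4 at all).

```python
# Sizes in MB, copied from the images-lance listing above.
# Inner keys: 0 = plain .qoi output, otherwise the number in the file name
# (assumed to be the lz4 compression level).
sizes = {
    "qoi":          {0: 568.16, 1: 487.41, 6: 404.86, 12: 403.40},
    "qoi.noindex":  {0: 652.34, 1: 506.92, 6: 389.93, 12: 386.07},
    "qoia":         {0: 493.11, 1: 449.20, 6: 383.04, 12: 381.89},
    "qoia.noindex": {0: 539.64, 1: 452.05, 6: 362.89, 12: 359.62},
}


def saving(baseline, candidate, level):
    """Percent by which `candidate` is smaller than `baseline` at `level`.

    A negative result means the candidate is larger than the baseline.
    """
    b = sizes[baseline][level]
    c = sizes[candidate][level]
    return 100.0 * (b - c) / b


if __name__ == "__main__":
    for level in (0, 1, 6, 12):
        print(
            f"level {level:2d}: "
            f"qoia vs qoi {saving('qoi', 'qoia', level):+.1f}%, "
            f"qoi.noindex vs qoi {saving('qoi', 'qoi.noindex', level):+.1f}%, "
            f"qoia.noindex vs qoi {saving('qoi', 'qoia.noindex', level):+.1f}%"
        )
```

With these figures, dropping the index op makes the level 1 output larger for both qoi and qoia but smaller at levels 6 and 12, which is the "not very light" threshold mentioned above.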
Fixed a bug which very rarely caused the encoder to miss the end of the file. It took testing with thousands of images to catch. I'm not going any further with qoia; I just couldn't leave the bug in here.
Taking this RGBA op and using it as part of a qoi-like (roi) does this:
# Grand total for ../qoipond/images
            decode ms   encode ms   decode mpps   encode mpps   size kb    rate
libpng:           3.4        31.9        137.01         14.55       395   24.1%
stbi:             4.1        35.2        114.37         13.17       561   34.2%
qoi:              1.5         1.6        313.21        292.18       463   28.2%
roi:              1.2         1.4        379.94        326.05       455   27.7%
roi.lz4:          1.2         1.6        373.39        284.14       415   25.3%
roi.zstd1:        1.4         2.0        321.42        237.49       358   21.8%
roi.zstd3:        1.5         2.8        314.58        167.68       350   21.4%
roi.zstd9:        1.5         5.3        310.43         87.80       344   21.0%
roi.zstd19:       1.6        56.3        283.37          8.24       335   20.4%
# Grand total for ../qoipond/images-lance
            decode ms   encode ms   decode mpps   encode mpps   size kb    rate
libpng:          24.0       208.6        168.94         19.47      1374    8.8%
stbi:            29.5       251.5        137.84         16.15      2119   13.6%
qoi:             12.3         9.4        331.03        430.51      2109   13.6%
roi:             10.4         8.6        392.23        470.30      1993   12.8%
roi.lz4:         10.7        10.8        377.89        376.54      1629   10.5%
roi.zstd1:       11.5        12.2        352.46        333.12      1217    7.8%
roi.zstd3:       11.9        16.6        342.63        245.28      1161    7.5%
roi.zstd9:       12.0        37.3        339.47        108.79      1095    7.0%
roi.zstd19:      12.3       520.6        329.45          7.80      1032    6.6%
With zstd compression level 1, roi.zstd beats png and qoi on file size for both RGB and RGBA while still being quicker to decode than qoi. This is before trying to speed up roi using SIMD. roi itself is fairly simple, containing the following ops:
LUMA232
LUMA464 (the qoi LUMA op, R and B 4 bits stored as diff from G, G stored in 6 bits)
LUMA777
OP_RGB
OP_RGBA as described in this thread, opcode+alpha+[LUMA232/LUMA464/LUMA777/OP_RGB]
RUN op with values 1..30
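As a rough illustration of how an encoder might choose between the RGB ops in this list, here is a sketch in Python. The field layout is an assumption extrapolated from the LUMA464 description: G diff in the middle field, R and B stored as diffs from G, all fields signed two's complement. The byte sizes (1, 2 and 3 bytes for the luma ops, 4 for OP_RGB) are also assumed, and roi's real bit layout may differ. `wrap8`, `fits` and `pick_rgb_op` are hypothetical names.

```python
def wrap8(d):
    """Wrap a channel difference into the signed 8-bit range [-128, 127]."""
    return ((d + 128) & 0xFF) - 128


def fits(value, bits):
    """True if `value` fits a signed two's-complement field of `bits` bits."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


# (name, bits for R-G, bits for G, bits for B-G, ASSUMED size in bytes)
LUMA_OPS = [
    ("LUMA232", 2, 3, 2, 1),
    ("LUMA464", 4, 6, 4, 2),
    ("LUMA777", 7, 7, 7, 3),
]


def pick_rgb_op(prev, cur):
    """Return (op name, assumed byte size) for encoding cur's RGB after prev.

    `prev` and `cur` are (r, g, b) tuples of 0..255 values. The smallest
    luma op whose fields can hold the diffs wins; otherwise fall back to
    OP_RGB (tag byte plus three channel bytes).
    """
    dr = wrap8(cur[0] - prev[0])
    dg = wrap8(cur[1] - prev[1])
    db = wrap8(cur[2] - prev[2])
    dr_g = wrap8(dr - dg)  # red diff relative to green diff
    db_g = wrap8(db - dg)  # blue diff relative to green diff
    for name, r_bits, g_bits, b_bits, size in LUMA_OPS:
        if fits(dr_g, r_bits) and fits(dg, g_bits) and fits(db_g, b_bits):
            return name, size
    return "OP_RGB", 4


if __name__ == "__main__":
    print(pick_rgb_op((100, 100, 100), (101, 101, 101)))
    print(pick_rgb_op((0, 0, 0), (10, 10, 10)))
    print(pick_rgb_op((0, 0, 0), (128, 0, 0)))
```

Under the same assumptions, an OP_RGBA pixel would cost the opcode and alpha bytes plus whichever of these sizes the RGB part needs.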
roi is more simd-able because
I experimented with a qoi-like (soi) where all the luma ops had the same number of bits for each plane (222, 555, 777), which allows for simpler SIMD. However, soi is less space-efficient than roi, so any further compression has more data to work through. With strong compression, the extra time spent compressing would exceed anything soi could save, however much SIMD accelerates it, and the overall compression would still be worse. So I think roi is the better choice, and I'll attempt to SIMD it in the coming weeks.
You could fork, rename and modify the repo, so others would be more likely to find it. An issue like this is going to be missed. Unless you don't want GitHub to have your code, which is understandable.
Instead of the current fixed 5-byte encoding every time there's an alpha change, do a 2-byte alpha change followed by OP_DIFF/OP_LUMA/OP_RGB to encode the RGB elements. This changes nothing for RGB images and can slightly regress some RGBA images if too many 6-byte encodings are used, but on average it greatly improves RGBA space efficiency. It works best when images have a lot of nuanced alpha gradients.
qoia-v2.zip
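The byte accounting behind the proposal above can be sketched as follows. The OP_DIFF and OP_LUMA ranges and the 5-byte OP_RGBA are from the QOI spec. The 2-byte prefix is taken from the description, but its exact layout (assumed here to be one opcode byte plus the new alpha byte) is a guess, and `rgb_op_size` and `alpha_change_cost` are hypothetical names, not functions from the attached code.

```python
def wrap8(d):
    """Wrap a channel difference into the signed 8-bit range [-128, 127]."""
    return ((d + 128) & 0xFF) - 128


def rgb_op_size(prev, cur):
    """Bytes needed to encode cur's RGB after prev with stock QOI ops."""
    dr = wrap8(cur[0] - prev[0])
    dg = wrap8(cur[1] - prev[1])
    db = wrap8(cur[2] - prev[2])
    if all(-2 <= d <= 1 for d in (dr, dg, db)):
        return 1  # QOI_OP_DIFF: 2 bits per channel
    if (-32 <= dg <= 31
            and -8 <= wrap8(dr - dg) <= 7
            and -8 <= wrap8(db - dg) <= 7):
        return 2  # QOI_OP_LUMA: 6-bit green diff, 4-bit R-G and B-G diffs
    return 4      # QOI_OP_RGB: tag byte plus r, g, b


FIXED_RGBA_SIZE = 5    # QOI_OP_RGBA: tag byte plus r, g, b, a
ALPHA_PREFIX_SIZE = 2  # proposed alpha change; layout assumed: opcode + alpha


def alpha_change_cost(prev, cur):
    """Bytes for a pixel whose alpha changed, under the proposed scheme.

    `prev` and `cur` are (r, g, b, a) tuples of 0..255 values.
    """
    return ALPHA_PREFIX_SIZE + rgb_op_size(prev, cur)


if __name__ == "__main__":
    # Alpha gradient over a constant colour: 3 bytes instead of 5.
    print(alpha_change_cost((200, 100, 50, 10), (200, 100, 50, 12)))
    # Large colour jump together with an alpha change: the 6-byte worst case.
    print(alpha_change_cost((0, 0, 0, 0), (200, 10, 90, 255)))
```

A pixel only regresses (6 bytes instead of 5) when the colour change is too large for OP_DIFF or OP_LUMA, which is why smooth alpha gradients benefit the most.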