Closed (dbyoung720 closed this 1 year ago)
This repo is meant to be a simple to understand implementation. Both of your suggestions, while they might increase speed, obfuscate what's going on.
can't. Just a small optimization
What I do in my QOI variants is allocate two scanlines as part of the final image (so there is still a single malloc), and use them as double-buffered, RGBA8-only scanlines purely for prediction. I don't think it's faster, just more flexible with prediction, and it's a bit tricky to get right.
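A rough sketch of what that single-allocation, double-buffered layout could look like. This is only a guess at the idea, not code from any actual QOI variant: the `image_buf` type, its field names, and both helper functions are hypothetical, and the size arithmetic skips overflow checks for brevity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: image bytes first, then two RGBA8 scratch scanlines,
   all carved out of one malloc'd block. */
typedef struct {
    unsigned char *pixels;   /* width * height * channels bytes (start of block) */
    unsigned char *line[2];  /* width * 4 bytes each, used alternately */
    size_t width, height, channels;
} image_buf;

/* Returns 1 on success, 0 if the allocation failed. */
static int image_buf_alloc(image_buf *b, size_t width, size_t height, size_t channels) {
    size_t image_size = width * height * channels; /* overflow checks omitted */
    size_t line_size = width * 4;
    unsigned char *block;

    assert(channels == 3 || channels == 4);
    block = malloc(image_size + 2 * line_size);
    if (block == NULL) return 0;

    b->pixels = block;
    b->line[0] = block + image_size;
    b->line[1] = b->line[0] + line_size;
    b->width = width;
    b->height = height;
    b->channels = channels;
    /* Zero both scanlines so the row "above" row 0 reads as transparent black. */
    memset(b->line[0], 0, 2 * line_size);
    return 1;
}

/* Expand row y to RGBA8 into the slot picked by row parity, and return the
   other slot, which still holds row y - 1 (or zeroes when y == 0).
   Rows must be loaded in order, top to bottom. */
static const unsigned char *image_buf_load_row(image_buf *b, size_t y) {
    unsigned char *cur = b->line[y & 1];
    const unsigned char *src = b->pixels + y * b->width * b->channels;
    size_t x;

    for (x = 0; x < b->width; x++) {
        cur[4 * x + 0] = src[b->channels * x + 0];
        cur[4 * x + 1] = src[b->channels * x + 1];
        cur[4 * x + 2] = src[b->channels * x + 2];
        cur[4 * x + 3] = (b->channels == 4) ? src[4 * x + 3] : 255;
    }
    return b->line[(y & 1) ^ 1];
}
```

Because `pixels` points at the start of the block, a single `free(b.pixels)` releases the image and both scanlines. The predictor always sees RGBA8 for the current and previous rows, whatever the channel count of the stored image.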
From the README:
The QOI format has been finalized... ... ... pull requests for performance improvements will probably not be accepted
based on the benchmark results, is it even practical to squeeze more performance out of it? perhaps if you constantly ran into the edge-case images which cause a problem.
seems like the performance is fine, though. especially considering it's not doing the row filters or DEFLATE/INFLATE like PNG does.
if anything is a genuine argument for improvement, it's covering 8 and 16 bit image encodes. [something a LOT of games/engines should be using, but don't]
but yep, spec is finalized, so moot point to bring it up here.
Two small suggestions about optimization:
Since QOI only encodes 24-bit and 32-bit images, the encoder could be split into two functions that handle 24-bit and 32-bit images separately. That way the loop no longer has to keep checking whether the image is 24-bit or 32-bit, which should increase speed;
In the encode function, the longest chunk written is 5 bytes (lines 469 to 473 of qoi.h), so we could define a 6-byte array to hold the intermediate bytes, write them out in one go, and advance the output address once at the end of the loop body, which should also increase speed;
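To make the two suggestions concrete, here is a minimal sketch of both. It is not the code in qoi.h: it is a toy that emits only uncompressed chunks, and every function name in it (`emit_chunk`, `encode_raw_rgb`, `encode_raw_rgba`, `encode_raw`) is hypothetical. The two tag values are the ones the QOI specification gives for `QOI_OP_RGB` and `QOI_OP_RGBA`. Whether either change is actually faster would have to be measured.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define QOI_OP_RGB  0xfe /* tag values as given in the QOI specification */
#define QOI_OP_RGBA 0xff

/* Suggestion 2: a chunk is staged in a small local array, then copied out
   and the output position is advanced once. Returns the new position. */
static size_t emit_chunk(unsigned char *out, size_t p,
                         const unsigned char *chunk, size_t len) {
    assert(len <= 6); /* the staging arrays below hold 6 bytes */
    memcpy(out + p, chunk, len);
    return p + len;
}

/* Suggestion 1: one loop per pixel format, so the channel count is not
   tested again for every pixel. */
static size_t encode_raw_rgb(const unsigned char *px, size_t n, unsigned char *out) {
    size_t i, p = 0;
    for (i = 0; i < n; i++) {
        unsigned char chunk[6];
        chunk[0] = QOI_OP_RGB;
        chunk[1] = px[3 * i + 0];
        chunk[2] = px[3 * i + 1];
        chunk[3] = px[3 * i + 2];
        p = emit_chunk(out, p, chunk, 4); /* tag + 3 channel bytes */
    }
    return p;
}

static size_t encode_raw_rgba(const unsigned char *px, size_t n, unsigned char *out) {
    size_t i, p = 0;
    for (i = 0; i < n; i++) {
        unsigned char chunk[6];
        chunk[0] = QOI_OP_RGBA;
        chunk[1] = px[4 * i + 0];
        chunk[2] = px[4 * i + 1];
        chunk[3] = px[4 * i + 2];
        chunk[4] = px[4 * i + 3];
        p = emit_chunk(out, p, chunk, 5); /* tag + 4 channel bytes */
    }
    return p;
}

/* The format is checked once here, outside both loops.
   `out` must have room for n * (channels + 1) bytes. */
static size_t encode_raw(const unsigned char *px, size_t n, int channels,
                         unsigned char *out) {
    return channels == 4 ? encode_raw_rgba(px, n, out)
                         : encode_raw_rgb(px, n, out);
}
```

Calling `encode_raw(px, n, 3, out)` on two RGB pixels writes 8 bytes; on one RGBA pixel with `channels` set to 4 it writes 5. The cost is two near-duplicate loops, which is the readability trade-off raised in the replies above.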