Closed woot000 closed 2 years ago
Thank you for the pull request. ECT's current lodepng is very difficult to maintain due to the hack and slash changes I made years ago before adopting proper code style, so bringing it closer to main lodepng can certainly be helpful. This is a whole lot of code to look at though, so I'll need a while to go through it.
I left some of the LodePNG functions unchanged (mostly some of the pixel decoding functions, an example being getPixelColorRGBA8) because I was having some issues with corrupted output, and I couldn't find the exact function that was responsible for those issues (it might be unfilterScanlines but I'm not sure)
You could probably also update those too for a small boost in decoding speed
Overall this is a whole lot of changes for one commit, making it difficult to review. It would have been useful to split it up into several commits (e.g. refactor, update to current lodepng, several commits with changes...).
I will take a look and first update the code base to current lodepng (whole lot of changes already) and then try to identify useful changes in here going beyond that.
I looked at the changes to mainline lodepng in the past few years and adopted many of them on the new lodepng-work branch. I plan to merge it after testing the changes, overall it should make little difference in terms of performance.
As far as I can see, your pull request includes useful changes for optimize_palette, filterScanline and filter (let me know if I'm missing something) on top of that, but also includes lots of code style changes that deviate from mainline lodepng. I'm trying to align this code more closely with it so that it is easier to maintain in the future, so some of the code style changes are not helpful.
Perhaps you can file smaller pull requests based on the new branch that reduce code size/improve performance, it should just be commits that are smaller so they are easier to review.
The optimizations are primarily in filter, filterScanline, and optimize_palette (I may be forgetting a few). I'll resubmit this pr with those changes in smaller bunches of commits soon
Backport and adapt newer functions with previous optimizations, further optimize some existing functions, update formatting
These changes decrease the size of the LodePNG compiled code and speed up optimizing PNGs using the --allfilters switch
On MSYS2 MINGW64 gcc 12.2.0, LodePNG code is
On MSYS2 CLANG64 clang 14.0.6, LodePNG code is
There's a ~0.5% speed increase for the --allfilters-b switch, otherwise there is no measurable speed difference when optimizing PNGs without using --allfilters or --allfilters-b