lvandeve / lodepng

PNG encoder and decoder in C and C++.
zlib License
2.07k stars 422 forks source link

Bug fix for previous erroneous pull request, plus further optimisation #2

Closed slowriot closed 9 years ago

slowriot commented 9 years ago

I found and resolved the issue that caused my previous untested pull request to fail (10bb7ca). I realise you said that your choice of post-increment and less-than comparison is a matter of style, but I've now benchmarked this fork against the original, and I consistently get runtimes taking 97% the time of the original to carry out the unit test (timed in batches of 100), both at optimisation levels -O0 and -O3, meaning these changes represent an approximately 3% speedup even with optimisation.

Not all of the changes will affect performance (mostly the loop comparisons will), so feel free to merge only part of this if you really feel strongly about keeping post-increments, although I personally think it's a matter of good style to prefer post-increment in all cases when working with C++ on principle.

lvandeve commented 9 years ago

Thanks a lot for your work! I'd love to integrate it, it still doesn't compile with C90 though. It reallly has to. Thanks :)

lvandeve commented 9 years ago

I already integrated 3b22b26 and 4547bfa. If this pull request would contain only the changes about the < to != and postincrement to deincrement, but not the change that breaks it for C90, and not the things already done, I'd gladly accept it. Thanks a lot!

slowriot commented 9 years ago

Sure, no worries. I've got an idea, will push a C90 compatible update to this pull request soon :)

slowriot commented 9 years ago

The version in this pull request should now work with ANSI C, C99, C11 and C++. If there are any other tests it fails that I haven't tried, please let me know what command you're testing with and I'll resolve those as well.

lvandeve commented 9 years ago

Hi,

What is the reason for moving size_t into the for loop? Does it make it faster?

I don't like the ifdefs... because now the declarations of i are there twice! It doesn't look that clean. I would prefer that the change involving moving variable declarations to non-C89 compatible places would not be there at all, ifdef protected or not.

I would love to pull in your change about preincrement and != instead of <.

Is it possible to provide a pull request which has only the preincrement and != change, but nothing else? It would be nice for me to have a choice of what to pull in. This pull request comes with quite some history... I reallly appreciate your work and like parts of the change, I just wouldn't want to pull in things I don't like. Remember I also already got the first two parts :)

Thanks a lot!

slowriot commented 9 years ago

A variable declared in the loop will have its scope within that loop; generally keeping scopes as narrow as possible is preferred, as it allows the compiler to optimise its register uses as much as possible. The fewer registers available on a given system, and the more variables you're using, the more important this becomes - otherwise variables have to be swapped out of registers and into memory and back more frequently, which can be a big drag. Often the compiler will do a decent job of optimising this away for you, but it never hurts to write things in the right way to begin with.

If you don't like the define switches that's fair enough, although I do find it a little odd considering how many define switches this same file and its header already have... it also seems strange to me to refer to clearly marked #ifdefs as not looking clean in this context, when half of the header file already hinges on __cplusplus ifdefs, and especially when the procedure for compiling in C is for the user to manually rename the entire file. But finally it seems the most strange to me to prefer a potential slowdown to adding a couple of lines of code, in an implementation that is performance-critical.

But it's your code and you're welcome to pull in what you do and don't want, of course! To make it easier I'll create a branch that's purely C89 compatible without the switches. But I certainly won't be running that branch myself :)