wvanbergen / oily_png

Native mixin to speed up ChunkyPNG.
http://www.chunkypng.com
MIT License

SIMD? #10

Closed (lencioni closed this issue 10 years ago)

lencioni commented 10 years ago

I don't have any experience with C programming, but I am wondering if you might be able to see some performance improvements by using SIMD programming in a few places.

For instance, do you think oily_png_compose_color would benefit from SIMD? There may be a few other places where SIMD could be applied without too much effort, but since I have no C experience, I am likely underestimating the difficulties involved.
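To make the question concrete, here is a rough Ruby sketch of alpha compositing on packed RGBA integers. It is an illustration only, not OilyPNG's actual C code: `ComposeSketch`, `compose`, `channel` and `mul8` are hypothetical names, and the 0xRRGGBBAA packing is an assumption. The point is that the red, green and blue channels go through identical, independent arithmetic, which is the kind of pattern SIMD instructions target.

```ruby
# Illustrative only; names and the 0xRRGGBBAA layout are assumptions.
module ComposeSketch
  module_function

  # Extract one 8-bit channel; shift is 24 (R), 16 (G), 8 (B) or 0 (A).
  def channel(color, shift)
    (color >> shift) & 0xff
  end

  # Rounded a * b / 255 for 8-bit values, using integer math only.
  def mul8(a, b)
    t = a * b + 0x80
    ((t >> 8) + t) >> 8
  end

  # Composite foreground over background (a quick, simplified blend).
  def compose(fg, bg)
    fg_a = channel(fg, 0)
    bg_a = mul8(0xff - fg_a, channel(bg, 0)) # weight left for the background
    # The same formula is applied to each color channel independently.
    r, g, b = [24, 16, 8].map do |shift|
      mul8(fg_a, channel(fg, shift)) + mul8(bg_a, channel(bg, shift))
    end
    (r << 24) | (g << 16) | (b << 8) | (fg_a + bg_a)
  end
end
```

Whether vectorizing this would pay off in practice depends on how many pixels are processed per call, which would need measuring.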

More information on SIMD:

wvanbergen commented 10 years ago

It probably could, but it's unlikely that I will be working on this myself any time soon, due to the lack of time and experience with SIMD.

Another complication is that OilyPNG can only mirror the methods in ChunkyPNG. ChunkyPNG may need to be organized differently as well in order to apply SIMD techniques effectively. That's fine with me, but it will make the implementation more complicated.

Closing this for now, unless we can find somebody to work on this.

lencioni commented 10 years ago

Thanks for the response!

I dug into our particular performance issues, and it actually looks like our current bottleneck is #set_pixel, which appears to be doing a simple integer assignment into an array in Ruby. In this flamegraph, the spiky blue stuff on the right is mostly #set_pixel.

Perhaps there is a faster data structure that we could use here?

wvanbergen commented 10 years ago

In general, I find that minimizing the number of calculations and method calls that happen in Ruby-land gives you the most bang for the buck. Moving them to C/OilyPNG will give a big performance improvement.

So instead of implementing set_pixel in C, I would implement in C the method that calls set_pixel many times. This will seriously reduce the method call overhead.
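A rough Ruby sketch of the batching idea, for illustration only. `TinyCanvas` and both `fill_rect_*` methods are hypothetical stand-ins, not ChunkyPNG's or OilyPNG's API; the only thing carried over from the thread is that pixels are integers stored in an array (the flat, row-major layout is an assumption). The per-pixel version makes one Ruby method call per pixel, while the batched version makes one call per row and leaves the inner loop to `Array#fill`, which MRI implements in C.

```ruby
# Hypothetical stand-in for a canvas; not ChunkyPNG's actual class.
class TinyCanvas
  attr_reader :width, :height, :pixels

  def initialize(width, height, color = 0)
    @width = width
    @height = height
    @pixels = Array.new(width * height, color) # flat, row-major storage
  end

  # One Ruby-level method call per pixel (no bounds checking in this sketch).
  def set_pixel(x, y, color)
    @pixels[y * @width + x] = color
  end

  # Slow path: w * h calls to set_pixel.
  def fill_rect_per_pixel(x0, y0, w, h, color)
    h.times do |dy|
      w.times do |dx|
        set_pixel(x0 + dx, y0 + dy, color)
      end
    end
  end

  # Batched path: one Array#fill call per row; the inner loop runs in C.
  def fill_rect_batched(x0, y0, w, h, color)
    h.times do |dy|
      @pixels.fill(color, (y0 + dy) * @width + x0, w)
    end
  end
end

slow = TinyCanvas.new(8, 8)
fast = TinyCanvas.new(8, 8)
slow.fill_rect_per_pixel(2, 1, 4, 3, 0xff0000ff)
fast.fill_rect_batched(2, 1, 4, 3, 0xff0000ff)
puts slow.pixels == fast.pixels
```

Moving the whole rectangle loop into a C extension would go one step further, since even the per-row calls disappear; the actual gain depends on the workload and is worth profiling before and after.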