It should be possible to make the code faster by using iterators instead of indexing, which lets the compiler eliminate bounds checks on every lookup. Using iterators also makes the code more readable.
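To illustrate the general idea, here is a minimal sketch that is not taken from the fastblur code. The function names `box3_indexed` and `box3_iter` are hypothetical, and the 3-pixel average is a stand-in for the real blur kernel. Both functions assume a row-major buffer with `width >= 1` and a `dst` at least as long as `src`, and both leave the first and last pixel of each row untouched.

```rust
/// Index-based version: each `src[..]` and `dst[..]` access carries a bounds
/// check unless the optimizer can prove the index is in range.
fn box3_indexed(src: &[u32], width: usize, dst: &mut [u32]) {
    let height = src.len() / width;
    for y in 0..height {
        for x in 1..width.saturating_sub(1) {
            let i = y * width + x;
            dst[i] = (src[i - 1] + src[i] + src[i + 1]) / 3;
        }
    }
}

/// Iterator-based version: rows come from `chunks_exact`, and the 3-pixel
/// neighbourhood comes from `windows`, so no manual index arithmetic is needed.
fn box3_iter(src: &[u32], width: usize, dst: &mut [u32]) {
    let rows = src.chunks_exact(width).zip(dst.chunks_exact_mut(width));
    for (src_row, dst_row) in rows {
        // `windows(3)` yields [left, center, right]; skipping one output pixel
        // aligns each window with its center position.
        for (window, out) in src_row.windows(3).zip(dst_row.iter_mut().skip(1)) {
            *out = window.iter().sum::<u32>() / 3;
        }
    }
}
```

Whether the iterator version is actually faster depends on what the optimizer already proves about the indices, so it is worth benchmarking both.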
I have a WIP conversion of the horizontal pass to iterators. It works and doesn't exhibit any artifacts, but it has a higher effective blur radius. It can be found here: https://github.com/Shnatsel/fastblur/tree/iterators
I probably won't have time to complete it, but somebody familiar with the algorithm should be able to fix the blur radius fairly easily.