Closed gfoidl closed 1 year ago
Oof! Look at those numbers!! Thanks so much for looking at this. I'm going to have a good dig through it to wrap my head around what you have done.
😃
I think the easiest way to understand and check it is to put the benchmark code (see the top comment) in a simple console app and step through it with the debugger (maybe change the size of `_rowSpan` to 20 or the like).
Calculating the correct index for `end` is the strangest part IMO.
PS: I'm back on Tuesday, so maybe slow to respond in the meantime.
This is fantastic stuff. After reading some of the source for `Span.IndexOf`, I figured that, in theory, masking with bit counting would be the vectorized solution; I just had no idea how I'd actually implement it.
Tip of the cap to you sir.
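For readers who have not seen the pattern, "masking with bit counting" can be sketched roughly as below. This is a minimal illustration and not the code from this PR: the helper name `IndexOfByte` is hypothetical, and it assumes .NET 7 or later for the cross-platform `Vector128` API. Each 16-byte chunk is compared against the target, the comparison result is collapsed into a bit mask, and the trailing zero count of that mask gives the position of the first match.

```c#
using System;
using System.Numerics;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;

// 20 elements, so one vectorized iteration runs and the scalar tail handles the rest.
byte[] data = new byte[20];
data[17] = 7;
Console.WriteLine(IndexOfByte(data, 7)); // prints 17

// Hypothetical helper (not part of ImageSharp): index of the first `value`, or -1.
static int IndexOfByte(ReadOnlySpan<byte> span, byte value)
{
    ref byte start = ref MemoryMarshal.GetReference(span);
    int i = 0;

    if (Vector128.IsHardwareAccelerated)
    {
        Vector128<byte> target = Vector128.Create(value);

        // Process full 16-byte chunks only; the remainder is left to the scalar loop.
        for (; i <= span.Length - Vector128<byte>.Count; i += Vector128<byte>.Count)
        {
            Vector128<byte> chunk = Vector128.LoadUnsafe(ref start, (nuint)i);

            // Matching lanes become 0xFF; taking each lane's top bit yields a 16-bit mask.
            uint mask = Vector128.Equals(chunk, target).ExtractMostSignificantBits();

            if (mask != 0)
            {
                // The lowest set bit corresponds to the first matching lane.
                return i + BitOperations.TrailingZeroCount(mask);
            }
        }
    }

    // Scalar fallback for the tail, or for everything when SIMD is unavailable.
    for (; i < span.Length; i++)
    {
        if (span[i] == value)
        {
            return i;
        }
    }

    return -1;
}
```

Leaving the tail to a scalar loop keeps the sketch short; production code often handles the remainder with an overlapping final vector load instead.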
Thanks for the kind words ❤️
Prerequisites
Description
See https://github.com/SixLabors/ImageSharp/pull/2455#issuecomment-1613901985
A simple benchmark -- just for the inner loop -- yields:
This is measured with .NET 7, but the codegen for .NET 6 is very similar.
benchmark code
```c#
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;
using BenchmarkDotNet.Attributes;

Bench bench = new();
bench.Setup();
Console.WriteLine(bench.Default());
Console.WriteLine(bench.Vectorized());

#if !DEBUG
BenchmarkDotNet.Running.BenchmarkRunner.Run