linkdotnet / LinqSIMDExtensions

LINQ-like extensions for C# that use SIMD to parallelize operations
MIT License

Would it be possible to add SimdLinq to the benchmark? #55

Closed netcorefan1 closed 8 months ago

netcorefan1 commented 8 months ago

Hello, as far as I know, LinqSIMDExtensions and SimdLinq are the only two implementations I have seen on GitHub. SimdLinq (SL) is made by a very talented developer well known for his zero-allocation libraries. I'm not sure if SL is still maintained, but it would be interesting anyway if you could include that library in the benchmarks to see how both perform. Thanks

linkdotnet commented 8 months ago

Good point - I will update the benchmarks so you can run them locally if you wish. Anyway, here are some results from my MacBook M2 on net8.0:

Sum

| Method | Mean | Error | StdDev | Ratio |
|---|---|---|---|---|
| LinqSum | 314.31 ns | 0.711 ns | 0.665 ns | 1.00 |
| LinqSIMDSum | 63.87 ns | 0.387 ns | 0.343 ns | 0.20 |
| SIMDLinq | 118.48 ns | 0.256 ns | 0.240 ns | 0.38 |

SequenceEquals

| Method | Mean | Error | StdDev | Ratio | RatioSD |
|---|---|---|---|---|---|
| LinqSequenceEqual | 958.0 ns | 7.75 ns | 7.25 ns | 1.00 | 0.00 |
| LinqSIMDSequenceEqual | 1,200.2 ns | 0.97 ns | 0.76 ns | 1.25 | 0.01 |
| SIMDLinqSequenceEqual | 1,492.5 ns | 15.15 ns | 14.17 ns | 1.56 | 0.02 |

Min

| Method | Mean | Error | StdDev | Ratio |
|---|---|---|---|---|
| LinqMin | 167.03 ns | 0.982 ns | 0.870 ns | 1.00 |
| LinqSIMDMin | 61.05 ns | 0.210 ns | 0.175 ns | 0.37 |
| SIMDLinqMin | 118.78 ns | 0.322 ns | 0.269 ns | 0.71 |

Average

| Method | Mean | Error | StdDev | Ratio |
|---|---|---|---|---|
| LinqAverage | 1,002.11 ns | 10.840 ns | 10.140 ns | 1.00 |
| LinqSIMDAverage | 91.33 ns | 0.143 ns | 0.111 ns | 0.09 |
| SIMDLinqAverage | 172.94 ns | 3.367 ns | 3.877 ns | 0.17 |

Contains

| Method | Mean | Error | StdDev | Ratio | RatioSD |
|---|---|---|---|---|---|
| LinqContains | 6.769 us | 0.0227 us | 0.0201 us | 1.00 | 0.00 |
| LinqSIMDContains | 7.593 us | 0.1152 us | 0.1078 us | 1.12 | 0.02 |
| SIMDLinqContains | 6.856 us | 0.0400 us | 0.0354 us | 1.01 | 0.01 |

Take those with a grain of salt - my library was built natively against net8.0 and might not cover all the cases that SimdLinq does. Furthermore, .NET is increasingly using Vector for LINQ operations internally, so you might get by without a third-party library altogether. But as always, measure first before deciding.

Hope that helps you a bit.

EDIT: Feel free to close the ticket if that answers your question.

netcorefan1 commented 8 months ago

WOW! This is simply amazing! Your implementation is twice as fast on most of the benchmarks!

> Take those with a grain of salt - my library was built natively against net8.0...

Do you mean that such great results can only be expected on .NET 8 and higher? If so, that would be perfectly fine (.NET Standard and many others are just legacy frameworks used as a fallback when there are compatibility issues).

> ... and might not cover all the cases that SimdLinq does.

From what I see, SimdLinq also supports LongSum and MinMax methods, but I really can't find a real use for these two.

> Furthermore, .NET is increasingly using Vector for LINQ operations internally, so you might get by without a third-party library altogether. But as always, measure first before deciding.

I must admit this surprises me. When Microsoft published SIMD samples, I believed their intention was to leave the job to developers, especially when implementations are suitable for extension methods. This might explain why SimdLinq development seems to be abandoned. LinqSIMD is very recent (if I'm not wrong, the initial release was around 8 months ago). Is there any particular reason for putting such effort into development rather than waiting for a full transition on the .NET side?

linkdotnet commented 8 months ago

> Do you mean that such great results can only be expected on .NET 8 and higher?

Well, building against a more modern SDK always brings a bit more benefit in performance/allocations (compiler evolution, more mature SDK and so on) - but I don't have any numbers here. I don't think that this would explain the big gap.

> From what I see, SimdLinq also supports LongSum and MinMax methods, but I really can't find a real use for these two.

LongSum would work out of the box right now, as LinqSIMDExtensions uses the "generic math" concept introduced as a preview in .NET 6 and publicly available since .NET 7.
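
To illustrate what generic math buys here, the following is a minimal sketch and not LinqSIMDExtensions' actual code: a single `Sum<T>` constrained to `INumber<T>` serves `int`, `long`, `float` and so on without per-type overloads. The class and method names are hypothetical; it assumes .NET 7 or later.

```csharp
using System;
using System.Numerics;

// Hypothetical helper for illustration only; not the library's implementation.
public static class GenericSumSketch
{
    public static T Sum<T>(ReadOnlySpan<T> values)
        where T : struct, INumber<T>
    {
        T result = T.Zero;
        int i = 0;

        // Use the vector path only when T is a supported element type
        // (decimal, for example, is an INumber<T> but not vectorizable).
        if (Vector.IsHardwareAccelerated && Vector<T>.IsSupported)
        {
            int width = Vector<T>.Count;
            Vector<T> acc = Vector<T>.Zero;
            for (; i <= values.Length - width; i += width)
            {
                acc += new Vector<T>(values.Slice(i, width));
            }
            result = Vector.Sum(acc); // horizontal add of all lanes
        }

        // Scalar tail for the remaining elements (or everything, on the fallback path).
        for (; i < values.Length; i++)
        {
            result += values[i];
        }

        return result;
    }
}
```

Called as `GenericSumSketch.Sum<long>(someLongArray)`, the same method body handles 64-bit sums; the explicit type argument is needed because the array is converted to a span at the call site.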

> I must admit this surprises me. When Microsoft published SIMD samples, I believed their intention was to leave the job to developers, especially when implementations are suitable for extension methods.

If you have a look at the performance improvement blog posts from Stephen Toub:

And search for "SIMD" you can see hits like:

Those are just two examples from each blog post. They will not push SIMD onto every single LINQ operation, especially for small data sets, where it seems counterintuitive. It also adds some pitfalls: imagine a list like [int.MaxValue, 1]. Trying to sum this via LINQ results in an exception - that doesn't necessarily happen with SIMD.
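
The difference can be sketched as follows. This is an illustration with hypothetical names, not code from either library: `Enumerable.Sum` over `int` throws `OverflowException`, while plain `Vector<int>` addition wraps around silently.

```csharp
using System;
using System.Linq;
using System.Numerics;

// Hypothetical demo class; names are made up for illustration.
public static class OverflowDemo
{
    // Returns true when LINQ's overflow-checked Sum throws for the given input.
    public static bool LinqSumThrows(int[] values)
    {
        try
        {
            _ = values.Sum();
            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }

    // Naive SIMD sum: vector addition is unchecked, so lanes wrap on overflow.
    public static int VectorWrappingSum(ReadOnlySpan<int> values)
    {
        int width = Vector<int>.Count;
        Vector<int> acc = Vector<int>.Zero;
        int i = 0;

        for (; i <= values.Length - width; i += width)
        {
            acc += new Vector<int>(values.Slice(i, width));
        }

        int result = Vector.Sum(acc);

        // Scalar tail, also deliberately unchecked.
        for (; i < values.Length; i++)
        {
            result = unchecked(result + values[i]);
        }

        return result;
    }
}
```

To hit the vector path, the input has to be at least `Vector<int>.Count` elements long, so a two-element array like the one above only exercises the scalar tail.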

Currently, LinqSIMD does not handle that case. I thought about a switch to explicitly enable overflow and underflow detection (of course that would come with a runtime cost - therefore opt-in).
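
One possible shape for such an opt-in switch, purely as a sketch: the `checkOverflow` parameter and the class name are assumptions, not an existing LinqSIMDExtensions API. It detects signed overflow per lane with the usual sign-bit test, at the cost of extra work per vector.

```csharp
using System;
using System.Numerics;

// Hypothetical API shape; the checkOverflow flag is an assumption for illustration.
public static class CheckedSumSketch
{
    public static int Sum(ReadOnlySpan<int> values, bool checkOverflow = false)
    {
        int width = Vector<int>.Count;
        Vector<int> acc = Vector<int>.Zero;
        int i = 0;

        for (; i <= values.Length - width; i += width)
        {
            var next = new Vector<int>(values.Slice(i, width));
            Vector<int> sum = acc + next;

            // A lane overflowed if both operands share a sign and the result's
            // sign differs; then (acc ^ sum) & (next ^ sum) has its sign bit set.
            if (checkOverflow &&
                Vector.LessThanAny((acc ^ sum) & (next ^ sum), Vector<int>.Zero))
            {
                throw new OverflowException();
            }

            acc = sum;
        }

        // Combine the lanes and the scalar tail, checked only when requested.
        int result = 0;
        for (int lane = 0; lane < width; lane++)
        {
            result = checkOverflow ? checked(result + acc[lane]) : unchecked(result + acc[lane]);
        }
        for (; i < values.Length; i++)
        {
            result = checkOverflow ? checked(result + values[i]) : unchecked(result + values[i]);
        }

        return result;
    }
}
```

Note that the per-lane check is not equivalent to a sequential checked sum: elements are added in a different order, so it can throw for inputs where a left-to-right sum would stay in range (and vice versa).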

My motivation was to have a unified API that is easy to use and evolves "automatically" with the capabilities of generic math and those of Vector (and all the SIMD stuff in .NET with regard to the various architectures). Like many libraries, it started out of interest. By far, it is one of the smallest ones I maintain.

For me it was very interesting to see that people noticed that library :D - so thanks for the message @netcorefan1

netcorefan1 commented 8 months ago

Thanks for your detailed explanations, I really appreciate it. So, if I have understood correctly, .NET 8 and future versions will not make LinqSIMD obsolete or redundant, and I should see LS as a complementary library which provides useful helpers, extension methods and otherwise unavailable features.

Regarding overflows and detecting wrong usage, I'm not sure it is something you should take care of and focus on. As soon as a user gets into the LinqSIMDExtensions namespace, they should know they are dealing with SIMD and the unsafety of certain operations. Probably a "What not to do" list on the homepage would be better? Or maybe something like SimdOperation.DoThis() and SimdOperation.DoThisUnsafe()?

I want to encourage you to continue development of this wonderful library. Many people are still not aware of SIMD, but the numbers shown by the benchmarks are really impressive and make me think that it is just a matter of time before we see this technology everywhere (AI, graphics, etc.).

linkdotnet commented 8 months ago

First of all, thanks for your support - if you have any improvements or wishes, just let me know. I will release a version where you can do the following:

var numbers = new[] { 1, 2, 3, 4, 5, 6, 7, 8 };

var average = numbers.Average<int, double>();

average.ShouldBe(4.5);

That should overcome some of the limitations the library has at the moment. I am also thinking of doing the same for Sum, so one could do something like [1, 2, 3].Sum<int, long>() to get bigger values than the input type allows. That would also enable scenarios like Product, which doesn't even exist in LINQ itself.
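
A scalar sketch of how such two-type-parameter overloads could be expressed with generic math. The names and signatures here are hypothetical, and an actual library version would presumably vectorize the loop; the point is only the `T` to `TResult` conversion.

```csharp
using System;
using System.Numerics;

// Hypothetical scalar sketch; not the library's API or implementation.
public static class WideningSketch
{
    public static TResult Sum<T, TResult>(ReadOnlySpan<T> values)
        where T : INumber<T>
        where TResult : INumber<TResult>
    {
        TResult sum = TResult.Zero;
        foreach (T value in values)
        {
            // Convert each element to the wider result type before accumulating.
            sum += TResult.CreateChecked(value);
        }
        return sum;
    }

    public static TResult Average<T, TResult>(ReadOnlySpan<T> values)
        where T : INumber<T>
        where TResult : INumber<TResult>
    {
        if (values.IsEmpty)
        {
            throw new InvalidOperationException("Sequence contains no elements");
        }
        return Sum<T, TResult>(values) / TResult.CreateChecked(values.Length);
    }

    public static TResult Product<T, TResult>(ReadOnlySpan<T> values)
        where T : INumber<T>
        where TResult : INumber<TResult>
    {
        TResult product = TResult.One;
        foreach (T value in values)
        {
            product *= TResult.CreateChecked(value);
        }
        return product;
    }
}
```

With this shape, `WideningSketch.Sum<int, long>(new[] { int.MaxValue, 1 })` accumulates in 64 bits, so the [int.MaxValue, 1] case no longer overflows.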

> Regarding overflows and detecting wrong usage, I'm not sure it is something you should take care of and focus on. As soon as a user gets into the LinqSIMDExtensions namespace, they should know they are dealing with SIMD and the unsafety of certain operations. Probably a "What not to do" list on the homepage would be better? Or maybe something like SimdOperation.DoThis() and SimdOperation.DoThisUnsafe()?

Yeah - I have the same view. The thing is that when using SIMD you should be aware of what is going on - at least that is my hope. That said, I will keep it "unsafe" in v1 for sure.

> I want to encourage you to continue development of this wonderful library. Many people are still not aware of SIMD, but the numbers shown by the benchmarks are really impressive and make me think that it is just a matter of time before we see this technology everywhere (AI, graphics, etc.).

Absolutely! And even ARM processors have more and more SIMD capabilities! It would be a shame not to utilize those resources!

netcorefan1 commented 8 months ago

Your idea is very ingenious. However, I would focus more on converting the most used LINQ methods. LINQ is very slow and allocates a lot. Over the years I have seen several low-allocation LINQ replacements which tried (not very successfully, I have to say) to mitigate such problems. I'm wondering if such libraries still make sense, and whether you have been able to drop them entirely without giving anything up after the transition to the SIMD counterparts.

I think that most people (me included) will try to use SIMD LINQ after reading no more than a couple of lines from the help page (just the minimum to start). Then, after some trials, they will soon realize they need to know a bit more and will return to the help page. As long as the help page and/or the XML docs warn about what needs to be avoided, I think you should not even spend your time and effort on changes to the code.

I don't want to abuse your time. Unless there is something else you would like to share to get some feedback, I will close the issue as solved.

linkdotnet commented 8 months ago

> Your idea is very ingenious. However, I would focus more on converting the most used LINQ methods.

That makes sense - and if you are missing obvious candidates, let me know. The major restriction is that the SSE instruction set has to offer a surface (like Sum, Min, Max and so on) to be able to do that. Obviously some bit-shifting is also in the realm of possibility.
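
As an example of an operation that maps directly onto such a surface, here is a minimal Min sketch built on `Vector.Min`. The class name is hypothetical and this is not the library's implementation; `Vector<T>` is used so the runtime picks whatever instruction set the hardware offers.

```csharp
using System;
using System.Numerics;

// Hypothetical sketch; not the library's implementation.
public static class MinSketch
{
    public static int Min(ReadOnlySpan<int> values)
    {
        if (values.IsEmpty)
        {
            throw new InvalidOperationException("Sequence contains no elements");
        }

        int width = Vector<int>.Count;
        int min = values[0];
        int i = 0;

        if (Vector.IsHardwareAccelerated && values.Length >= width)
        {
            // Lane-wise minimum across all full vectors.
            var acc = new Vector<int>(values.Slice(0, width));
            for (i = width; i <= values.Length - width; i += width)
            {
                acc = Vector.Min(acc, new Vector<int>(values.Slice(i, width)));
            }

            // Reduce the lanes to a single value.
            for (int lane = 0; lane < width; lane++)
            {
                min = Math.Min(min, acc[lane]);
            }
        }

        // Scalar tail (or the whole input on the fallback path).
        for (; i < values.Length; i++)
        {
            min = Math.Min(min, values[i]);
        }

        return min;
    }
}
```

Operations without a lane-wise counterpart, such as those needing arbitrary per-element callbacks, do not fit this pattern, which matches the restriction described above.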

> I don't want to abuse your time. Unless there is something else you would like to share to get some feedback, I will close the issue as solved.

No, please. You are not abusing/wasting/using my time - I do enjoy some conversation around those topics. So if you have any input, just let me know.

netcorefan1 commented 8 months ago

Of course I will, although not very soon, because I have not yet had a chance to try the SIMD LINQ extensions (I'm doing a general restyling of my helper libraries and SIMD is on my list, although with a lower priority). I prefer to get into SIMD with the time it deserves rather than trying some quick experimental tests now. What I have done immediately is to reference your NuGet package with the following tag: // ToDo: this is an amazing library done from a very talented guy. The old linq code needs to be replaced with LinqSIMDExtensions.