Open AsakusaRinne opened 2 years ago
Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics See info in area-owners.md if you want to be subscribed.
Author: | AsakusaRinne |
---|---|
Assignees: | - |
Labels: | `area-System.Runtime.Intrinsics`, `tenet-performance`, `untriaged` |
Milestone: | - |
Besides the codegen issue, and since you're on .NET 7: instead of `TWrapper wrapper = new();` you could use static abstract interface members (available with C# 11 and .NET 7) to avoid that line entirely.
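A minimal sketch of that suggestion, assuming C# 11 on .NET 7. The original snippet is not shown in this thread, so `IBinaryOp<T>`, `AddOp`, and `Kernel.Reduce` are hypothetical names chosen for illustration, not types from the library being discussed.

```csharp
using System;

// Hypothetical operation interface (illustrative only).
// The static abstract member lets generic code call it through the type parameter.
public interface IBinaryOp<T>
{
    static abstract T Invoke(T left, T right);
}

// A concrete operation: integer addition.
public readonly struct AddOp : IBinaryOp<int>
{
    public static int Invoke(int left, int right) => left + right;
}

public static class Kernel
{
    // The older pattern would create an instance first, e.g. `TOp op = new();`
    // and then call `op.Invoke(...)`. With static abstract members the call
    // goes through the type parameter itself, so no instance is needed.
    public static int Reduce<TOp>(ReadOnlySpan<int> values, int seed)
        where TOp : IBinaryOp<int>
    {
        int acc = seed;
        foreach (int v in values)
        {
            acc = TOp.Invoke(acc, v);
        }
        return acc;
    }
}

public static class Program
{
    public static void Main()
    {
        int[] data = { 1, 2, 3, 4 };
        Console.WriteLine(Kernel.Reduce<AddOp>(data, 0)); // prints 10
    }
}
```

Because `AddOp` is a struct, the JIT specializes `Reduce<AddOp>` for that type, which is the same property the instance-based wrapper pattern relies on.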
Yes, thank you. It's a nice feature of C# 11, and I've recently been updating my library to C# 11. The performance loss could be avoided that way, but I think the reason behind this behavior may be interesting :)
By the way, if I need to keep my library compatible with previous .NET Core versions, I cannot use new interfaces like `IAdditionOperators`, which is implemented by all the number types. Do you think there's still a good way to use static abstract interfaces to optimize the code? My library is full of code like the `TWrapper` line above, and that's annoying. 🤣
Description
I've recently been trying out `Vector<T>` and `Vector256<T>`. Most things work well, but a strange performance loss with `Vector<T>` appears on my Ubuntu server: `Vector<T>` becomes nearly 3 times slower when I simply swap the order of two lines of my code. Furthermore, on my Windows machine this swap did not cause any obvious performance loss, which is very confusing.

Configuration
On Linux, my configuration is listed below. `.NET 7.0 RC` and `.NET 6.0` were both tested.

On Windows, my configuration is listed below.
Regression?
Sorry, I'm not sure.
Data
In my benchmark, I use `Vector256` as the baseline and measure the performance of `Vector<T>` against it. I'll give the main body of my code and the results first, then append the full code at the end of this section.

The main part of the test is the code below.
The results on my Linux server are listed below (.NET 7). The .NET 6 results are almost the same.
However, if we just swap Line A and Line B above, `Vector<T>` becomes much slower:

What's more, this swap has little impact on my Windows machine, which is very confusing.
The complete code of my benchmark is listed below.
Besides, I also tried adding `[MethodImpl(MethodImplOptions.AggressiveOptimization)]` and `[MethodImpl(MethodImplOptions.AggressiveInlining)]`, but neither helped.

Analysis
Sadly, I'm too confused to figure out the problem. I regard `Vector<T>` as a wrapper around `Vector256` when `avx` is supported. Thus I would expect `Vector<T>` to be slightly slower than `Vector256` on small data and close to `Vector256` on large data. Is that right?

I would appreciate it if anyone could explain this or share some references.
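One way to check the "`Vector<T>` wraps `Vector256`" assumption on a given machine is to compare the lane counts the runtime reports. This is only a diagnostic sketch, not part of the original benchmark; `VectorInfo` and its methods are hypothetical names. Whether the two widths match depends on the hardware and runtime, so the output will differ between machines.

```csharp
using System;
using System.Numerics;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

public static class VectorInfo
{
    // Width of Vector<T> in bytes, as chosen by the runtime on this machine.
    public static int BytesPerVector() => Vector<byte>.Count;

    // True when Vector<float> has the same number of lanes as Vector256<float>.
    public static bool MatchesVector256() =>
        Vector<float>.Count == Vector256<float>.Count;

    public static void Main()
    {
        Console.WriteLine($"Vector.IsHardwareAccelerated: {Vector.IsHardwareAccelerated}");
        Console.WriteLine($"Avx2.IsSupported:             {Avx2.IsSupported}");
        Console.WriteLine($"Vector<float>.Count:          {Vector<float>.Count}");
        Console.WriteLine($"Vector256<float>.Count:       {Vector256<float>.Count}");
        Console.WriteLine($"Vector<T> width in bytes:     {BytesPerVector()}");
        Console.WriteLine($"Same lane count as Vector256: {MatchesVector256()}");
    }
}
```

If the lane counts match on both the Linux and the Windows machine, a difference in vector width is unlikely to explain the gap, and the generated code for the two line orderings would be the next thing to compare.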