SadToBeNep commented 5 months ago

What?

I did some benchmarks on the already implemented Sum() function and tried to improve it's speed a bit (results bellow)

Why?

I did the benchmarks as part of my learning about C# code optimizations, and from that I know that the for loops can be sometimes slow

How we could improve it?

Well, there are obviously a lot of ways to do that, I think as an early development suggestion, it would be good if we used foreach instead of basic for loops even tho this isn't the biggest performance improvement, unrolling the loop is much faster, but again, for the sake of maintainability simply changing for to foreach is enough

Results of tests


BenchmarkDotNet v0.13.12, Arch Linux
Intel Core i5-8400 CPU 2.80GHz (Coffee Lake), 1 CPU, 6 logical and 6 physical cores
.NET SDK 8.0.101
  [Host]     : .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX2
  DefaultJob : .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX2

Method	lengthOfList	Mean	Error	StdDev	Ratio	RatioSD
TestUnroll	100	23.32 ns	0.115 ns	0.108 ns	0.51	0.00
TestIdomatic	100	42.61 ns	0.386 ns	0.342 ns	0.93	0.01
TestSum	100	45.62 ns	0.274 ns	0.243 ns	1.00	0.00

TestUnroll	1000	265.46 ns	1.074 ns	0.897 ns	0.69	0.01
TestIdomatic	1000	353.41 ns	2.807 ns	2.488 ns	0.92	0.01
TestSum	1000	383.83 ns	3.140 ns	2.783 ns	1.00	0.00

TestUnroll	100000	27,049.76 ns	213.029 ns	188.844 ns	0.71	0.01
TestIdomatic	100000	34,221.19 ns	370.153 ns	328.131 ns	0.90	0.01
TestSum	100000	38,184.94 ns	455.654 ns	426.219 ns	1.00	0.00

TestUnroll	1000000	278,427.73 ns	5,564.379 ns	5,204.924 ns	0.71	0.02
TestIdomatic	1000000	350,900.60 ns	6,282.049 ns	5,876.233 ns	0.89	0.02
TestSum	1000000	393,756.77 ns	7,274.756 ns	6,804.811 ns	1.00	0.00

Some explanation

Even tho the compiler does some optimization on the loop, we can achieve much better performance by implementing the loop in a way, where it's adding together source[x]+source[x+1]+source[x+3]... Using this method as we can see the improvements with really small lists of numbers(100) the loop is almost twice as fast, and after that it keeps a 20-30% faster numbers As for my suggestion of changing for to foreach the improvement are around 10% stable regardless of the size, and again maintaining the simpler solution is easier

Can try to implement it by myself if needed

dragon7307 commented 5 months ago

I'll do some benchmarks on my own, this is a feature I am working on currently, and get back to you...

dragon7307 commented 5 months ago

Can you share the code you conducted the benchmarks on? Have you implemented bound-checks?? Since I am only getting comparable performance without bound-checking, what of course is dangerous.

dragon7307 commented 4 months ago

I have been unable to achieve said performance improvements.
However ,I will improve the performance by utilising vectorizationd (SIMD!).

draconware-dev / SpanExtensions.Net

Speed improvement #15

What?

Why?

How we could improve it?

Results of tests

Some explanation