dotnet / performance

This repo contains benchmarks used for testing the performance of all .NET Runtimes
MIT License

Add strength reduction benchmarks #4317

Closed jakobbotsch closed 1 month ago

jakobbotsch commented 1 month ago

This adds strength reduction benchmarks for arrays of a few different element sizes. The sizes were picked because each one leads to different codegen for the element access. For x64, the current codegen per element size (in bytes) is:

* 2: load
* 3: lea + load
* 4: load
* 8: load
* 12: lea + load
* 16: shl + load
* 29: imul + load

Each size has 3 benchmark variants: an array version, a span version, and a manually strength-reduced version. The JIT is expected to be able to transform the array version into the strength-reduced version soon. The span version will also be transformed, but not quite all the way (the strength reduction will not be able to fold in the base byref of the span).
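For readers unfamiliar with the three shapes, here is a minimal sketch of what they could look like for a 12-byte element. The `S12` layout, the class name, and the method names are assumptions made for illustration and are not taken from the PR; the actual benchmark code may differ.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical 12-byte element type (three 4-byte fields); not from the PR.
public struct S12
{
    public int A, B, C;
}

public static class StrengthReductionSketch
{
    // Array version: every access is indexed, so the address is
    // recomputed from the index (base + i * 12) on each iteration.
    public static int SumArray(S12[] input)
    {
        int sum = 0;
        for (int i = 0; i < input.Length; i++)
            sum += input[i].A;
        return sum;
    }

    // Span version: same indexed loop, but over a span.
    public static int SumSpan(ReadOnlySpan<S12> input)
    {
        int sum = 0;
        for (int i = 0; i < input.Length; i++)
            sum += input[i].A;
        return sum;
    }

    // Manually strength-reduced version: the multiplication by the element
    // size is replaced by advancing a byref by one element per iteration.
    public static int SumStrengthReduced(S12[] input)
    {
        int sum = 0;
        if (input.Length == 0)
            return sum;

        ref S12 p = ref MemoryMarshal.GetArrayDataReference(input);
        int remaining = input.Length;
        do
        {
            sum += p.A;
            p = ref Unsafe.Add(ref p, 1); // step to the next element
            remaining--;
        } while (remaining != 0);
        return sum;
    }
}
```

All three methods return the same sum; only the way the element address is formed differs, which is what the benchmarks are meant to isolate.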

There is one current annoyance to work around in the JIT: we do not align the strength-reduced versions of the loops because they end up being "too small", meaning that they still fit within a single cache line. However, it turns out alignment is still beneficial in these cases, and this skews the results compared to the non-strength-reduced versions. I have opened https://github.com/dotnet/runtime/issues/104665 about this. To work around the problem in these benchmarks I have added a superfluous bitwise OR operation to the body of every loop.
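As a rough illustration of the workaround, the loop body can be padded with an extra bitwise OR so that the loop is slightly larger. The constant `1`, where the OR is placed, and the names below are assumptions for this sketch; the PR may apply the OR differently.

```csharp
public static class LoopPaddingSketch
{
    // Hypothetical padded loop: the "| 1" exists only to make the loop body
    // larger, not because the benchmark needs the OR-ed value.
    public static int SumIntsPadded(int[] input)
    {
        int sum = 0;
        for (int i = 0; i < input.Length; i++)
            sum += input[i] | 1;
        return sum;
    }
}
```

Because the same extra operation is added to every variant, it changes the absolute timings but keeps the comparison between variants fair.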

On my Intel CPU the current results are:

| Method | Mean | Error | StdDev | Median | Min | Max | Ratio | RatioSD | Code Size | Allocated | Alloc Ratio |
|---|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| SumS12Array | 4.813 us | 0.2538 us | 0.2923 us | 4.722 us | 4.445 us | 5.394 us | 1.00 | 0.08 | 73 B | - | NA |
| SumS12Span | 4.530 us | 0.1372 us | 0.1580 us | 4.467 us | 4.334 us | 4.844 us | 0.94 | 0.06 | 126 B | - | NA |
| SumS12ArrayStrengthReduced | 3.712 us | 0.0918 us | 0.1058 us | 3.653 us | 3.608 us | 3.913 us | 0.77 | 0.05 | 65 B | - | NA |
| SumS16Array | 4.557 us | 0.0569 us | 0.0532 us | 4.544 us | 4.482 us | 4.677 us | 1.00 | 0.02 | 73 B | - | NA |
| SumS16Span | 4.529 us | 0.0277 us | 0.0259 us | 4.532 us | 4.479 us | 4.572 us | 0.99 | 0.01 | 126 B | - | NA |
| SumS16ArrayStrengthReduced | 3.836 us | 0.0349 us | 0.0326 us | 3.828 us | 3.796 us | 3.920 us | 0.84 | 0.01 | 65 B | - | NA |
| SumS29Array | 5.260 us | 0.0623 us | 0.0583 us | 5.243 us | 5.181 us | 5.378 us | 1.00 | 0.02 | 84 B | - | NA |
| SumS29Span | 5.265 us | 0.0474 us | 0.0444 us | 5.258 us | 5.200 us | 5.349 us | 1.00 | 0.01 | 132 B | - | NA |
| SumS29ArrayStrengthReduced | 4.340 us | 0.0349 us | 0.0326 us | 4.339 us | 4.282 us | 4.384 us | 0.83 | 0.01 | 76 B | - | NA |
| SumS3Array | 4.324 us | 0.0318 us | 0.0298 us | 4.321 us | 4.287 us | 4.392 us | 1.00 | 0.01 | 74 B | - | NA |
| SumS3Span | 4.352 us | 0.0243 us | 0.0228 us | 4.360 us | 4.301 us | 4.382 us | 1.01 | 0.01 | 127 B | - | NA |
| SumS3ArrayStrengthReduced | 3.289 us | 0.0195 us | 0.0183 us | 3.287 us | 3.266 us | 3.339 us | 0.76 | 0.01 | 66 B | - | NA |
| SumS8Array | 3.794 us | 0.1145 us | 0.1319 us | 3.744 us | 3.691 us | 4.177 us | 1.00 | 0.05 | 69 B | - | NA |
| SumS8Span | 3.743 us | 0.0213 us | 0.0199 us | 3.738 us | 3.720 us | 3.805 us | 0.99 | 0.03 | 122 B | - | NA |
| SumS8ArrayStrengthReduced | 3.435 us | 0.0647 us | 0.0719 us | 3.425 us | 3.346 us | 3.713 us | 0.91 | 0.03 | 65 B | - | NA |
| SumIntsArray | 3.719 us | 0.1488 us | 0.1713 us | 3.631 us | 3.568 us | 4.032 us | 1.00 | 0.06 | 70 B | - | NA |
| SumIntsSpan | 3.621 us | 0.0188 us | 0.0176 us | 3.621 us | 3.589 us | 3.646 us | 0.98 | 0.04 | 121 B | - | NA |
| SumIntsArrayStrengthReduced | 3.447 us | 0.2373 us | 0.2733 us | 3.338 us | 3.286 us | 4.516 us | 0.93 | 0.08 | 65 B | - | NA |
| SumLongsArray | 3.785 us | 0.1116 us | 0.1285 us | 3.741 us | 3.669 us | 4.180 us | 1.00 | 0.05 | 69 B | - | NA |
| SumLongsSpan | 3.792 us | 0.1443 us | 0.1662 us | 3.695 us | 3.617 us | 4.123 us | 1.00 | 0.05 | 123 B | - | NA |
| SumLongsArrayStrengthReduced | 3.454 us | 0.0634 us | 0.0593 us | 3.443 us | 3.402 us | 3.656 us | 0.91 | 0.03 | 65 B | - | NA |
| SumShortsArray | 3.626 us | 0.0810 us | 0.0933 us | 3.619 us | 3.507 us | 3.869 us | 1.00 | 0.04 | 70 B | - | NA |
| SumShortsSpan | 3.586 us | 0.0416 us | 0.0389 us | 3.581 us | 3.526 us | 3.693 us | 0.99 | 0.03 | 122 B | - | NA |
| SumShortsArrayStrengthReduced | 3.317 us | 0.0541 us | 0.0506 us | 3.302 us | 3.263 us | 3.442 us | 0.92 | 0.03 | 66 B | - | NA |

jakobbotsch commented 1 month ago

cc @LoopedBard3 @DrewScoggins

jakobbotsch commented 1 month ago

I don't think the CI failures are related.

LoopedBard3 commented 1 month ago

Yes, the current CI failures do not seem to be related. Is this ready for review and merge?

jakobbotsch commented 1 month ago

> Yes, the current CI failures do not seem to be related. Is this ready for review and merge?

Yeah, this is ready.