[Perf] Windows/x64: 12 Regressions on 8/15/2024 12:39:36 AM

performanceautofiler[bot] commented 3 weeks ago

Run Information

Name	Value
Architecture	x64
OS	Windows 10.0.22631
Queue	ViperWindows
Baseline	0fbd81404d1f211572387498474063bc6f407f0f
Compare	bfffd58eeb204d368989038a19786bff86000b19
Diff	Diff
Configs	CompilationMode:tiered, RunKind:micro

Regressions in System.Linq.Tests.Perf_Enumerable

Benchmark	Baseline	Test	Test/Base	Test Quality	Edge Detector	Baseline IR	Compare IR	IR Ratio
[WhereFirst_LastElementMatches - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_Windows 10.0.22631/ViperWindows/System.Linq.Tests.Perf_Enumerable.WhereFirst_LastElementMatches(input%3a%20Array).html>) 📝 - Benchmark Source ADX - Test Multi Config Graph	54.14 ns	65.46 ns	1.21	0.35	False

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Linq.Tests.Perf_Enumerable*'

### System.Linq.Tests.Perf_Enumerable.WhereFirst_LastElementMatches(input: Array)
#### ETL Files
#### Histogram
#### JIT Disasms
### Docs
[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)
[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)

Regressions in System.Numerics.Tests.Perf_VectorConvert

Benchmark	Baseline	Test	Test/Base	Test Quality	Edge Detector	Baseline IR	Compare IR	IR Ratio
[Convert_double_long - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_Windows 10.0.22631/ViperWindows/System.Numerics.Tests.Perf_VectorConvert.Convert_double_long.html>) 📝 - Benchmark Source ADX - Test Multi Config Graph	473.78 ns	546.54 ns	1.15	0.12	False

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Numerics.Tests.Perf_VectorConvert*'

### System.Numerics.Tests.Perf_VectorConvert.Convert_double_long
#### ETL Files
#### Histogram
#### JIT Disasms
### Docs
[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)
[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)

Regressions in System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig

Benchmark	Baseline	Test	Test/Base	Test Quality	Edge Detector
[Count - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_Windows 10.0.22631/ViperWindows/System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern%3a%20%22Huck%5ba-zA-Z%5d%2b%7cSaw%5ba-zA-Z%5d%2b%22%2c%20Options%3a%20Compiled).html>) 📝 - Benchmark Source ADX - Test Multi Config Graph	1.57 ms	1.67 ms	1.06	0.00	True
[Count - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_Windows 10.0.22631/ViperWindows/System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern%3a%20%22(%5bA-Za-z%5dawyer%7c%5bA-Za-z%5dinn)%5c%5cs%22%2c%20Options%3a%20Compiled).html>) 📝 - Benchmark Source ADX - Test Multi Config Graph	12.95 ms	13.60 ms	1.05	0.00	True
[Count - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_Windows 10.0.22631/ViperWindows/System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern%3a%20%22Tom%7cSawyer%7cHuckleberry%7cFinn%22%2c%20Options%3a%20NonBacktracking).html>) 📝 - Benchmark Source ADX - Test Multi Config Graph	3.68 ms	4.01 ms	1.09	0.02	False

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig*'

### System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern: "Huck[a-zA-Z]+|Saw[a-zA-Z]+", Options: Compiled)
#### ETL Files
#### Histogram
#### JIT Disasms
### System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern: "([A-Za-z]awyer|[A-Za-z]inn)\\s", Options: Compiled)
#### ETL Files
#### Histogram
#### JIT Disasms
### System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern: "Tom|Sawyer|Huckleberry|Finn", Options: NonBacktracking)
#### ETL Files
#### Histogram
#### JIT Disasms
### Docs
[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)
[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)

Regressions in System.Tests.Perf_String

Benchmark	Baseline	Test	Test/Base	Test Quality	Edge Detector	Baseline IR	Compare IR	IR Ratio
[IndexerCheckBoundCheckHoist - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_Windows 10.0.22631/ViperWindows/System.Tests.Perf_String.IndexerCheckBoundCheckHoist.html>) 📝 - Benchmark Source ADX - Test Multi Config Graph	23.52 ns	38.56 ns	1.64	0.02	False
[IndexerCheckLengthHoisting - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_Windows 10.0.22631/ViperWindows/System.Tests.Perf_String.IndexerCheckLengthHoisting.html>) 📝 - Benchmark Source ADX - Test Multi Config Graph	23.51 ns	38.57 ns	1.64	0.02	False

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Tests.Perf_String*'

### System.Tests.Perf_String.IndexerCheckBoundCheckHoist
#### ETL Files
#### Histogram
#### JIT Disasms
### System.Tests.Perf_String.IndexerCheckLengthHoisting
#### ETL Files
#### Histogram
#### JIT Disasms
### Docs
[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)
[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)

Regressions in Loops.StrengthReduction

Benchmark	Baseline	Test	Test/Base	Test Quality	Edge Detector
[SumS29Span - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_Windows 10.0.22631/ViperWindows/Loops.StrengthReduction.SumS29Span.html>) 📝 - Benchmark Source ADX - Test Multi Config Graph	4.29 μs	5.43 μs	1.27	0.04	False
[SumS8Span - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_Windows 10.0.22631/ViperWindows/Loops.StrengthReduction.SumS8Span.html>) 📝 - Benchmark Source ADX - Test Multi Config Graph	4.33 μs	5.43 μs	1.25	0.04	False
[SumLongsSpan - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_Windows 10.0.22631/ViperWindows/Loops.StrengthReduction.SumLongsSpan.html>) 📝 - Benchmark Source ADX - Test Multi Config Graph	4.31 μs	5.42 μs	1.26	0.04	False

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'Loops.StrengthReduction*'

### Loops.StrengthReduction.SumS29Span
#### ETL Files
#### Histogram
#### JIT Disasms
### Loops.StrengthReduction.SumS8Span
#### ETL Files
#### Histogram
#### JIT Disasms
### Loops.StrengthReduction.SumLongsSpan
#### ETL Files
#### Histogram
#### JIT Disasms
### Docs
[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)
[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)

Regressions in System.Collections.IterateFor<Int32>

Benchmark	Baseline	Test	Test/Base	Test Quality	Edge Detector	Baseline IR	Compare IR	IR Ratio
[ImmutableList - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_Windows 10.0.22631/ViperWindows/System.Collections.IterateFor(Int32).ImmutableList(Size%3a%20512).html>) 📝 - Benchmark Source ADX - Test Multi Config Graph	3.41 μs	3.68 μs	1.08	0.26	False

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Collections.IterateFor<Int32>*'

### System.Collections.IterateFor<Int32>.ImmutableList(Size: 512)
#### ETL Files
#### Histogram
#### JIT Disasms
### Docs
[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)
[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)

Regressions in System.Globalization.Tests.StringSearch

Benchmark	Baseline	Test	Test/Base	Test Quality	Edge Detector	Baseline IR	Compare IR	IR Ratio
[LastIndexOf_Word_NotFound - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_Windows 10.0.22631/ViperWindows/System.Globalization.Tests.StringSearch.LastIndexOf_Word_NotFound(Options%3a%20(en-US%2c%20OrdinalIgnoreCase%2c%20False)).html>) 📝 - Benchmark Source ADX - Test Multi Config Graph	442.54 ns	616.75 ns	1.39	0.04	False

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Globalization.Tests.StringSearch*'

### System.Globalization.Tests.StringSearch.LastIndexOf_Word_NotFound(Options: (en-US, OrdinalIgnoreCase, False))
#### ETL Files
#### Histogram
#### JIT Disasms
### Docs
[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)
[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)

DrewScoggins commented 3 weeks ago

Could be related to https://github.com/dotnet/runtime/pull/106218

DrewScoggins commented 3 weeks ago

@Ruihan-Yin

dotnet-policy-service[bot] commented 3 weeks ago

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch See info in area-owners.md if you want to be subscribed.

DeepakRajendrakumaran commented 3 weeks ago

@Ruihan-Yin

Ruihan is OOO for the week, so I took a look at this, but unfortunately I am not able to reproduce it locally.

The steps I took are as follows:

Step 1: Checked out the base commit mentioned (up to https://github.com/dotnet/runtime/commit/0fbd81404d1f211572387498474063bc6f407f0f) and built the repo using build.cmd -c Release

Step 2: Checked out the diff commit mentioned (up to https://github.com/dotnet/runtime/commit/bfffd58eeb204d368989038a19786bff86000b19) and built the repo using build.cmd -c Release

Step 3: Updated the performance repo and ran the tests using the following command: performance\src\benchmarks\micro>dotnet run -c Release -f net9.0 --filter "System.Linq.Tests.Perf_Enumerable*" --coreRun "<runtime_repo>\artifacts_106706_base\bin\testhost\net9.0-windows-Release-x64\shared\Microsoft.NETCore.App\9.0.0\corerun.exe" "<runtime_repo>\artifacts\bin\testhost\net9.0-windows-Release-x64\shared\Microsoft.NETCore.App\9.0.0\corerun.exe"

I did the same for Loops.StrengthReduction* and System.Numerics.Tests.Perf_VectorConvert* as well, but the specified tests do not show any regression.
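The Step 3 command can be assembled once and reused for each filter. This is only a sketch: the helper name `build_benchmark_command` and the `C:\src\runtime` checkout location are hypothetical stand-ins for `<runtime_repo>`, and the script only prints the commands rather than running them. The flags are the ones from the command above.

```python
from pathlib import PureWindowsPath

# Relative location of corerun.exe inside an artifacts folder (from Step 3).
TESTHOST = PureWindowsPath(
    r"bin\testhost\net9.0-windows-Release-x64\shared\Microsoft.NETCore.App\9.0.0\corerun.exe"
)


def build_benchmark_command(filter_pattern, base_corerun, diff_corerun, framework="net9.0"):
    """Return the dotnet run argument list comparing two corerun builds (hypothetical helper)."""
    return [
        "dotnet", "run", "-c", "Release", "-f", framework,
        "--filter", filter_pattern,
        "--coreRun", str(base_corerun), str(diff_corerun),
    ]


# Hypothetical checkout location; substitute your own <runtime_repo>.
runtime_repo = PureWindowsPath(r"C:\src\runtime")
base_corerun = runtime_repo / "artifacts_106706_base" / TESTHOST
diff_corerun = runtime_repo / "artifacts" / TESTHOST

for pattern in (
    "System.Linq.Tests.Perf_Enumerable*",
    "Loops.StrengthReduction*",
    "System.Numerics.Tests.Perf_VectorConvert*",
):
    # Intended to be run from performance\src\benchmarks\micro.
    print(" ".join(build_benchmark_command(pattern, base_corerun, diff_corerun)))
```

Listing the base build first and the diff build second keeps the two corerun paths in the same order as in the original command.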

I do have a few questions, though:

What machine were these tests run on? I'm running on a Cascade Lake machine.
Do I need to set any environment variables before running the tests?
Have you done a binary search and verified that it's this commit?

tannergooding commented 3 weeks ago

> What machine were these tests run on? I'm running on a cascade lakes machine
> Do I need to set any env variables before running the tests?

@DrewScoggins, could you share which CPU this is using in particular?

> Have you done a binary search and verified it's this commit?

No confirmation was done to isolate the exact commit or to check the disassembly; triage typically just calls out the most likely commit from the range surrounding the regression.

In this particular case, I don't think there's much to do here outside of validating which commit introduced the regression. This was a correctness fix and we're late in the cycle, so at best we could confirm that it was #106218 and see if there's some optimization we could do in .NET 10 to restore the codegen to what it was. For example, perhaps some cases can still use embedded broadcast/mask, or there is an alternative sequence we could emit.
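Validating which commit introduced the regression amounts to a binary search over the commit range. A minimal sketch of that search, assuming the commits are ordered oldest to newest, the first is known good, and the last is known bad; `first_bad_commit` and the `is_regressed` callback (which would build a commit and run the benchmark) are hypothetical names, not part of any dotnet tooling:

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_regressed: Callable[[str], bool]) -> str:
    """Return the first commit for which is_regressed is True.

    Assumes commits[0] is known good, commits[-1] is known bad, and the
    regression persists once introduced (a single good-to-bad transition).
    """
    if len(commits) < 2:
        raise ValueError("need at least a good and a bad commit")
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_regressed(commits[mid]):
            bad = mid   # regression already present at mid
        else:
            good = mid  # mid is still fast, look later
    return commits[bad]


# Illustrative placeholder labels, not real commit hashes.
commits = [f"c{i}" for i in range(8)]
culprit = first_bad_commit(commits, lambda c: int(c[1:]) >= 5)
print(culprit)
```

With n commits in the range this needs roughly log2(n) build-and-measure cycles, which is what `git bisect` automates for a real repository.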

AndyAyersMS commented 3 weeks ago

Viper would be an AMD Zen4, right?

DrewScoggins commented 2 weeks ago

Here is a link to the page where we keep some basic hardware information on all of our queues. https://perfsupport.azurewebsites.net/hw-spec

kunalspathak commented 1 week ago

I am guessing this has something to do with dependency updates. When it regressed, we had https://github.com/dotnet/runtime/compare/b1968e7aa8d56a088e8be6817d5240fb345f901c...cef3898f4d74eeee3d261e4fc5b77c78ec8cc9bf, which contained https://github.com/dotnet/runtime/pull/106261. A week later, with #106421 in https://github.com/dotnet/runtime/compare/f402418aaed508c1d77e41b942e3978675183bfc...df2d2131e214a6cf4fa4774522c95370e3585040, the regression disappeared.