dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.14k stars 4.71k forks source link

Regressions from simple forward substitution pass #64989

Closed performanceautofiler[bot] closed 2 years ago

performanceautofiler[bot] commented 2 years ago

Status

See scoreboard (below) for details on all the regressions reported here and in linked issues.

Run Information

Architecture x64
OS ubuntu 18.04
Baseline 50e6f078bcf4aa5bca1e7e110b9a6cb846390b82
Compare 607dccee09d06ef3f0d4c015da39fcc81117b4eb
Diff Diff

Regressions in System.Memory.Span<Byte>

Benchmark Baseline Test Test/Base Test Quality Edge Detector Baseline IR Compare IR IR Ratio Baseline ETL Compare ETL
[IndexOfValue - Duration of single invocation](<https://pvscmdupload.blob.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04/System.Memory.Span(Byte).IndexOfValue(Size%3a%20512).html>) 7.48 ns 11.09 ns 1.48 0.60 False

graph Test Report

Repro

git clone https://github.com/dotnet/performance.git
python3 .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'System.Memory.Span&lt;Byte&gt;*'
### Payloads [Baseline]() [Compare]() ### Histogram #### System.Memory.Span<Byte>.IndexOfValue(Size: 512) ```log ``` ### Description of detection logic ```IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsRegressionWindowed: Marked as regression because 11.085147614122393 > 7.862210160508238. IsChangePoint: Marked as a change because one of 2/3/2022 10:14:29 PM, 2/8/2022 12:37:33 AM falls between 1/30/2022 3:20:30 AM and 2/8/2022 12:37:33 AM. IsRegressionStdDev: Marked as regression because -17.00911942763199 (T) = (0 -11.293185216470434) / Math.Sqrt((0.4856141221968675 / (55)) + (1.0088142348260798 / (23))) is less than -1.9916726096441384 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (55) + (23) - 2, .025) and -0.5284132866530846 = (7.388829523459732 - 11.293185216470434) / 7.388829523459732 is less than -0.05. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so. ``` ### Docs [Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md) [Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)

Run Information

Architecture x64
OS ubuntu 18.04
Baseline 50e6f078bcf4aa5bca1e7e110b9a6cb846390b82
Compare 607dccee09d06ef3f0d4c015da39fcc81117b4eb
Diff Diff

Regressions in System.Memory.Span<Int32>

Benchmark Baseline Test Test/Base Test Quality Edge Detector Baseline IR Compare IR IR Ratio Baseline ETL Compare ETL
[StartsWith - Duration of single invocation](<https://pvscmdupload.blob.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04/System.Memory.Span(Int32).StartsWith(Size%3a%20512).html>) 16.17 ns 23.16 ns 1.43 0.07 False

graph Test Report

Repro

git clone https://github.com/dotnet/performance.git
python3 .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'System.Memory.Span&lt;Int32&gt;*'
### Payloads [Baseline]() [Compare]() ### Histogram #### System.Memory.Span<Int32>.StartsWith(Size: 512) ```log ``` ### Description of detection logic ```IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsRegressionWindowed: Marked as regression because 23.164866167123595 > 16.835331146004577. IsChangePoint: Marked as a change because one of 1/25/2022 4:50:09 PM, 2/3/2022 10:14:29 PM, 2/8/2022 12:37:33 AM falls between 1/30/2022 3:20:30 AM and 2/8/2022 12:37:33 AM. IsRegressionStdDev: Marked as regression because -12.72084176967564 (T) = (0 -22.00645069365951) / Math.Sqrt((0.8896820664330909 / (55)) + (3.9572562379688723 / (23))) is less than -1.9916726096441384 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (55) + (23) - 2, .025) and -0.3347400926804468 = (16.487442622230517 - 22.00645069365951) / 16.487442622230517 is less than -0.05. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so. ``` ### Docs [Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md) [Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)

Run Information

Architecture x64
OS ubuntu 18.04
Baseline 50e6f078bcf4aa5bca1e7e110b9a6cb846390b82
Compare 607dccee09d06ef3f0d4c015da39fcc81117b4eb
Diff Diff

Regressions in System.Buffers.Text.Tests.Utf8ParserTests

Benchmark Baseline Test Test/Base Test Quality Edge Detector Baseline IR Compare IR IR Ratio Baseline ETL Compare ETL
[TryParseDateTimeOffset - Duration of single invocation](<https://pvscmdupload.blob.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04/System.Buffers.Text.Tests.Utf8ParserTests.TryParseDateTimeOffset(value%3a%2012%2f30%2f2017%203%3a45%3a22%20AM%20-08%3a00).html>) 9.34 ns 10.77 ns 1.15 0.03 False

graph Test Report

Repro

git clone https://github.com/dotnet/performance.git
python3 .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'System.Buffers.Text.Tests.Utf8ParserTests*'
### Payloads [Baseline]() [Compare]() ### Histogram #### System.Buffers.Text.Tests.Utf8ParserTests.TryParseDateTimeOffset(value: 12/30/2017 3:45:22 AM -08:00) ```log ``` ### Description of detection logic ```IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsRegressionWindowed: Marked as regression because 10.771670504772652 > 9.810891348350939. IsChangePoint: Marked as a change because one of 12/7/2021 2:32:08 PM, 2/3/2022 10:14:29 PM, 2/8/2022 12:37:33 AM falls between 1/30/2022 3:20:30 AM and 2/8/2022 12:37:33 AM. IsRegressionStdDev: Marked as regression because -81.78263602291406 (T) = (0 -10.784544915500774) / Math.Sqrt((0.018074601767252133 / (55)) + (0.00024223721686166197 / (23))) is less than -1.9916726096441384 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (55) + (23) - 2, .025) and -0.16232696139283564 = (9.278408979326674 - 10.784544915500774) / 9.278408979326674 is less than -0.05. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so. ``` ### Docs [Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md) [Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)

Run Information

Architecture x64
OS ubuntu 18.04
Baseline 50e6f078bcf4aa5bca1e7e110b9a6cb846390b82
Compare 607dccee09d06ef3f0d4c015da39fcc81117b4eb
Diff Diff

Regressions in System.Collections.AddGivenSize<String>

Benchmark Baseline Test Test/Base Test Quality Edge Detector Baseline IR Compare IR IR Ratio Baseline ETL Compare ETL
[Dictionary - Duration of single invocation](<https://pvscmdupload.blob.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04/System.Collections.AddGivenSize(String).Dictionary(Size%3a%20512).html>) 10.59 μs 12.27 μs 1.16 0.01 True

graph Test Report

Repro

git clone https://github.com/dotnet/performance.git
python3 .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'System.Collections.AddGivenSize&lt;String&gt;*'
### Payloads [Baseline]() [Compare]() ### Histogram #### System.Collections.AddGivenSize<String>.Dictionary(Size: 512) ```log ``` ### Description of detection logic ```IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsRegressionWindowed: Marked as regression because 12.268180396685011 > 11.272430324504915. IsChangePoint: Marked as a change because one of 2/3/2022 10:14:29 PM, 2/8/2022 12:37:33 AM falls between 1/30/2022 3:20:30 AM and 2/8/2022 12:37:33 AM. IsRegressionStdDev: Marked as regression because -43.86669863337481 (T) = (0 -12037.246022647732) / Math.Sqrt((20169.59953723831 / (55)) + (10148.165516652267 / (23))) is less than -1.9916726096441384 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (55) + (23) - 2, .025) and -0.11555525082824997 = (10790.362927977449 - 12037.246022647732) / 10790.362927977449 is less than -0.05. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsChangeEdgeDetector: Marked as regression because Edge Detector said so. ``` ### Docs [Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md) [Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)

Run Information

Architecture x64
OS ubuntu 18.04
Baseline 50e6f078bcf4aa5bca1e7e110b9a6cb846390b82
Compare 607dccee09d06ef3f0d4c015da39fcc81117b4eb
Diff Diff

Regressions in System.Globalization.Tests.StringSearch

Benchmark Baseline Test Test/Base Test Quality Edge Detector Baseline IR Compare IR IR Ratio Baseline ETL Compare ETL
[IsSuffix_SecondHalf - Duration of single invocation](<https://pvscmdupload.blob.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04/System.Globalization.Tests.StringSearch.IsSuffix_SecondHalf(Options%3a%20(en-US%2c%20Ordinal%2c%20False)).html>) 8.35 ns 9.97 ns 1.19 0.04 False

graph Test Report

Repro

git clone https://github.com/dotnet/performance.git
python3 .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'System.Globalization.Tests.StringSearch*'
### Payloads [Baseline]() [Compare]() ### Histogram #### System.Globalization.Tests.StringSearch.IsSuffix_SecondHalf(Options: (en-US, Ordinal, False)) ```log ``` ### Description of detection logic ```IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsRegressionWindowed: Marked as regression because 9.967615474099448 > 8.884278925989152. IsChangePoint: Marked as a change because one of 1/25/2022 4:50:09 PM, 2/3/2022 10:14:29 PM, 2/8/2022 12:37:33 AM falls between 1/30/2022 3:20:30 AM and 2/8/2022 12:37:33 AM. IsRegressionStdDev: Marked as regression because -23.71682915669581 (T) = (0 -9.794788816757285) / Math.Sqrt((0.038367554435579515 / (55)) + (0.03969470060896875 / (23))) is less than -1.9916726096441384 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (55) + (23) - 2, .025) and -0.13533229012730502 = (8.627244113403128 - 9.794788816757285) / 8.627244113403128 is less than -0.05. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so. ``` ### Docs [Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md) [Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)

Run Information

Architecture x64
OS ubuntu 18.04
Baseline 50e6f078bcf4aa5bca1e7e110b9a6cb846390b82
Compare 607dccee09d06ef3f0d4c015da39fcc81117b4eb
Diff Diff

Regressions in System.Text.Json.Tests.Perf_Basic

Benchmark Baseline Test Test/Base Test Quality Edge Detector Baseline IR Compare IR IR Ratio Baseline ETL Compare ETL
[WriteBasicUtf16 - Duration of single invocation](<https://pvscmdupload.blob.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04/System.Text.Json.Tests.Perf_Basic.WriteBasicUtf16(Formatted%3a%20False%2c%20SkipValidation%3a%20False%2c%20DataSize%3a%20100000).html>) 1.52 ms 1.73 ms 1.14 0.07 False

graph Test Report

Repro

git clone https://github.com/dotnet/performance.git
python3 .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'System.Text.Json.Tests.Perf_Basic*'
### Payloads [Baseline]() [Compare]() ### Histogram #### System.Text.Json.Tests.Perf_Basic.WriteBasicUtf16(Formatted: False, SkipValidation: False, DataSize: 100000) ```log ``` ### Description of detection logic ```IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsRegressionWindowed: Marked as regression because 1.732647520138889 > 1.6872304220372598. IsChangePoint: Marked as a change because one of 1/21/2022 11:24:10 PM, 2/3/2022 10:14:29 PM, 2/8/2022 12:37:33 AM falls between 1/30/2022 3:20:30 AM and 2/8/2022 12:37:33 AM. IsRegressionStdDev: Marked as regression because -21.656021748845443 (T) = (0 -1741917.4317416963) / Math.Sqrt((1607416812.114774 / (55)) + (127412484.86469544 / (23))) is less than -1.9916726096441384 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (55) + (23) - 2, .025) and -0.07910199435011984 = (1614228.7206046279 - 1741917.4317416963) / 1614228.7206046279 is less than -0.05. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so. ``` ### Docs [Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md) [Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)

Run Information

Architecture x64
OS ubuntu 18.04
Baseline 50e6f078bcf4aa5bca1e7e110b9a6cb846390b82
Compare 607dccee09d06ef3f0d4c015da39fcc81117b4eb
Diff Diff

Regressions in System.Collections.CreateAddAndClear<String>

Benchmark Baseline Test Test/Base Test Quality Edge Detector Baseline IR Compare IR IR Ratio Baseline ETL Compare ETL
[IDictionary - Duration of single invocation](<https://pvscmdupload.blob.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04/System.Collections.CreateAddAndClear(String).IDictionary(Size%3a%20512).html>) 16.12 μs 17.72 μs 1.10 0.00 True

graph Test Report

Repro

git clone https://github.com/dotnet/performance.git
python3 .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'System.Collections.CreateAddAndClear&lt;String&gt;*'
### Payloads [Baseline]() [Compare]() ### Histogram #### System.Collections.CreateAddAndClear<String>.IDictionary(Size: 512) ```log ``` ### Description of detection logic ```IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsRegressionWindowed: Marked as regression because 17.715596582581735 > 16.6983248946314. IsChangePoint: Marked as a change because one of 2/3/2022 10:14:29 PM, 2/8/2022 12:37:33 AM falls between 1/30/2022 3:20:30 AM and 2/8/2022 12:37:33 AM. IsRegressionStdDev: Marked as regression because -38.871281755258785 (T) = (0 -17407.502138332035) / Math.Sqrt((26497.84614548045 / (55)) + (24881.06091812818 / (23))) is less than -1.9916726096441384 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (55) + (23) - 2, .025) and -0.09684948356417755 = (15870.456611573458 - 17407.502138332035) / 15870.456611573458 is less than -0.05. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsChangeEdgeDetector: Marked as regression because Edge Detector said so. ``` ### Docs [Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md) [Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)
dotnet-issue-labeler[bot] commented 2 years ago

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

kunalspathak commented 2 years ago

From https://github.com/dotnet/runtime/pull/63720

ghost commented 2 years ago

Tagging subscribers to this area: @JulieLeeMSFT See info in area-owners.md if you want to be subscribed.

Issue Details
### Run Information Architecture | x64 -- | -- OS | ubuntu 18.04 Baseline | [50e6f078bcf4aa5bca1e7e110b9a6cb846390b82](https://github.com/dotnet/runtime/commit/50e6f078bcf4aa5bca1e7e110b9a6cb846390b82) Compare | [607dccee09d06ef3f0d4c015da39fcc81117b4eb](https://github.com/dotnet/runtime/commit/607dccee09d06ef3f0d4c015da39fcc81117b4eb) Diff | [Diff](https://github.com/dotnet/runtime/compare/50e6f078bcf4aa5bca1e7e110b9a6cb846390b82...607dccee09d06ef3f0d4c015da39fcc81117b4eb) ### Regressions in System.Memory.Span<Byte> Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- [IndexOfValue - Duration of single invocation]() | 7.48 ns | 11.09 ns | 1.48 | 0.60 | False | | | ![graph]() [Test Report]() ### Repro ```cmd git clone https://github.com/dotnet/performance.git python3 .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'System.Memory.Span<Byte>*' ```
### Payloads [Baseline]() [Compare]() ### Histogram #### System.Memory.Span<Byte>.IndexOfValue(Size: 512) ```log ``` ### Description of detection logic ```IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsRegressionWindowed: Marked as regression because 11.085147614122393 > 7.862210160508238. IsChangePoint: Marked as a change because one of 2/3/2022 10:14:29 PM, 2/8/2022 12:37:33 AM falls between 1/30/2022 3:20:30 AM and 2/8/2022 12:37:33 AM. IsRegressionStdDev: Marked as regression because -17.00911942763199 (T) = (0 -11.293185216470434) / Math.Sqrt((0.4856141221968675 / (55)) + (1.0088142348260798 / (23))) is less than -1.9916726096441384 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (55) + (23) - 2, .025) and -0.5284132866530846 = (7.388829523459732 - 11.293185216470434) / 7.388829523459732 is less than -0.05. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so. ``` ### Docs [Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md) [Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)
### Run Information Architecture | x64 -- | -- OS | ubuntu 18.04 Baseline | [50e6f078bcf4aa5bca1e7e110b9a6cb846390b82](https://github.com/dotnet/runtime/commit/50e6f078bcf4aa5bca1e7e110b9a6cb846390b82) Compare | [607dccee09d06ef3f0d4c015da39fcc81117b4eb](https://github.com/dotnet/runtime/commit/607dccee09d06ef3f0d4c015da39fcc81117b4eb) Diff | [Diff](https://github.com/dotnet/runtime/compare/50e6f078bcf4aa5bca1e7e110b9a6cb846390b82...607dccee09d06ef3f0d4c015da39fcc81117b4eb) ### Regressions in System.Memory.Span<Int32> Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- [StartsWith - Duration of single invocation]() | 16.17 ns | 23.16 ns | 1.43 | 0.07 | False | | | ![graph]() [Test Report]() ### Repro ```cmd git clone https://github.com/dotnet/performance.git python3 .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'System.Memory.Span<Int32>*' ```
### Payloads [Baseline]() [Compare]() ### Histogram #### System.Memory.Span<Int32>.StartsWith(Size: 512) ```log ``` ### Description of detection logic ```IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsRegressionWindowed: Marked as regression because 23.164866167123595 > 16.835331146004577. IsChangePoint: Marked as a change because one of 1/25/2022 4:50:09 PM, 2/3/2022 10:14:29 PM, 2/8/2022 12:37:33 AM falls between 1/30/2022 3:20:30 AM and 2/8/2022 12:37:33 AM. IsRegressionStdDev: Marked as regression because -12.72084176967564 (T) = (0 -22.00645069365951) / Math.Sqrt((0.8896820664330909 / (55)) + (3.9572562379688723 / (23))) is less than -1.9916726096441384 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (55) + (23) - 2, .025) and -0.3347400926804468 = (16.487442622230517 - 22.00645069365951) / 16.487442622230517 is less than -0.05. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so. ``` ### Docs [Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md) [Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)
### Run Information Architecture | x64 -- | -- OS | ubuntu 18.04 Baseline | [50e6f078bcf4aa5bca1e7e110b9a6cb846390b82](https://github.com/dotnet/runtime/commit/50e6f078bcf4aa5bca1e7e110b9a6cb846390b82) Compare | [607dccee09d06ef3f0d4c015da39fcc81117b4eb](https://github.com/dotnet/runtime/commit/607dccee09d06ef3f0d4c015da39fcc81117b4eb) Diff | [Diff](https://github.com/dotnet/runtime/compare/50e6f078bcf4aa5bca1e7e110b9a6cb846390b82...607dccee09d06ef3f0d4c015da39fcc81117b4eb) ### Regressions in System.Buffers.Text.Tests.Utf8ParserTests Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- [TryParseDateTimeOffset - Duration of single invocation]() | 9.34 ns | 10.77 ns | 1.15 | 0.03 | False | | | ![graph]() [Test Report]() ### Repro ```cmd git clone https://github.com/dotnet/performance.git python3 .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'System.Buffers.Text.Tests.Utf8ParserTests*' ```
### Payloads [Baseline]() [Compare]() ### Histogram #### System.Buffers.Text.Tests.Utf8ParserTests.TryParseDateTimeOffset(value: 12/30/2017 3:45:22 AM -08:00) ```log ``` ### Description of detection logic ```IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsRegressionWindowed: Marked as regression because 10.771670504772652 > 9.810891348350939. IsChangePoint: Marked as a change because one of 12/7/2021 2:32:08 PM, 2/3/2022 10:14:29 PM, 2/8/2022 12:37:33 AM falls between 1/30/2022 3:20:30 AM and 2/8/2022 12:37:33 AM. IsRegressionStdDev: Marked as regression because -81.78263602291406 (T) = (0 -10.784544915500774) / Math.Sqrt((0.018074601767252133 / (55)) + (0.00024223721686166197 / (23))) is less than -1.9916726096441384 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (55) + (23) - 2, .025) and -0.16232696139283564 = (9.278408979326674 - 10.784544915500774) / 9.278408979326674 is less than -0.05. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so. ``` ### Docs [Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md) [Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)
### Run Information Architecture | x64 -- | -- OS | ubuntu 18.04 Baseline | [50e6f078bcf4aa5bca1e7e110b9a6cb846390b82](https://github.com/dotnet/runtime/commit/50e6f078bcf4aa5bca1e7e110b9a6cb846390b82) Compare | [607dccee09d06ef3f0d4c015da39fcc81117b4eb](https://github.com/dotnet/runtime/commit/607dccee09d06ef3f0d4c015da39fcc81117b4eb) Diff | [Diff](https://github.com/dotnet/runtime/compare/50e6f078bcf4aa5bca1e7e110b9a6cb846390b82...607dccee09d06ef3f0d4c015da39fcc81117b4eb) ### Regressions in System.Collections.AddGivenSize<String> Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- [Dictionary - Duration of single invocation]() | 10.59 μs | 12.27 μs | 1.16 | 0.01 | True | | | ![graph]() [Test Report]() ### Repro ```cmd git clone https://github.com/dotnet/performance.git python3 .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'System.Collections.AddGivenSize<String>*' ```
### Payloads [Baseline]() [Compare]() ### Histogram #### System.Collections.AddGivenSize<String>.Dictionary(Size: 512) ```log ``` ### Description of detection logic ```IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsRegressionWindowed: Marked as regression because 12.268180396685011 > 11.272430324504915. IsChangePoint: Marked as a change because one of 2/3/2022 10:14:29 PM, 2/8/2022 12:37:33 AM falls between 1/30/2022 3:20:30 AM and 2/8/2022 12:37:33 AM. IsRegressionStdDev: Marked as regression because -43.86669863337481 (T) = (0 -12037.246022647732) / Math.Sqrt((20169.59953723831 / (55)) + (10148.165516652267 / (23))) is less than -1.9916726096441384 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (55) + (23) - 2, .025) and -0.11555525082824997 = (10790.362927977449 - 12037.246022647732) / 10790.362927977449 is less than -0.05. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsChangeEdgeDetector: Marked as regression because Edge Detector said so. ``` ### Docs [Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md) [Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)
### Run Information Architecture | x64 -- | -- OS | ubuntu 18.04 Baseline | [50e6f078bcf4aa5bca1e7e110b9a6cb846390b82](https://github.com/dotnet/runtime/commit/50e6f078bcf4aa5bca1e7e110b9a6cb846390b82) Compare | [607dccee09d06ef3f0d4c015da39fcc81117b4eb](https://github.com/dotnet/runtime/commit/607dccee09d06ef3f0d4c015da39fcc81117b4eb) Diff | [Diff](https://github.com/dotnet/runtime/compare/50e6f078bcf4aa5bca1e7e110b9a6cb846390b82...607dccee09d06ef3f0d4c015da39fcc81117b4eb) ### Regressions in System.Globalization.Tests.StringSearch Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- [IsSuffix_SecondHalf - Duration of single invocation]() | 8.35 ns | 9.97 ns | 1.19 | 0.04 | False | | | ![graph]() [Test Report]() ### Repro ```cmd git clone https://github.com/dotnet/performance.git python3 .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'System.Globalization.Tests.StringSearch*' ```
### Payloads [Baseline]() [Compare]() ### Histogram #### System.Globalization.Tests.StringSearch.IsSuffix_SecondHalf(Options: (en-US, Ordinal, False)) ```log ``` ### Description of detection logic ```IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsRegressionWindowed: Marked as regression because 9.967615474099448 > 8.884278925989152. IsChangePoint: Marked as a change because one of 1/25/2022 4:50:09 PM, 2/3/2022 10:14:29 PM, 2/8/2022 12:37:33 AM falls between 1/30/2022 3:20:30 AM and 2/8/2022 12:37:33 AM. IsRegressionStdDev: Marked as regression because -23.71682915669581 (T) = (0 -9.794788816757285) / Math.Sqrt((0.038367554435579515 / (55)) + (0.03969470060896875 / (23))) is less than -1.9916726096441384 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (55) + (23) - 2, .025) and -0.13533229012730502 = (8.627244113403128 - 9.794788816757285) / 8.627244113403128 is less than -0.05. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so. ``` ### Docs [Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md) [Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)
### Run Information Architecture | x64 -- | -- OS | ubuntu 18.04 Baseline | [50e6f078bcf4aa5bca1e7e110b9a6cb846390b82](https://github.com/dotnet/runtime/commit/50e6f078bcf4aa5bca1e7e110b9a6cb846390b82) Compare | [607dccee09d06ef3f0d4c015da39fcc81117b4eb](https://github.com/dotnet/runtime/commit/607dccee09d06ef3f0d4c015da39fcc81117b4eb) Diff | [Diff](https://github.com/dotnet/runtime/compare/50e6f078bcf4aa5bca1e7e110b9a6cb846390b82...607dccee09d06ef3f0d4c015da39fcc81117b4eb) ### Regressions in System.Text.Json.Tests.Perf_Basic Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- [WriteBasicUtf16 - Duration of single invocation]() | 1.52 ms | 1.73 ms | 1.14 | 0.07 | False | | | ![graph]() [Test Report]() ### Repro ```cmd git clone https://github.com/dotnet/performance.git python3 .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'System.Text.Json.Tests.Perf_Basic*' ```
### Payloads [Baseline]() [Compare]() ### Histogram #### System.Text.Json.Tests.Perf_Basic.WriteBasicUtf16(Formatted: False, SkipValidation: False, DataSize: 100000) ```log ``` ### Description of detection logic ```IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsRegressionWindowed: Marked as regression because 1.732647520138889 > 1.6872304220372598. IsChangePoint: Marked as a change because one of 1/21/2022 11:24:10 PM, 2/3/2022 10:14:29 PM, 2/8/2022 12:37:33 AM falls between 1/30/2022 3:20:30 AM and 2/8/2022 12:37:33 AM. IsRegressionStdDev: Marked as regression because -21.656021748845443 (T) = (0 -1741917.4317416963) / Math.Sqrt((1607416812.114774 / (55)) + (127412484.86469544 / (23))) is less than -1.9916726096441384 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (55) + (23) - 2, .025) and -0.07910199435011984 = (1614228.7206046279 - 1741917.4317416963) / 1614228.7206046279 is less than -0.05. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so. ``` ### Docs [Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md) [Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)
### Run Information Architecture | x64 -- | -- OS | ubuntu 18.04 Baseline | [50e6f078bcf4aa5bca1e7e110b9a6cb846390b82](https://github.com/dotnet/runtime/commit/50e6f078bcf4aa5bca1e7e110b9a6cb846390b82) Compare | [607dccee09d06ef3f0d4c015da39fcc81117b4eb](https://github.com/dotnet/runtime/commit/607dccee09d06ef3f0d4c015da39fcc81117b4eb) Diff | [Diff](https://github.com/dotnet/runtime/compare/50e6f078bcf4aa5bca1e7e110b9a6cb846390b82...607dccee09d06ef3f0d4c015da39fcc81117b4eb) ### Regressions in System.Collections.CreateAddAndClear<String> Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- [IDictionary - Duration of single invocation]() | 16.12 μs | 17.72 μs | 1.10 | 0.00 | True | | | ![graph]() [Test Report]() ### Repro ```cmd git clone https://github.com/dotnet/performance.git python3 .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'System.Collections.CreateAddAndClear<String>*' ```
### Payloads [Baseline]() [Compare]() ### Histogram #### System.Collections.CreateAddAndClear<String>.IDictionary(Size: 512) ```log ``` ### Description of detection logic ```IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsRegressionWindowed: Marked as regression because 17.715596582581735 > 16.6983248946314. IsChangePoint: Marked as a change because one of 2/3/2022 10:14:29 PM, 2/8/2022 12:37:33 AM falls between 1/30/2022 3:20:30 AM and 2/8/2022 12:37:33 AM. IsRegressionStdDev: Marked as regression because -38.871281755258785 (T) = (0 -17407.502138332035) / Math.Sqrt((26497.84614548045 / (55)) + (24881.06091812818 / (23))) is less than -1.9916726096441384 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (55) + (23) - 2, .025) and -0.09684948356417755 = (15870.456611573458 - 17407.502138332035) / 15870.456611573458 is less than -0.05. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsChangeEdgeDetector: Marked as regression because Edge Detector said so. ``` ### Docs [Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md) [Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)
Author: performanceautofiler[bot]
Assignees: -
Labels: `tenet-performance`, `tenet-performance-benchmarks`, `area-CodeGen-coreclr`, `refs/heads/main`, `x64`, `ubuntu 18.04`, `Regression`, `RunKind=micro`, `CoreClr`
Milestone: -
AndyAyersMS commented 2 years ago

A few detail charts from this issue:

newplot (95) newplot (96) newplot (97) newplot (98)

And subset of similar charts from the other regressions tagged here.

https://github.com/dotnet/perf-autofiling-issues/issues/3354 (ubuntu x64) newplot (89)

https://github.com/dotnet/perf-autofiling-issues/issues/3364 (windows x64; 7 tests) newplot (90)

https://github.com/dotnet/perf-autofiling-issues/issues/3378 (windows x86; 11tests) newplot (91)

https://github.com/dotnet/perf-autofiling-issues/issues/3379 (windows x86; 3 tests) newplot (92)

https://github.com/dotnet/perf-autofiling-issues/issues/3392 (windows x64) newplot (93)

https://github.com/dotnet/perf-autofiling-issues/issues/3406 (windows x64; 2 tests; full pgo) newplot (94)

AndyAyersMS commented 2 years ago

Started looking at some of the windows x64 issues.

For System.IO.Tests.Perf_FileStream.SeekForward one oddity is that it runs much faster when run as admin then under my user account. Profiling the "fast" admin variant shows a lot of time spent in the OS, I assume this overhead is even is worse for user mode.

@DrewScoggins do we run the perf tests on Windows as admin?

image

Local runs (as admin) don't show the regression Here after is 3fc8b099443a8b84bd158b8e49e38be6dd941180 and before is 0c3a6ee42276d9ba68954eb00f55c46ddf27fb39 (just before forward sub).

Method Toolchain fileSize options Mean Error StdDev Median
SeekForward before\corerun.exe 1024 None 31.08 us 0.407 us 0.381 us 31.03 us
SeekForward after\corerun.exe 1024 None 31.22 us 0.384 us 0.359 us 31.21 us
SeekForward before\corerun.exe 1024 Asynchronous 35.27 us 0.469 us 0.438 us 35.26 us
SeekForward after\corerun.exe 1024 Asynchronous 34.98 us 0.348 us 0.308 us 35.06 us

Running under my user account times are more like 55 us.

A 15%ish regression like the perf lab sees would mean a huge regression in some of the jitted methods (given jitted code is running for perhaps 30% of the overall benchmark interval). This kind of regression seems unlikely.

A wider view of the perf history shows this benchmark seems to be bimodal, but with long stretches of stability. newplot (100)

Similar things seem to apply for the other IO benchmarks reported above.

AndyAyersMS commented 2 years ago

The x86 regression from https://github.com/dotnet/perf-autofiling-issues/issues/3378 in System.Tests.Perf_Random.NextBytes_unseeded repros locally:

Method Toolchain Mean Error StdDev Median
NextBytes_unseeded before\corerun.exe 1.023 us 0.0084 us 0.0070 us 1.025 us
NextBytes_unseeded after\corerun.exe 1.260 us 0.0024 us 0.0020 us 1.259 us

Looks like all the time here is in one method:

image

Diff is an extra spilled CSE. Kind of pointless to CSE a load and then spill the value just loaded and reload the spill, of the original locationt hasn't been modified.

Note the other CSEs are also spilled. Would guess we are somewhat too aggressive with CSEs on x86, where there are so few allocatable registers.


;; BEFORE
;  V26 cse0         [V26,T14] (  3,  1.50)     int  ->  [ebp-2CH]   spill-single-def "CSE - conservative"
;  V27 cse1         [V27,T06] (  4,  6.50)     int  ->  [ebp-30H]   spill-single-def "CSE - aggressive"

       837D0C04     cmp      dword ptr [ebp+0CH], 4
       0F82B8000000 jb       G_M41150_IG10
       8B4508       mov      eax, bword ptr [ebp+08H]
       8945CC       mov      bword ptr [ebp-34H], eax
       8B450C       mov      eax, dword ptr [ebp+0CH]

;; AFTER
;  V26 cse0         [V26,T13] (  3,  1.50)     int  ->  [ebp-2CH]   spill-single-def "CSE - moderate"
;  V27 cse1         [V27,T06] (  4,  6.50)     int  ->  [ebp-30H]   spill-single-def "CSE - aggressive"
;  V28 cse2         [V28,T14] (  3,  1.50)     int  ->  [ebp-34H]   spill-single-def "CSE - moderate"

       8B450C       mov      eax, dword ptr [ebp+0CH]
       8945CC       mov      dword ptr [ebp-34H], eax
       83F804       cmp      eax, 4
       0F82B3000000 jb       G_M41150_IG10
       8B4508       mov      eax, bword ptr [ebp+08H]
       8945C8       mov      bword ptr [ebp-38H], eax
       8B45CC       mov      eax, dword ptr [ebp-34H]

Also somewhat oddly the checked runtime runs this test about 25% faster than release does. Guessing some of the checked corelib goo actually blocks us from these CSEs. Will look when I have time.

The diff here is just one fwd sub; V14 is removed in the after picture:

;; BEFORE

***** BB02
STMT00046 ( ??? ... ??? )
               [000218] -A---+------              *  ASG       int   
               [000217] D----+-N----              +--*  LCL_VAR   int    V14 tmp3         
               [000038] -----+------              \--*  MUL       int   
               [000210] -----+------                 +--*  ROL       int   
               [000198] -----+------                 |  +--*  LCL_VAR   int    V13 tmp2         
               [000201] -----+------                 |  \--*  CNS_INT   int    7
               [000037] -----+------                 \--*  CNS_INT   int    9

***** BB02
STMT00045 ( INL03 @ 0x000[E-] ... ??? ) <- INLRT @ ???
               [000216] -A-XG+------              *  ASG       int   
               [000215] U--X-+-N----              +--*  IND       int   
               [000035] -----+------              |  \--*  LCL_VAR   byref  V21 tmp10        
               [000214] -----+------              \--*  LCL_VAR   int    V14 tmp3

;; AFTER

***** BB02
STMT00045 ( INL03 @ 0x000[E-] ... ??? ) <- INLRT @ ???
               [000216] -A-XG+------              *  ASG       int   
               [000215] U--X-+-N----              +--*  IND       int   
               [000035] -----+------              |  \--*  LCL_VAR   byref  V21 tmp10        
               [000038] -----+------              \--*  MUL       int   
               [000210] -----+------                 +--*  ROL       int   
               [000198] -----+------                 |  +--*  LCL_VAR   int    V13 tmp2         
               [000201] -----+------                 |  \--*  CNS_INT   int    7
               [000037] -----+------                 \--*  CNS_INT   int    9

This ends up making CSE more aggressive, because it reduces the CSE-computed enregCount and hence reduces the moderate promotion cutoff:

;; BEFORE
Aggressive CSE Promotion cutoff is 800.000000
Moderate CSE Promotion cutoff is 200.000000
enregCount is 20
Framesize estimate is 0x002C
We have a small frame

Considering CSE #01 {$21a, $3  } [def=50.000000, use=50.000000, cost=  3      ]
CSE Expression : 
N001 (  3,  4) CSE #01 (def)[000236] ------------              *  LCL_FLD   int    V01 arg1         u:3[+4] Fseq[_length] $21a

Conservative CSE Promotion (not enregisterable) (150.000000 < 200.000000)          ;; * should print "enregisterable"
cseRefCnt=150.000000, aggressiveRefCnt=800.000000, moderateRefCnt=200.000000
defCnt=50.000000, useCnt=50.000000, cost=3, size=4
def_cost=2, use_cost=2, extra_no_cost=4, extra_yes_cost=0
CSE cost savings check (154.000000 >= 200.000000) fails

;; AFTER
Aggressive CSE Promotion cutoff is 800.000000
Moderate CSE Promotion cutoff is 150.000000
enregCount is 19
Framesize estimate is 0x0028
We have a small frame

Moderate CSE Promotion (CSE never live at call) (150.000000 >= 150.000000)
cseRefCnt=150.000000, aggressiveRefCnt=800.000000, moderateRefCnt=150.000000
defCnt=50.000000, useCnt=50.000000, cost=3, size=4
def_cost=2, use_cost=1, extra_no_cost=6, extra_yes_cost=0
CSE cost savings check (156.000000 >= 150.000000) passes

Promoting CSE:

lvaGrabTemp returning 28 (V28 rat0) (a long lifetime temp) called for CSE - moderate.
CSE #01 is single-def, so associated CSE temp V28 will be in SSA
New refCnts for V28: refCnt =  2, refCntWtd = 1   
New refCnts for V28: refCnt =  3, refCntWtd = 1.50

In reality nothing has changed, and we should not have altered our CSE choices.

So arguably this is a blemish in the CSE accounting; one can't simply add up the total number of tracked vars and get anything that's meaningful as far as register pressure goes -- you need to at least make some accounting for the patterns of def and use and interference.

Not clear there's any easy fix here. Possible remediations:

AndyAyersMS commented 2 years ago

The other interesting x86 regression from https://github.com/dotnet/perf-autofiling-issues/issues/3378 is System.Tests.Perf_DateTimeOffset.op_Subtraction newplot - 2022-02-09T180158 670

This also repros locally

Method Toolchain Mean Error StdDev Median
op_Subtraction after\corerun.exe 4.207 ns 0.0813 ns 0.0679 ns 4.236 ns
op_Subtraction before\corerun.exe 3.348 ns 0.0487 ns 0.0432 ns 3.341 ns

This test spends most of its time in the test method

image

Looking at the codegen:

; Perf_DateTimeOffset:op_Subtraction():TimeSpan:this

;; BEFORE

       8D74241C     lea      esi, bword ptr [esp+1CH]
       8B3E         mov      edi, dword ptr [esi]
       8B7604       mov      esi, dword ptr [esi+4]
       897C2408     mov      dword ptr [esp+08H], edi
       8974240C     mov      dword ptr [esp+0CH], esi
       8B742408     mov      esi, dword ptr [esp+08H]
       8B7C240C     mov      edi, dword ptr [esp+0CH]
       83E6FF       and      esi, -1
       81E7FFFFFF3F and      edi, 0x3FFFFFFF
       83CE00       or       esi, 0
       81CF00000040 or       edi, 0x40000000
       83E0FF       and      eax, -1
       81E1FFFFFF3F and      ecx, 0x3FFFFFFF

;; AFTER

       89442418     mov      dword ptr [esp+18H], eax
       894C241C     mov      dword ptr [esp+1CH], ecx
       8D44242C     lea      eax, bword ptr [esp+2CH]
       8B08         mov      ecx, dword ptr [eax]
       8B4004       mov      eax, dword ptr [eax+4]
       894C2410     mov      dword ptr [esp+10H], ecx
       89442414     mov      dword ptr [esp+14H], eax
       8B442410     mov      eax, dword ptr [esp+10H]
       8B4C2414     mov      ecx, dword ptr [esp+14H]
       83E0FF       and      eax, -1
       81E1FFFFFF3F and      ecx, 0x3FFFFFFF
       83C800       or       eax, 0
       81C900000040 or       ecx, 0x40000000
       89442408     mov      dword ptr [esp+08H], eax
       894C240C     mov      dword ptr [esp+0CH], ecx
       8B442418     mov      eax, dword ptr [esp+18H]
       8B4C241C     mov      ecx, dword ptr [esp+1CH]
       83E0FF       and      eax, -1
       81E1FFFFFF3F and      ecx, 0x3FFFFFFF
       8B742408     mov      esi, dword ptr [esp+08H]
       8B7C240C     mov      edi, dword ptr [esp+0CH]

Suspect forward sub is leading to sub-optimal long decomposition.

AndyAyersMS commented 2 years ago

Finding it hard to keep track of what's been reported and what's been investigated, so here's a scoreboard of sorts.

We have not seen the arm64 autofiling results yet, so expect this table to grow.

Regression Issue Arch/OS Status
System.Memory.Span.IndexOfValue(Size: 512) https://github.com/dotnet/runtime/issues/64989 x64 Ubuntu was regression, since improved

can't repro on old HW
notes
unclear on new HW
notes
System.Memory.Span.StartsWith(Size: 512) https://github.com/dotnet/runtime/issues/64989 x64 Ubuntu regression
jcc erratum
analysis
System.Buffers.Text.Tests.Utf8ParserTests.TryParseDateTimeOffset(value: 12/30/2017 3:45:22 AM -08:000) https://github.com/dotnet/runtime/issues/64989 x64 Ubuntu was regression, since improved

can't repro on old HW
notes
can't repro on WSL2
notes
System.Collections.AddGivenSize.Dictionary(Size: 512) https://github.com/dotnet/runtime/issues/64989 x64 Ubuntu can't repro on old HW
notes
System.Globalization.Tests.StringSearch.IsSuffix_SecondHalf(Options: (en-US, Ordinal, False)).html) https://github.com/dotnet/runtime/issues/64989 x64 Ubuntu can't repro on old HW
notes
System.Text.Json.Tests.Perf_Basic.WriteBasicUtf16(Formatted: False, SkipValidation: False, DataSize: 100000) https://github.com/dotnet/runtime/issues/64989 x64 Ubuntu regression; jcc erratum analysis
System.Collections.CreateAddAndClear.IDictionary(Size: 512) https://github.com/dotnet/runtime/issues/64989 x64 Ubuntu can't repro on old HW
notes
System.Collections.Sort.List(Size: 512) https://github.com/dotnet/perf-autofiling-issues/issues/3354 x64 Ubuntu no regression; newly bimodal analysis
System.IO.Tests.Perf_FileStream.ReadByte(fileSize: 1024, options: None) https://github.com/dotnet/perf-autofiling-issues/issues/3364 x64 Windows not regression; bimodal
System.IO.Tests.Perf_FileStream.OpenClose(fileSize: 1024, options: Asynchronous) https://github.com/dotnet/perf-autofiling-issues/issues/3364 x64 Windows not regression; bimodal
System.IO.Tests.Perf_FileStream.SeekForward(fileSize: 1024, options: None) https://github.com/dotnet/perf-autofiling-issues/issues/3364 x64 Windows not regression; bimodal
System.IO.Tests.Perf_FileStream.OpenClose(fileSize: 1024, options: None) https://github.com/dotnet/perf-autofiling-issues/issues/3364 x64 Windows not regression; bimodal
System.IO.Tests.Perf_Directory.Exists https://github.com/dotnet/perf-autofiling-issues/issues/3364 x64 Windows not regression; bimodal
System.IO.Tests.Perf_Directory.RecursiveCreateDeleteDirectory(depth: 100) https://github.com/dotnet/perf-autofiling-issues/issues/3364 x64 Windows not regression; bimodal
System.IO.Tests.Perf_Directory.CreateDirectory https://github.com/dotnet/perf-autofiling-issues/issues/3364 x64 Windows not regression; bimodal
Exceptions.Handling.ThrowAndCatchDeepRecursive(kind: ReflectionSoftware) https://github.com/dotnet/perf-autofiling-issues/issues/3378 x86 Windows not regression; bimodal
Exceptions.Handling.ThrowAndCatchDeep(kind: ReflectionHardware) https://github.com/dotnet/perf-autofiling-issues/issues/3378 x86 Windows not regression; bimodal
Exceptions.Handling.ThrowAndCatch(kind: ReflectionHardware) https://github.com/dotnet/perf-autofiling-issues/issues/3378 x86 Windows not regression; bimodal
System.IO.Tests.Perf_FileStream.ReadByte(fileSize: 1024, options: None) https://github.com/dotnet/perf-autofiling-issues/issues/3378 x86 Windows not regression; bimodal
System.IO.Tests.Perf_FileStream.SeekForward(fileSize: 1024, options: None) https://github.com/dotnet/perf-autofiling-issues/issues/3378 x86 Windows not regression; bimodal
System.IO.Tests.Perf_FileStream.OpenClose(fileSize: 1024, options: Asynchronous) https://github.com/dotnet/perf-autofiling-issues/issues/3378 x86 Windows not regression; bimodal
System.IO.Tests.Perf_FileStream.OpenClose(fileSize: 1024, options: None) https://github.com/dotnet/perf-autofiling-issues/issues/3378 x86 Windows not regression; bimodal
System.IO.Tests.Perf_Directory.Exists https://github.com/dotnet/perf-autofiling-issues/issues/3378 x86 Windows not regression; bimodal
System.IO.Tests.Perf_Directory.CreateDirectory https://github.com/dotnet/perf-autofiling-issues/issues/3378 x86 Windows not regression; bimodal
System.Tests.Perf_Random.NextBytes_unseeded https://github.com/dotnet/perf-autofiling-issues/issues/3378 x86 Windows regression analysis
System.Tests.Perf_DateTimeOffset.op_Subtraction https://github.com/dotnet/perf-autofiling-issues/issues/3378 x86 Windows regression analysis
System.IO.Tests.Perf_File.ReadAllLinesAsync https://github.com/dotnet/perf-autofiling-issues/issues/3379 x86 Windows not regression; bimodal
System.IO.Tests.Perf_File.Delete https://github.com/dotnet/perf-autofiling-issues/issues/3379 x86 Windows not regression; bimodal
System.Collections.CreateAddAndRemove.Stack(Size: 512) https://github.com/dotnet/perf-autofiling-issues/issues/3379 x86 Windows not regression; bimodal
System.IO.Tests.Perf_FileStream.GetLength(options: Write) https://github.com/dotnet/perf-autofiling-issues/issues/3392 x64 Windows not regression; bimodal
System.IO.Tests.Perf_RandomAccess.Read(fileSize: 1048576, bufferSize: 4096, options: None) https://github.com/dotnet/perf-autofiling-issues/issues/3392 x64 Windows not regression; bimodal
Exceptions.Handling.ThrowAndCatchDeepRecursive(kind: ReflectionHardware) https://github.com/dotnet/perf-autofiling-issues/issues/3392 x64 Windows not regression; bimodal
DrewScoggins commented 2 years ago

From @MattGal

linux/mac it's a user that has password-less sudo permissions but is not inherently root On Windows, UAC is turned down to the point where no dialogs can pop and the user is in the admin group

AndyAyersMS commented 2 years ago

Looks like forward sub alters behavior of System.Collections.Sort<IntStruct>.List(Size: 512) -- it was stable and is now bimodal; the bimodal "best" is an improvement, the "worst" a regression.

newplot - 2022-02-11T111115 071

AndyAyersMS commented 2 years ago

The first group of tests on the scoreboard (ubuntu x64) all look like real regressions, so will prioritize looking at those over the x86 regressions.

I'm not able to repro these on my venerable i7-2720QM (non-AVX2) -- generally the forward sub versions run faster.

System.Memory.Span.IndexOfValue

Method Job Toolchain Size Mean Error StdDev Median Min Max Ratio Allocated Alloc Ratio
IndexOfValue Job-SZAEDM after/corerun 512 17.01 ns 0.044 ns 0.039 ns 17.00 ns 16.98 ns 17.11 ns 1.00 - NA
IndexOfValue Job-YTWKTE before/corerun 512 22.01 ns 0.049 ns 0.043 ns 22.00 ns 21.95 ns 22.09 ns 1.29 - NA

System.Memory.Span.StartsWith

Method Job Toolchain Size Mean Error StdDev Median Min Max Ratio Allocated Alloc Ratio
StartsWith Job-CFHQCN after/corerun 512 18.67 ns 0.042 ns 0.037 ns 18.66 ns 18.62 ns 18.74 ns 1.00 - NA
StartsWith Job-WCMVZL before/corerun 512 19.49 ns 0.042 ns 0.037 ns 19.48 ns 19.43 ns 19.57 ns 1.04 - NA

System.Buffers.Text.Tests.Utf8ParserTests.TryParseDateTimeOffset

Method Job Toolchain value Mean Error StdDev Median Min Max Ratio Allocated Alloc Ratio
TryParseDateTimeOffset Job-WQABYK after/corerun 12/30/2017 3:45:22 AM -08:00 15.94 ns 0.053 ns 0.050 ns 15.91 ns 15.87 ns 16.03 ns 1.00 - NA
TryParseDateTimeOffset Job-WGPIQB before/corerun 12/30/2017 3:45:22 AM -08:00 15.58 ns 0.083 ns 0.070 ns 15.55 ns 15.51 ns 15.71 ns 0.98 - NA

System.Collections.AddGivenSize.Dictionary

Method Job Toolchain Size Mean Error StdDev Median Min Max Ratio RatioSD Gen 0 Allocated Alloc Ratio
Dictionary Job-LALXSC after/corerun 512 19.80 us 0.312 us 0.292 us 19.67 us 19.53 us 20.30 us 1.00 0.00 4.6209 14.38 KB 1.00
Dictionary Job-IGUESQ before/corerun 512 20.23 us 0.082 us 0.077 us 20.22 us 20.13 us 20.35 us 1.02 0.02 4.6206 14.38 KB 1.00

System.Text.Json.Tests.Perf_Basic.WriteBasicUtf16(Formatted: False, SkipValidation: False, DataSize: 100000)

Method Job Toolchain Formatted SkipValidation DataSize Mean Error StdDev Median Min Max Ratio RatioSD Gen 0 Allocated Alloc Ratio
WriteBasicUtf16 Job-XSSZUU after/corerun False False 100000 2,777,998.8 ns 7,072.00 ns 6,269.14 ns 2,775,991.4 ns 2,769,001.1 ns 2,791,127.0 ns 1.00 0.00 - 129 B 1.00
WriteBasicUtf16 Job-XFHXFF before/corerun False False 100000 2,829,192.7 ns 8,179.81 ns 7,651.40 ns 2,830,676.5 ns 2,820,123.0 ns 2,844,626.1 ns 1.02 0.00 - 129 B 1.00

System.Collections.CreateAddAndClear.IDictionary

Method Job Toolchain Size Mean Error StdDev Median Min Max Ratio Gen 0 Allocated Alloc Ratio
IDictionary Job-AOAJQO after/corerun 512 29.98 us 0.085 us 0.071 us 29.97 us 29.90 us 30.11 us 1.00 15.2059 46.97 KB 1.00
IDictionary Job-NLSTVE before/corerun 512 30.09 us 0.200 us 0.167 us 30.06 us 29.93 us 30.50 us 1.00 15.2672 46.97 KB 1.00

Will look around for some more modern HW to use.

AndyAyersMS commented 2 years ago

Now using WSL2 on an Core i7-8700. I can repro the WriteBasicUtf16 slowdown:

Method Toolchain Formatted SkipValidation DataSize Mean Median Ratio
WriteBasicUtf16 after/corerun False False 100000 1.854 ms 1.853 ms 1.07
WriteBasicUtf16 before/corerun False False 100000 1.731 ms 1.727 ms 1.00

Profiling shows this should be dominated by the codegen in Utf8JsonWriter.WriteNumberValueMinimized(long) but this method has just one minimal diff on ubuntu x64:

;; BEFORE

G_M57270_IG02:              ;; offset=0019H
       488D7B38             lea      rdi, bword ptr [rbx+56]
       8B7F0C               mov      edi, dword ptr [rdi+12]

;; AFTER

G_M57270_IG02:              ;; offset=0019H
       8B7B44               mov      edi, dword ptr [rbx+68]

Nominally this diff is an improvement, however it has the unfortunate effect of mis-aligning the first two branches in this method, so that they are impacted by the JCC erratum:

;; BEFORE

G_M57270_IG02:              ;; offset=0019H
       488D7B38             lea      rdi, bword ptr [rbx+56]
       8B7F0C               mov      edi, dword ptr [rdi+12]
; ............................... 32B boundary ...............................
       2B7B2C               sub      edi, dword ptr [rbx+44]
       83FF15               cmp      edi, 21
       7D0D                 jge      SHORT G_M57270_IG04
                        ;; bbWeight=1    PerfScore 6.75
G_M57270_IG03:              ;; offset=0028H
       488BFB               mov      rdi, rbx
       BE15000000           mov      esi, 21
       E84B5AFFFF           call     Utf8JsonWriter:Grow(int):this
                        ;; bbWeight=0.50 PerfScore 0.75
G_M57270_IG04:              ;; offset=0035H
       4C8D7B38             lea      r15, bword ptr [rbx+56]
       4533E4               xor      r12d, r12d
       4533ED               xor      r13d, r13d
       498B3F               mov      rdi, gword ptr [r15]
; ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (mov: 2) 32B boundary ...............................
       4885FF               test     rdi, rdi
       743D                 je       SHORT G_M57270_IG07

;; AFTER

G_M57270_IG02:              ;; offset=0019H
       8B7B44               mov      edi, dword ptr [rbx+68]
       2B7B2C               sub      edi, dword ptr [rbx+44]
       83FF15               cmp      edi, 21
; ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (cmp: 2 ; jcc erratum) 32B boundary ...............................
       7D0D                 jge      SHORT G_M57270_IG04
                        ;; bbWeight=1    PerfScore 6.25
G_M57270_IG03:              ;; offset=0024H
       488BFB               mov      rdi, rbx
       BE15000000           mov      esi, 21
       E8475BFFFF           call     Utf8JsonWriter:Grow(int):this
                        ;; bbWeight=0.50 PerfScore 0.75
G_M57270_IG04:              ;; offset=0031H
       4C8D7B38             lea      r15, bword ptr [rbx+56]
       4533E4               xor      r12d, r12d
       4533ED               xor      r13d, r13d
       498B3F               mov      rdi, gword ptr [r15]
       4885FF               test     rdi, rdi
; ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (test: 1 ; jcc erratum) 32B boundary ...............................
       743D                 je       SHORT G_M57270_IG07
                        ;; bbWeight=1    PerfScore 4.25
G_M57270_IG05:              ;; offset=0043H

and my best guess is that this is what leads to the regression. Note the very similar code path on Windows did not regress, and I did not observe this regression (with likely very similar code; haven't verified) on older hardware.

Also note lower in this method there is a loop that we don't detect and don't align (either before or after) perhaps because it is profile cold:

;; AFTER
G_M57270_IG24:              ;; offset=01FAH
       488D4A30             lea      rcx, [rdx+48]
       49B9CDCCCCCCCCCCCCCC mov      r9, 0xCCCCCCCCCCCCCCCD
; ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (mov: 8) 32B boundary ...............................
       488BC2               mov      rax, rdx
       49F7E1               mul      rdx:rax, r9
       48C1EA03             shr      rdx, 3
       3BFE                 cmp      edi, esi
       0F8320030000         jae      G_M57270_IG59
       8BC7                 mov      eax, edi
       4C8D0C92             lea      r9, [rdx+4*rdx]
; ............................... 32B boundary ...............................
       4D03C9               add      r9, r9
       492BC9               sub      rcx, r9
       41880C00             mov      byte  ptr [r8+rax], cl
       FFCF                 dec      edi
       85FF                 test     edi, edi
       7FCA                 jg       SHORT G_M57270_IG24
                        ;; bbWeight=0    PerfScore 0.00

(4/29) this test is still showing regression newplot - 2022-04-29T123712 487

AndyAyersMS commented 2 years ago

System.Memory.Span<Byte>.IndexOfValue(Size: 512) has partially recovered some of the losses.

newplot - 2022-02-13T120517 637

I don't see any codegen diffs in the key method SpanHelpers:IndexOfValueType(byref,int,int):int. So perhaps data alignment?

(4/29) has improved a bit more.

newplot - 2022-04-29T123913 846

AndyAyersMS commented 2 years ago

I get erratic results locally for System.Memory.Span<Int32>.StartsWith -- here the key method is SpanHelpers:SequenceEqual. There are "good" diffs from forward sub that lead to shifting where the JCC errata hits.... suspect this is what leads to regressions. Not sure why performance fluctuates locally, though.

;; Assembly listing for method SpanHelpers:SequenceEqual(byref,byref,long):bool

;; BEFORE

G_M37173_IG12:              ;; offset=0082H
       C5FD100407           vmovupd  ymm0, ymmword ptr[rdi+rax]
       C5FD100C06           vmovupd  ymm1, ymmword ptr[rsi+rax]
       C5FD74C1             vpcmpeqb ymm0, ymm0, ymm1
       C5FDD7C8             vpmovmskb ecx, ymm0
       83F9FF               cmp      ecx, -1
       0F85A2000000         jne      G_M37173_IG18
       4883C020             add      rax, 32
; ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (add: 1) 32B boundary ...............................
       483BD0               cmp      rdx, rax
       77DC                 ja       SHORT G_M37173_IG12

;; AFTER

G_M37173_IG12:              ;; offset=0082H
       C5FD100407           vmovupd  ymm0, ymmword ptr[rdi+rax]
       C5FD740406           vpcmpeqb ymm0, ymm0, ymmword ptr[rsi+rax]
       C5FDD7C8             vpmovmskb ecx, ymm0
       83F9FF               cmp      ecx, -1
       0F8593000000         jne      G_M37173_IG18
       4883C020             add      rax, 32
       483BD0               cmp      rdx, rax
; ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (cmp: 0 ; jcc erratum) 32B boundary ...............................
       77E0                 ja       SHORT G_M37173_IG12

Perf lab shows some fluctuations too: newplot - 2022-02-13T160230 756

(4/29) perf is back to where it was newplot - 2022-04-29T124128 327

AndyAyersMS commented 2 years ago

I suspect System.Buffers.Text.Tests.Utf8ParserTests.TryParseDateTimeOffset(value: 12/30/2017 3:45:22 AM -08:00) does not run properly on WSL2 as the numbers I get are 15x higher than the lab (and don't show any regressions)

(update 4/29) this seems to have recovered all this missing perf newplot - 2022-04-29T123213 244

AndyAyersMS commented 2 years ago

Given the small number of residual regressions (just one on x64 that I can see) and the large number of improvements, I'm going to close this one.

Improvements: