dotnet / perf-autofiling-issues

A landing place for auto-filed performance issues before they receive triage

[Perf] Changes at 11/17/2021 1:35:01 AM #2500

Closed performanceautofiler[bot] closed 1 year ago

performanceautofiler[bot] commented 2 years ago

Run Information

Architecture x64
OS Windows 10.0.18362
Baseline 3806e15595632e4bc8e4c4fb4cde12163006eca9
Compare 0666ebc475871c27f5b9d4ee8e91922f20be46e9
Diff Diff

Regressions in System.Linq.Tests.Perf_Enumerable

| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| [WhereLast_LastElementMatches - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_Windows 10.0.18362/System.Linq.Tests.Perf_Enumerable.WhereLast_LastElementMatches(input%3a%20Array).html>) | 269.68 ns | 297.12 ns | 1.10 | 0.07 | False | | | | | |
| [Select - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_Windows 10.0.18362/System.Linq.Tests.Perf_Enumerable.Select(input%3a%20Array).html>) | 704.16 ns | 765.24 ns | 1.09 | 0.05 | False | | | | | |
| [All_AllElementsMatch - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_Windows 10.0.18362/System.Linq.Tests.Perf_Enumerable.All_AllElementsMatch(input%3a%20IEnumerable).html>) | 671.13 ns | 712.40 ns | 1.06 | 0.08 | False | | | | | |

graph graph graph Test Report

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'System.Linq.Tests.Perf_Enumerable*'

### Payloads

[Baseline]() [Compare]()

### Histogram

#### System.Linq.Tests.Perf_Enumerable.WhereLast_LastElementMatches(input: Array)

```log
```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 297.11856249024726 > 283.11194291599253.
IsChangePoint: Marked as a change because one of 9/11/2021 4:06:00 PM, 10/28/2021 4:25:04 PM, 11/8/2021 12:05:30 PM, 11/17/2021 12:07:43 AM, 11/23/2021 3:51:55 AM falls between 11/14/2021 12:07:47 PM and 11/23/2021 3:51:55 AM.
IsRegressionStdDev: Marked as regression because -153.24118214556543 (T) = (0 -298.04324004547976) / Math.Sqrt((0.4417034978114919 / (28)) + (0.6261683313043631 / (31))) is less than -2.0024654592901125 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (28) + (31) - 2, .025) and -0.10805711091134372 = (268.97822965131115 - 298.04324004547976) / 268.97822965131115 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

#### System.Linq.Tests.Perf_Enumerable.Select(input: Array)

```log
```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 765.2432394514398 > 739.4988109993811.
IsChangePoint: Marked as a change because one of 9/11/2021 4:06:00 PM, 9/17/2021 5:17:35 AM, 10/1/2021 10:57:04 AM, 10/8/2021 11:00:59 PM, 10/22/2021 2:37:38 PM, 11/11/2021 3:02:07 PM, 11/15/2021 11:16:33 AM, 11/23/2021 3:51:55 AM falls between 11/14/2021 12:07:47 PM and 11/23/2021 3:51:55 AM.
IsRegressionStdDev: Marked as regression because -81.12239954331834 (T) = (0 -764.1478345992035) / Math.Sqrt((2.7501312859548337 / (18)) + (16.686873959455866 / (41))) is less than -2.0024654592901125 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (18) + (41) - 2, .025) and -0.08628097381621065 = (703.4532068758214 - 764.1478345992035) / 703.4532068758214 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

#### System.Linq.Tests.Perf_Enumerable.All_AllElementsMatch(input: IEnumerable)

```log
```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 712.3995842046794 > 703.877737647265.
IsChangePoint: Marked as a change because one of 9/17/2021 5:17:35 AM, 10/1/2021 7:37:07 AM, 11/8/2021 12:05:30 PM, 11/15/2021 11:16:33 AM, 11/23/2021 3:51:55 AM falls between 11/14/2021 12:07:47 PM and 11/23/2021 3:51:55 AM.
IsRegressionStdDev: Marked as regression because -53.739345117492554 (T) = (0 -710.2459114858872) / Math.Sqrt((3.0473816606010766 / (18)) + (16.250029050255623 / (41))) is less than -2.0024654592901125 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (18) + (41) - 2, .025) and -0.06033911104903646 = (669.8290236443435 - 710.2459114858872) / 669.8290236443435 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

### Docs

[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)
[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)
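
The IsRegressionStdDev line can be checked by hand. The sketch below recomputes the statistic for WhereLast_LastElementMatches from the numbers in the log, assuming the two values divided by the sample counts are variances and that the check is a two-sample t statistic with unpooled variances. The log prints the baseline mean as `0` in the T formula, but the reported T value is only reproduced when the baseline mean from the relative-change term (268.978...) is used, so that is what the sketch does. The function names are illustrative and not taken from the autofiler; the critical value is copied from the log rather than recomputed.

```python
import math


def t_statistic(mean_base, var_base, n_base, mean_cmp, var_cmp, n_cmp):
    """Two-sample t statistic with unpooled variances (assumed form of the check)."""
    return (mean_base - mean_cmp) / math.sqrt(var_base / n_base + var_cmp / n_cmp)


def relative_change(mean_base, mean_cmp):
    """Signed change relative to the baseline; negative means the compare is slower."""
    return (mean_base - mean_cmp) / mean_base


def is_regression_std_dev(t, t_critical, rel, rel_threshold=-0.05):
    """Both conditions from the log line must hold."""
    return t < t_critical and rel < rel_threshold


# Values copied from the WhereLast_LastElementMatches log entry.
base_mean, base_var, base_n = 268.97822965131115, 0.4417034978114919, 28
cmp_mean, cmp_var, cmp_n = 298.04324004547976, 0.6261683313043631, 31
t_critical = -2.0024654592901125  # StudentT.InvCDF(0, 1, 28 + 31 - 2, .025) per the log

t = t_statistic(base_mean, base_var, base_n, cmp_mean, cmp_var, cmp_n)
rel = relative_change(base_mean, cmp_mean)
flagged = is_regression_std_dev(t, t_critical, rel)

print(f"T = {t:.4f}, relative change = {rel:.4f}, regression = {flagged}")
```

A T value this far below the critical value mostly reflects the very small variances in both windows, so it says the shift is statistically distinguishable, not that it is large; the relative-change term (about 10.8%) carries the magnitude.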

Run Information

Architecture x64
OS Windows 10.0.18362
Baseline 3806e15595632e4bc8e4c4fb4cde12163006eca9
Compare 0666ebc475871c27f5b9d4ee8e91922f20be46e9
Diff Diff

Regressions in PerfLabTests.LowLevelPerf

| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| [InterfaceInterfaceMethod - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_Windows 10.0.18362/PerfLabTests.LowLevelPerf.InterfaceInterfaceMethod.html>) | 3.74 ms | 4.15 ms | 1.11 | 0.02 | False | | | | | |

graph Test Report

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'PerfLabTests.LowLevelPerf*'

### Payloads

[Baseline]() [Compare]()

### Histogram

#### PerfLabTests.LowLevelPerf.InterfaceInterfaceMethod

```log
```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 4.147151897321429 > 3.92891876953125.
IsChangePoint: Marked as a change because one of 9/17/2021 5:17:35 AM, 9/18/2021 12:26:04 AM, 11/17/2021 12:07:43 AM, 11/23/2021 3:51:55 AM falls between 11/14/2021 12:07:47 PM and 11/23/2021 3:51:55 AM.
IsRegressionStdDev: Marked as regression because -23.114924076761902 (T) = (0 -4138476.9595695967) / Math.Sqrt((6863783945.15853 / (28)) + (48294636.44633618 / (25))) is less than -2.007583770314729 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (28) + (25) - 2, .025) and -0.09624225105015419 = (3775148.198863079 - 4138476.9595695967) / 3775148.198863079 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

### Docs

[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)
[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)
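
The IsChangePoint line reduces to a date-window test: the benchmark is marked as changed if any detected change point falls inside the comparison window. A minimal sketch for InterfaceInterfaceMethod follows, assuming US month/day ordering for the timestamps and inclusive window bounds (the log does not say whether the bounds are inclusive). The helper name is hypothetical and not part of the autofiler.

```python
from datetime import datetime

# Timestamp layout as printed in the log, e.g. "11/17/2021 12:07:43 AM".
LOG_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def change_points_in_window(points, window_start, window_end):
    """Return the change points that fall inside [window_start, window_end]."""
    start = datetime.strptime(window_start, LOG_FORMAT)
    end = datetime.strptime(window_end, LOG_FORMAT)
    return [p for p in points if start <= datetime.strptime(p, LOG_FORMAT) <= end]


# Change points and window copied from the InterfaceInterfaceMethod log entry.
points = [
    "9/17/2021 5:17:35 AM",
    "9/18/2021 12:26:04 AM",
    "11/17/2021 12:07:43 AM",
    "11/23/2021 3:51:55 AM",
]
hits = change_points_in_window(points, "11/14/2021 12:07:47 PM", "11/23/2021 3:51:55 AM")

print("change points inside the window:", hits)
print("marked as a change:", bool(hits))
```

The 11/17/2021 point lies strictly inside the window, so the benchmark is marked as changed regardless of how the bounds are treated; only the 11/23/2021 point, which coincides with the window end, depends on the inclusive-bounds assumption.
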
AndyAyersMS commented 2 years ago

WhereLast_LastElementMatches is noisy but seemingly shifted up a bit on 10/15, possibly from: https://github.com/dotnet/runtime/pull/59602.

Not enough evidence there to warrant transfer, but cc @jakobbotsch in case you want to see if you can spot anything.

The others look like noise.

AndyAyersMS commented 2 years ago

Regression in WhereLast_LastElementMatches still there: [chart: newplot - 2022-07-19T110646 492]

Still not clear what the root cause might be. The recent "stable" regime seems like it might be related to https://github.com/dotnet/runtime/pull/65738.

AndyAyersMS commented 1 year ago

Not sure where I got that chart above from, but latest results show we've never gotten back to the 260-270 ns levels we were at long ago.

[chart: newplot - 2023-04-01T094027 590]

performanceautofiler[bot] commented 1 year ago

Closing because it's stale.