dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Regressions in System.Text.RegularExpressions.Tests.Perf_Regex_Industry_* #75549

Open performanceautofiler[bot] opened 2 years ago

performanceautofiler[bot] commented 2 years ago

Run Information

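Architecture | x64
-- | --
OS | Windows 10.0.18362
Baseline | [33704a55be63d87a2048ba6fcd047d4296a39e8e](https://github.com/dotnet/runtime/commit/33704a55be63d87a2048ba6fcd047d4296a39e8e)
Compare | [1f6ebd011d5be79cc545d3358774d465d91ab9b4](https://github.com/dotnet/runtime/commit/1f6ebd011d5be79cc545d3358774d465d91ab9b4)
Diff | [Diff](https://github.com/dotnet/runtime/compare/33704a55be63d87a2048ba6fcd047d4296a39e8e...1f6ebd011d5be79cc545d3358774d465d91ab9b4)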

Regressions in System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock

Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
[Count - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_Windows 10.0.18362/System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern%3a%20%22%5ba-zA-Z%5d%2bing%22%2c%20Options%3a%20NonBacktracking).html>) | 5.57 ms | 5.93 ms | 1.06 | 0.03 | False | | |
[Count - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_Windows 10.0.18362/System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern%3a%20%22%5cw%2b%5cs%2bHolmes%22%2c%20Options%3a%20NonBacktracking).html>) | 2.62 ms | 2.87 ms | 1.09 | 0.11 | False | | |


Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock*'
### Histogram

#### System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "[a-zA-Z]+ing", Options: NonBacktracking)

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 5.9323080952380955 > 5.886711663636364.
IsChangePoint: Marked as a change because one of 7/15/2022 5:00:56 PM, 9/8/2022 6:17:28 PM, 9/13/2022 2:30:06 AM falls between 9/4/2022 8:45:57 AM and 9/13/2022 2:30:06 AM.
IsRegressionStdDev: Marked as regression because -19.274573056886705 (T) = (0 -5923687.616698998) / Math.Sqrt((4388914697.712178 / (39)) + (2144800438.9361887 / (19))) is less than -2.0032407188469383 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (39) + (19) - 2, .025) and -0.05136196025462187 = (5634298.976600202 - 5923687.616698998) / 5634298.976600202 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

#### System.Text.RegularExpressions.Tests.Perf_Regex_Industry_RustLang_Sherlock.Count(Pattern: "\w+\s+Holmes", Options: NonBacktracking)

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 2.8697342911877395 > 2.7478558045504387.
IsChangePoint: Marked as a change because one of 7/15/2022 5:00:56 PM, 9/8/2022 6:17:28 PM, 9/13/2022 2:30:06 AM falls between 9/4/2022 8:45:57 AM and 9/13/2022 2:30:06 AM.
IsRegressionStdDev: Marked as regression because -11.688071817421484 (T) = (0 -2858463.4561130386) / Math.Sqrt((8931490306.078787 / (38)) + (1186410187.4800456 / (19))) is less than -2.0040447832881556 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (38) + (19) - 2, .025) and -0.07587567350346362 = (2656871.538701852 - 2858463.4561130386) / 2656871.538701852 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

### Docs

[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)

[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)
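The `IsRegressionStdDev` check reported for the `[a-zA-Z]+ing` benchmark can be reproduced from the summary numbers in its log line. The following is a minimal sketch, not the autofiler's actual implementation: the variable and function names are made up for illustration, and the assignment of each variance and sample count to baseline or compare is an assumption (the statistic is the same either way). The log prints the numerator as `(0 -5923687.616698998)`, but the reported T value only comes out when the numerator is the baseline mean minus the compare mean, so that is what the sketch uses.

```python
import math

# Summary statistics copied from the IsRegressionStdDev log line.
# Names are illustrative; which variance/count belongs to which side is assumed.
baseline_mean = 5634298.976600202
compare_mean = 5923687.616698998
baseline_var, baseline_n = 4388914697.712178, 39
compare_var, compare_n = 2144800438.9361887, 19


def welch_t(mean_a, var_a, n_a, mean_b, var_b, n_b):
    """Unequal-variance t statistic for the difference mean_a - mean_b."""
    return (mean_a - mean_b) / math.sqrt(var_a / n_a + var_b / n_b)


def relative_change(base, compare):
    """Signed change (base - compare) / base; negative means compare is slower."""
    return (base - compare) / base


t_stat = welch_t(baseline_mean, baseline_var, baseline_n,
                 compare_mean, compare_var, compare_n)
rel = relative_change(baseline_mean, compare_mean)

# Critical value exactly as printed in the log
# (StudentT.InvCDF with 39 + 19 - 2 degrees of freedom at 0.025).
T_CRITICAL = -2.0032407188469383

# Both conditions from the log line must hold for the check to flag a regression.
is_regression = t_stat < T_CRITICAL and rel < -0.05

print(f"t = {t_stat:.6f}, relative change = {rel:.6f}, regression = {is_regression}")
```

The same two functions can be applied to the numbers in the other `IsRegressionStdDev` lines to check their reported T values and relative changes.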

Run Information

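Architecture | x64
-- | --
OS | Windows 10.0.18362
Baseline | [33704a55be63d87a2048ba6fcd047d4296a39e8e](https://github.com/dotnet/runtime/commit/33704a55be63d87a2048ba6fcd047d4296a39e8e)
Compare | [1f6ebd011d5be79cc545d3358774d465d91ab9b4](https://github.com/dotnet/runtime/commit/1f6ebd011d5be79cc545d3358774d465d91ab9b4)
Diff | [Diff](https://github.com/dotnet/runtime/compare/33704a55be63d87a2048ba6fcd047d4296a39e8e...1f6ebd011d5be79cc545d3358774d465d91ab9b4)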

Regressions in System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig

Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
[Count - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_Windows 10.0.18362/System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern%3a%20%22.%7b0%2c2%7d(Tom%7cSawyer%7cHuckleberry%7cFinn)%22%2c%20Options%3a%20NonBacktracking).html>) | 63.91 ms | 69.18 ms | 1.08 | 0.07 | False | | |


Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig*'
### Histogram

#### System.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern: ".{0,2}(Tom|Sawyer|Huckleberry|Finn)", Options: NonBacktracking)

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 69.18343666666667 > 67.36555787500001.
IsChangePoint: Marked as a change because one of 7/15/2022 5:00:56 PM, 9/8/2022 6:17:28 PM, 9/13/2022 2:30:06 AM falls between 9/4/2022 8:45:57 AM and 9/13/2022 2:30:06 AM.
IsRegressionStdDev: Marked as regression because -7.405889112021537 (T) = (0 -71358972.79352225) / Math.Sqrt((4274791567200.788 / (38)) + (14893971995314.395 / (19))) is less than -2.0040447832881556 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (38) + (19) - 2, .025) and -0.1089670575853069 = (64347243.054181516 - 71358972.79352225) / 64347243.054181516 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

### Docs

[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)

[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)
ghost commented 2 years ago

Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions See info in area-owners.md if you want to be subscribed.

Issue Details
Author: performanceautofiler[bot]
Assignees: EgorBo
Labels: `area-System.Text.RegularExpressions`
Milestone: -
EgorBo commented 2 years ago

Regressed from https://github.com/dotnet/runtime/pull/74525 cc @stephentoub @olsaarik

stephentoub commented 2 years ago

@olsaarik, @joperezr, is there anything we can do about this or were we just enjoying false benefits due to the bug that this fixed?

joperezr commented 2 years ago

I can take a closer look, but it does look like this (small) regression was just false benefits from not running the match all the way through.

DrewScoggins commented 1 year ago

Also seeing a regression in this test.

https://pvscmdupload.blob.core.windows.net/reports/allTestHistory%2frefs%2fheads%2fmain_x86_Windows%2010.0.18362%2fSystem.Text.RegularExpressions.Tests.Perf_Regex_Industry_Leipzig.Count(Pattern%3a%20%22.%7b2%2c4%7d(Tom%7cSawyer%7cHuckleberry%7cFinn)%22%2c%20Options%3a%20NonBacktracking).html
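The report link above is percent-encoded, which hides which benchmark case it refers to. As a small sketch using only the Python standard library (the variable names are illustrative), decoding it shows that it points at the x86 configuration and the `.{2,4}(Tom|Sawyer|Huckleberry|Finn)` pattern, whereas the table in the original report lists the x64 configuration and the `.{0,2}(Tom|Sawyer|Huckleberry|Finn)` pattern.

```python
from urllib.parse import unquote

# Report URL copied verbatim from the comment above, split only for line length.
url = (
    "https://pvscmdupload.blob.core.windows.net/reports/allTestHistory%2frefs%2fheads%2f"
    "main_x86_Windows%2010.0.18362%2fSystem.Text.RegularExpressions.Tests."
    "Perf_Regex_Industry_Leipzig.Count(Pattern%3a%20%22.%7b2%2c4%7d"
    "(Tom%7cSawyer%7cHuckleberry%7cFinn)%22%2c%20Options%3a%20NonBacktracking).html"
)

# unquote turns %2f into "/", %20 into a space, %7b/%7d into braces, and so on.
decoded = unquote(url)

# After decoding, the last path segment names the benchmark case and the
# segment before it names the branch/architecture/OS configuration.
configuration, benchmark = decoded.split("/")[-2:]

print(configuration)
print(benchmark)
```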