dotnet / perf-autofiling-issues

A landing place for auto-filed performance issues before they receive triage

[Perf] Linux/x64: 38 Regressions on 4/9/2023 11:24:15 AM #15699

Open performanceautofiler[bot] opened 1 year ago

performanceautofiler[bot] commented 1 year ago

Run Information

| Name | Value |
| --- | --- |
| Architecture | x64 |
| OS | ubuntu 18.04 |
| Queue | TigerUbuntu |
| Baseline | 1411364699b5784040e86f76cb3db8200f6a2c8c |
| Compare | 86b48d7c6f081c12dcc9c048fb53de1b78c9966f |
| Diff | Diff |
| Configs | AOT:true, CompilationMode:wasm, RunKind:micro |

Regressions in System.Text.Json.Serialization.Tests.WriteJson<MyEventsListerViewModel>

| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| [SerializeToUtf8Bytes - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Text.Json.Serialization.Tests.WriteJson(MyEventsListerViewModel).SerializeToUtf8Bytes(Mode%3a%20Reflection).html>) | 1.44 ms | 1.73 ms | 1.20 | 0.06 | False | | | | | |
| [SerializeToWriter - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Text.Json.Serialization.Tests.WriteJson(MyEventsListerViewModel).SerializeToWriter(Mode%3a%20SourceGen).html>) | 854.43 μs | 1.10 ms | 1.29 | 0.11 | False | | | | | |
| [SerializeToString - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Text.Json.Serialization.Tests.WriteJson(MyEventsListerViewModel).SerializeToString(Mode%3a%20SourceGen).html>) | 928.36 μs | 1.20 ms | 1.30 | 0.12 | False | | | | | |
| [SerializeObjectProperty - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Text.Json.Serialization.Tests.WriteJson(MyEventsListerViewModel).SerializeObjectProperty(Mode%3a%20SourceGen).html>) | 917.53 μs | 1.20 ms | 1.30 | 0.10 | False | | | | | |
| [SerializeObjectProperty - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Text.Json.Serialization.Tests.WriteJson(MyEventsListerViewModel).SerializeObjectProperty(Mode%3a%20Reflection).html>) | 1.47 ms | 1.79 ms | 1.21 | 0.09 | False | | | | | |
| [SerializeToString - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Text.Json.Serialization.Tests.WriteJson(MyEventsListerViewModel).SerializeToString(Mode%3a%20Reflection).html>) | 1.44 ms | 1.82 ms | 1.27 | 0.05 | False | | | | | |
| [SerializeToWriter - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Text.Json.Serialization.Tests.WriteJson(MyEventsListerViewModel).SerializeToWriter(Mode%3a%20Reflection).html>) | 1.41 ms | 1.71 ms | 1.21 | 0.08 | False | | | | | |
| [SerializeToUtf8Bytes - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Text.Json.Serialization.Tests.WriteJson(MyEventsListerViewModel).SerializeToUtf8Bytes(Mode%3a%20SourceGen).html>) | 889.05 μs | 1.15 ms | 1.29 | 0.08 | False | | | | | |
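
The Test/Base column is the test (compare) duration divided by the baseline duration, and the detection logic quoted further down in this report flags a benchmark when the compare is more than 5% above the baseline. A minimal Python sketch of that arithmetic follows, using two rows from the table above. The helper names (`to_nanoseconds`, `duration_ratio`) and the unit table are illustrative assumptions, not part of the autofiler. Ratios recomputed from the rounded table values can differ from the reported ones in the last digit.

```python
# Hypothetical helpers for illustration; the autofiler's own code is not shown in this report.
UNIT_TO_NS = {"ns": 1.0, "us": 1_000.0, "μs": 1_000.0, "ms": 1_000_000.0, "s": 1_000_000_000.0}

# "5% greater than the baseline", taken from the detection logic text in this report.
REGRESSION_THRESHOLD = 1.05


def to_nanoseconds(text: str) -> float:
    """Parse a duration such as '1.44 ms' into nanoseconds."""
    value, unit = text.split()
    return float(value) * UNIT_TO_NS[unit]


def duration_ratio(baseline: str, test: str) -> float:
    """Return test/baseline after normalising both durations to nanoseconds."""
    return to_nanoseconds(test) / to_nanoseconds(baseline)


# (baseline, test) pairs copied from the table; the units differ in the second row.
rows = {
    "SerializeToUtf8Bytes (Reflection)": ("1.44 ms", "1.73 ms"),
    "SerializeToWriter (SourceGen)": ("854.43 us", "1.10 ms"),
}

for name, (baseline, test) in rows.items():
    ratio = duration_ratio(baseline, test)
    print(f"{name}: ratio={ratio:.2f} above_threshold={ratio > REGRESSION_THRESHOLD}")
```

Normalising units first matters here because the table mixes microseconds and milliseconds within a single row.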


Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

Payloads

Baseline Compare

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net8.0 --filter 'System.Text.Json.Serialization.Tests.WriteJson<MyEventsListerViewModel>*'
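
The filter contains angle brackets and `*`, which most shells treat specially, so quoting matters when running the repro. One way to sidestep shell quoting is to pass the arguments as a list from Python. The sketch below only builds and prints the command. `build_repro_command` is a hypothetical helper, it uses only the flags shown in this report, and the script path assumes the repository was cloned into the current directory as in the `git clone` line above.

```python
import shlex
import sys
from pathlib import Path


def build_repro_command(benchmark_filter: str, framework: str = "net8.0") -> list[str]:
    """Build the argv list for benchmarks_ci.py using only the flags shown in the report."""
    script = Path("performance") / "scripts" / "benchmarks_ci.py"
    return [sys.executable, str(script), "-f", framework, "--filter", benchmark_filter]


command = build_repro_command(
    "System.Text.Json.Serialization.Tests.WriteJson<MyEventsListerViewModel>*"
)

# shlex.join renders the same arguments with POSIX shell quoting applied.
print(shlex.join(command))

# To actually run it (not done here): subprocess.run(command, check=True)
```

Passing a list means the filter reaches the script unchanged, regardless of which shell started Python.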
### Payloads

[Baseline]()
[Compare]()

### Histogram

#### System.Text.Json.Serialization.Tests.WriteJson<MyEventsListerViewModel>.SerializeToUtf8Bytes(Mode: Reflection)

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 1.7334409737103176 > 1.5106449220028406.
IsChangePoint: Marked as a change because one of 3/10/2023 8:06:53 PM, 3/22/2023 10:50:22 PM, 3/23/2023 11:51:53 PM, 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -19.30336757285662 (T) = (0 -1760532.8299248866) / Math.Sqrt((764249190.579451 / (43)) + (1911901496.1484249 / (7))) is less than -2.010634757623041 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (43) + (7) - 2, .025) and -0.23002579589065036 = (1431297.4864483236 - 1760532.8299248866) / 1431297.4864483236 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

#### System.Text.Json.Serialization.Tests.WriteJson<MyEventsListerViewModel>.SerializeToWriter(Mode: SourceGen)

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 1.1024863930555557 > 901.3729586266448.
IsChangePoint: Marked as a change because one of 3/10/2023 11:41:11 AM, 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -24.259682282447997 (T) = (0 -1123007.4771714562) / Math.Sqrt((418743005.78988874 / (43)) + (628950525.1830385 / (6))) is less than -2.011740513728388 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (43) + (6) - 2, .025) and -0.3007620807913441 = (863345.7984016974 - 1123007.4771714562) / 863345.7984016974 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

#### System.Text.Json.Serialization.Tests.WriteJson<MyEventsListerViewModel>.SerializeToString(Mode: SourceGen)

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 1.2038464321428572 > 965.5757336914063.
IsChangePoint: Marked as a change because one of 2/13/2023 6:54:14 PM, 3/10/2023 8:06:53 PM, 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -45.77574660563786 (T) = (0 -1182657.0629807692) / Math.Sqrt((439787795.285117 / (42)) + (154765105.23673412 / (7))) is less than -2.011740513728388 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (42) + (7) - 2, .025) and -0.2835820233641032 = (921372.4105306315 - 1182657.0629807692) / 921372.4105306315 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

#### System.Text.Json.Serialization.Tests.WriteJson<MyEventsListerViewModel>.SerializeObjectProperty(Mode: SourceGen)

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 1.1962798625 > 960.3113865820311.
IsChangePoint: Marked as a change because one of 2/12/2023 11:53:55 PM, 3/10/2023 4:14:40 PM, 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -33.13704484439174 (T) = (0 -1187492.0079490452) / Math.Sqrt((307815688.6058221 / (43)) + (395338337.9171594 / (7))) is less than -2.010634757623041 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (43) + (7) - 2, .025) and -0.2863453064862063 = (923151.8177594243 - 1187492.0079490452) / 923151.8177594243 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

#### System.Text.Json.Serialization.Tests.WriteJson<MyEventsListerViewModel>.SerializeObjectProperty(Mode: Reflection)

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 1.7883439810185184 > 1.554131670706731.
IsChangePoint: Marked as a change because one of 3/10/2023 4:14:40 PM, 3/22/2023 10:50:22 PM, 3/23/2023 11:51:53 PM, 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -16.830382284852913 (T) = (0 -1763616.5070400883) / Math.Sqrt((381766522.5330156 / (42)) + (2052467067.4214847 / (7))) is less than -2.011740513728388 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (42) + (7) - 2, .025) and -0.1989310502871279 = (1470990.7684997614 - 1763616.5070400883) / 1470990.7684997614 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

#### System.Text.Json.Serialization.Tests.WriteJson<MyEventsListerViewModel>.SerializeToString(Mode: Reflection)

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 1.822725446986607 > 1.548973780778846.
IsChangePoint: Marked as a change because one of 3/10/2023 8:06:53 PM, 3/22/2023 10:50:22 PM, 3/23/2023 11:51:53 PM, 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -15.867382020776489 (T) = (0 -1796059.0779696947) / Math.Sqrt((622096772.6900558 / (42)) + (2879577254.4005203 / (7))) is less than -2.011740513728388 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (42) + (7) - 2, .025) and -0.2230645613881314 = (1468490.817795617 - 1796059.0779696947) / 1468490.817795617 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

#### System.Text.Json.Serialization.Tests.WriteJson<MyEventsListerViewModel>.SerializeToWriter(Mode: Reflection)

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 1.7108328703703704 > 1.4829773012784087.
IsChangePoint: Marked as a change because one of 3/10/2023 8:06:53 PM, 3/22/2023 10:50:22 PM, 3/23/2023 11:51:53 PM, 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -31.906590112677556 (T) = (0 -1684949.1084665533) / Math.Sqrt((1117112246.5810547 / (42)) + (359277974.52884793 / (7))) is less than -2.011740513728388 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (42) + (7) - 2, .025) and -0.20070805174327352 = (1403296.2517576392 - 1684949.1084665533) / 1403296.2517576392 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

#### System.Text.Json.Serialization.Tests.WriteJson<MyEventsListerViewModel>.SerializeToUtf8Bytes(Mode: SourceGen)

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 1.148270192628205 > 935.54041484375.
IsChangePoint: Marked as a change because one of 2/16/2023 3:28:45 AM, 3/10/2023 4:14:40 PM, 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -55.43342176315234 (T) = (0 -1144866.63070382) / Math.Sqrt((498266766.82401186 / (43)) + (62833427.953192644 / (7))) is less than -2.010634757623041 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (43) + (7) - 2, .025) and -0.281341170824307 = (893490.8647064772 - 1144866.63070382) / 893490.8647064772 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

### Docs

[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)
[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)
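
Each `IsRegressionStdDev` entry above combines a two-sample t statistic with a relative-change check. The Python sketch below recomputes both for `SerializeToUtf8Bytes(Mode: Reflection)` from the numbers in its entry. Two points are inferred from the arithmetic rather than from the tool's source. First, the printed numerator reads `0 -` followed by the compare mean, yet the reported T only reproduces when the numerator is the baseline mean minus the compare mean. Second, the values divided by the sample counts are treated as variances. The critical value is copied from the log rather than recomputed, since Python's standard library has no Student-t inverse CDF, and the function names are hypothetical.

```python
import math


def t_statistic(base_mean, base_var, base_n, cmp_mean, cmp_var, cmp_n):
    """Two-sample t statistic with unpooled variances (assumed form, see the text above)."""
    standard_error = math.sqrt(base_var / base_n + cmp_var / cmp_n)
    return (base_mean - cmp_mean) / standard_error


def relative_change(base_mean, cmp_mean):
    """(baseline - compare) / baseline; negative when the compare is slower."""
    return (base_mean - cmp_mean) / base_mean


# Values copied from the SerializeToUtf8Bytes(Mode: Reflection) entry.
# They appear to be nanoseconds, given the 1.44 ms baseline in the table.
base_mean, base_var, base_n = 1431297.4864483236, 764249190.579451, 43
cmp_mean, cmp_var, cmp_n = 1760532.8299248866, 1911901496.1484249, 7

# Logged result of StudentT.InvCDF(0, 1, (43) + (7) - 2, .025); not recomputed here.
critical_value = -2.010634757623041

t = t_statistic(base_mean, base_var, base_n, cmp_mean, cmp_var, cmp_n)
change = relative_change(base_mean, cmp_mean)

# Both conditions from the log entry must hold for the regression verdict.
is_regression = t < critical_value and change < -0.05
print(f"T={t:.4f} change={change:.4f} regression={is_regression}")
```

The same function can be pointed at any other entry by swapping in its means, variances, sample counts, and logged critical value.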

Regressions in System.Tests.Perf_DateTime

| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| [ToString - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Tests.Perf_DateTime.ToString(format%3a%20null).html>) | 360.58 ns | 599.99 ns | 1.66 | 0.16 | True | | | | | |
| [ToString - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Tests.Perf_DateTime.ToString(format%3a%20%22o%22).html>) | 130.53 ns | 185.18 ns | 1.42 | 0.25 | False | | | | | |
| [ToString - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Tests.Perf_DateTime.ToString(format%3a%20%22s%22).html>) | 370.78 ns | 610.66 ns | 1.65 | 0.13 | True | | | | | |
| [ToString - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Tests.Perf_DateTime.ToString(format%3a%20%22G%22).html>) | 358.66 ns | 569.28 ns | 1.59 | 0.15 | True | | | | | |


Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

Payloads

Baseline Compare

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net8.0 --filter 'System.Tests.Perf_DateTime*'
### Payloads

[Baseline]()
[Compare]()

### Histogram

#### System.Tests.Perf_DateTime.ToString(format: null)

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 599.9888331743136 > 377.39721273111485.
IsChangePoint: Marked as a change because one of 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -24.322655055724343 (T) = (0 -601.8257350283385) / Math.Sqrt((151.86787165916476 / (43)) + (668.1806228612865 / (7))) is less than -2.010634757623041 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (43) + (7) - 2, .025) and -0.6725035176669795 = (359.8352581451318 - 601.8257350283385) / 359.8352581451318 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked as regression because Edge Detector said so.
```

#### System.Tests.Perf_DateTime.ToString(format: "o")

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 185.17668575963293 > 136.5719478509588.
IsChangePoint: Marked as a change because one of 3/10/2023 8:06:53 PM, 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -26.270893599250268 (T) = (0 -178.7206589105019) / Math.Sqrt((26.735841185961704 / (43)) + (21.936082818519164 / (7))) is less than -2.010634757623041 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (43) + (7) - 2, .025) and -0.3983300571957398 = (127.81006743780834 - 178.7206589105019) / 127.81006743780834 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

#### System.Tests.Perf_DateTime.ToString(format: "s")

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 610.6603128857563 > 389.66698580220293.
IsChangePoint: Marked as a change because one of 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -18.132431961749482 (T) = (0 -561.851156518624) / Math.Sqrt((150.855263664341 / (42)) + (702.6031308587686 / (7))) is less than -2.011740513728388 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (42) + (7) - 2, .025) and -0.49044704835091574 = (376.96821040390284 - 561.851156518624) / 376.96821040390284 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked as regression because Edge Detector said so.
```

#### System.Tests.Perf_DateTime.ToString(format: "G")

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 569.2818855012036 > 363.7887397283544.
IsChangePoint: Marked as a change because one of 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -24.932149334810706 (T) = (0 -601.256363345744) / Math.Sqrt((123.23113972028878 / (43)) + (659.2188534817575 / (7))) is less than -2.010634757623041 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (43) + (7) - 2, .025) and -0.6905721339501865 = (355.65259314954517 - 601.256363345744) / 355.65259314954517 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked as regression because Edge Detector said so.
```

### Docs

[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)
[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)
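
The `IsChangePoint` entries above report a change when at least one detected change-point timestamp falls inside a time window. A small Python sketch of that check follows, using the timestamps from the `ToString(format: null)` entry. The function names, the inclusive window bounds, and the date format string are assumptions made for illustration, and parsing `AM`/`PM` relies on an English-style locale.

```python
from datetime import datetime

# Matches timestamps as printed in the report, e.g. "4/9/2023 4:09:45 AM".
TIMESTAMP_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def parse_timestamp(text: str) -> datetime:
    """Parse one report timestamp into a naive datetime."""
    return datetime.strptime(text, TIMESTAMP_FORMAT)


def has_change_point_in_window(change_points, window_start, window_end) -> bool:
    """True if any change point lies within [window_start, window_end] (inclusive, assumed)."""
    start = parse_timestamp(window_start)
    end = parse_timestamp(window_end)
    return any(start <= parse_timestamp(point) <= end for point in change_points)


# Values copied from the ToString(format: null) entry.
change_points = ["4/9/2023 4:09:45 AM", "4/11/2023 1:13:25 AM"]
window = ("3/30/2023 4:24:24 AM", "4/11/2023 1:13:25 AM")

print(has_change_point_in_window(change_points, *window))
```

A change point dated before the window, such as the 3/10/2023 entries listed for other benchmarks, does not satisfy the check on its own.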

Regressions in System.Text.Json.Tests.Perf_DateTimes

| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| [WriteDateTimes - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Text.Json.Tests.Perf_DateTimes.WriteDateTimes(Formatted%3a%20True%2c%20SkipValidation%3a%20False).html>) | 8.46 ms | 15.49 ms | 1.83 | 0.09 | True | | | | | |
| [WriteDateTimes - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Text.Json.Tests.Perf_DateTimes.WriteDateTimes(Formatted%3a%20False%2c%20SkipValidation%3a%20True).html>) | 7.36 ms | 15.01 ms | 2.04 | 0.11 | True | | | | | |
| [WriteDateTimes - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Text.Json.Tests.Perf_DateTimes.WriteDateTimes(Formatted%3a%20True%2c%20SkipValidation%3a%20True).html>) | 8.37 ms | 15.63 ms | 1.87 | 0.06 | True | | | | | |
| [WriteDateTimes - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Text.Json.Tests.Perf_DateTimes.WriteDateTimes(Formatted%3a%20False%2c%20SkipValidation%3a%20False).html>) | 7.27 ms | 15.21 ms | 2.09 | 0.11 | True | | | | | |


Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

Payloads

Baseline Compare

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net8.0 --filter 'System.Text.Json.Tests.Perf_DateTimes*'
### Payloads

[Baseline]()
[Compare]()

### Histogram

#### System.Text.Json.Tests.Perf_DateTimes.WriteDateTimes(Formatted: True, SkipValidation: False)

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 15.494484431111115 > 8.8966080478125.
IsChangePoint: Marked as a change because one of 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -18.580510834604535 (T) = (0 -16015703.269991279) / Math.Sqrt((101391066844.01012 / (42)) + (1096545253234.7323 / (7))) is less than -2.011740513728388 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (42) + (7) - 2, .025) and -0.8611475292152594 = (8605284.115625264 - 16015703.269991279) / 8605284.115625264 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked as regression because Edge Detector said so.
```

#### System.Text.Json.Tests.Perf_DateTimes.WriteDateTimes(Formatted: False, SkipValidation: True)

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 15.006017768888888 > 7.5158398423076935.
IsChangePoint: Marked as a change because one of 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -66.5256048402972 (T) = (0 -15111659.094716337) / Math.Sqrt((61407129919.42048 / (42)) + (91044348727.34425 / (7))) is less than -2.011740513728388 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (42) + (7) - 2, .025) and -1.1255148638444077 = (7109646.397571625 - 15111659.094716337) / 7109646.397571625 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked as regression because Edge Detector said so.
```

#### System.Text.Json.Tests.Perf_DateTimes.WriteDateTimes(Formatted: True, SkipValidation: True)

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 15.629066666666668 > 8.826791964843752.
IsChangePoint: Marked as a change because one of 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -25.617238810746446 (T) = (0 -16247207.360770974) / Math.Sqrt((14463057281.009233 / (43)) + (652324797797.6385 / (7))) is less than -2.010634757623041 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (43) + (7) - 2, .025) and -0.9312130303363065 = (8412954.503492368 - 16247207.360770974) / 8412954.503492368 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked as regression because Edge Detector said so.
```

#### System.Text.Json.Tests.Perf_DateTimes.WriteDateTimes(Formatted: False, SkipValidation: False)

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 15.213145820833333 > 7.70143438125.
IsChangePoint: Marked as a change because one of 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -34.920054594525325 (T) = (0 -15340341.68857448) / Math.Sqrt((37287439335.00616 / (42)) + (371548600378.466 / (7))) is less than -2.011740513728388 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (42) + (7) - 2, .025) and -1.1222916700892094 = (7228196.719977541 - 15340341.68857448) / 7228196.719977541 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked as regression because Edge Detector said so.
```

### Docs

[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)
[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)
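The `IsRegressionStdDev` lines above can be checked by hand. The following is a minimal sketch, not the autofiler's actual code: it assumes the logged `(T)` is a two-sample t statistic of the form (baseline mean - compare mean) / sqrt(var_b/n_b + var_c/n_c). The log prints `0` where the baseline mean would appear, but the logged T only reproduces when the baseline mean from the relative-change expression is used, so that substitution is an inference from the numbers. The function name `two_sample_t` is hypothetical, and the critical value is copied from the log rather than recomputed.

```python
import math


def two_sample_t(baseline_mean, baseline_var, baseline_n,
                 compare_mean, compare_var, compare_n):
    """t statistic for the difference of two sample means (unpooled variances)."""
    standard_error = math.sqrt(baseline_var / baseline_n + compare_var / compare_n)
    return (baseline_mean - compare_mean) / standard_error


# Values copied from the IsRegressionStdDev line for
# WriteDateTimes(Formatted: True, SkipValidation: False).
baseline_mean, baseline_var, baseline_n = 8605284.115625264, 101391066844.01012, 42
compare_mean, compare_var, compare_n = 16015703.269991279, 1096545253234.7323, 7

# Logged result of StudentT.InvCDF(0, 1, (42) + (7) - 2, .025); taken as given.
critical_value = -2.011740513728388

t_stat = two_sample_t(baseline_mean, baseline_var, baseline_n,
                      compare_mean, compare_var, compare_n)
relative_change = (baseline_mean - compare_mean) / baseline_mean

# Both conditions named in the log line must hold.
is_regression = t_stat < critical_value and relative_change < -0.05
print(f"T = {t_stat:.6f}, relative change = {relative_change:.6f}, regression = {is_regression}")
```

Under that assumption the computed T agrees with the logged -18.58, and the relative change agrees with the logged -0.86, so both thresholds are cleared by a wide margin for this benchmark.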

Run Information

Name Value
Architecture x64
OS ubuntu 18.04
Queue TigerUbuntu
Baseline 1411364699b5784040e86f76cb3db8200f6a2c8c
Compare 86b48d7c6f081c12dcc9c048fb53de1b78c9966f
Diff Diff
Configs AOT:true, CompilationMode:wasm, RunKind:micro

Regressions in System.Text.Tests.Perf_StringBuilder

Benchmark Baseline Test Test/Base Test Quality Edge Detector Baseline IR Compare IR IR Ratio Baseline ETL Compare ETL
[Append_ValueTypes - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Text.Tests.Perf_StringBuilder.Append_ValueTypes.html>) 6.30 μs 6.82 μs 1.08 0.10 False
[Append_ValueTypes_Interpolated - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Text.Tests.Perf_StringBuilder.Append_ValueTypes_Interpolated.html>) 8.45 μs 9.78 μs 1.16 0.12 False

graph graph Test Report

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

Payloads

Baseline Compare

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net8.0 --filter 'System.Text.Tests.Perf_StringBuilder*'
### Payloads

[Baseline]()
[Compare]()

### Histogram

#### System.Text.Tests.Perf_StringBuilder.Append_ValueTypes

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 6.817022412823868 > 6.409512357608278.
IsChangePoint: Marked as a change because one of 2/12/2023 3:00:19 AM, 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -10.694344103194501 (T) = (0 -6969.682413049682) / Math.Sqrt((19648.708733229687 / (43)) + (37512.4582635502 / (7))) is less than -2.010634757623041 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (43) + (7) - 2, .025) and -0.1325244710812947 = (6154.111978167918 - 6969.682413049682) / 6154.111978167918 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

#### System.Text.Tests.Perf_StringBuilder.Append_ValueTypes_Interpolated

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 9.778222395506385 > 8.870198687098206.
IsChangePoint: Marked as a change because one of 2/9/2023 12:40:40 PM, 3/10/2023 8:06:53 PM, 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -38.543168387420515 (T) = (0 -9749.466830629566) / Math.Sqrt((28780.18582165279 / (43)) + (3108.6513994967463 / (7))) is less than -2.010634757623041 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (43) + (7) - 2, .025) and -0.15196008518214554 = (8463.372087313252 - 9749.466830629566) / 8463.372087313252 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

### Docs

[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)
[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)

Run Information

Name Value
Architecture x64
OS ubuntu 18.04
Queue TigerUbuntu
Baseline 1411364699b5784040e86f76cb3db8200f6a2c8c
Compare 86b48d7c6f081c12dcc9c048fb53de1b78c9966f
Diff Diff
Configs AOT:true, CompilationMode:wasm, RunKind:micro

Regressions in MicroBenchmarks.Serializers.Json_ToStream<MyEventsListerViewModel>

Benchmark Baseline Test Test/Base Test Quality Edge Detector Baseline IR Compare IR IR Ratio Baseline ETL Compare ETL
[SystemTextJsonSourceGen - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/MicroBenchmarks.Serializers.Json_ToStream(MyEventsListerViewModel).SystemTextJsonSourceGen.html>) 1.30 ms 1.60 ms 1.23 0.07 False
[SystemTextJsonReflection - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/MicroBenchmarks.Serializers.Json_ToStream(MyEventsListerViewModel).SystemTextJsonReflection.html>) 1.38 ms 1.72 ms 1.25 0.06 False

graph graph Test Report

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

Payloads

Baseline Compare

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net8.0 --filter 'MicroBenchmarks.Serializers.Json_ToStream<MyEventsListerViewModel>*'
### Payloads

[Baseline]()
[Compare]()

### Histogram

#### MicroBenchmarks.Serializers.Json_ToStream<MyEventsListerViewModel>.SystemTextJson_SourceGen_

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 1.6022623655913977 > 1.3721657819213418.
IsChangePoint: Marked as a change because one of 3/10/2023 8:06:53 PM, 3/22/2023 10:50:22 PM, 3/23/2023 11:51:53 PM, 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -16.95924254476948 (T) = (0 -1600796.5826501288) / Math.Sqrt((1284416405.5164037 / (41)) + (1656159385.6116073 / (7))) is less than -2.0128955989180297 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (41) + (7) - 2, .025) and -0.20978955255254847 = (1323202.5183823006 - 1600796.5826501288) / 1323202.5183823006 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

#### MicroBenchmarks.Serializers.Json_ToStream<MyEventsListerViewModel>.SystemTextJson_Reflection_

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 1.7246685162037043 > 1.4672714498997175.
IsChangePoint: Marked as a change because one of 3/10/2023 8:06:53 PM, 3/22/2023 10:50:22 PM, 3/23/2023 11:51:53 PM, 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -22.448551132325488 (T) = (0 -1691423.309335212) / Math.Sqrt((531703895.71895164 / (43)) + (1054205304.6076332 / (7))) is less than -2.010634757623041 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (43) + (7) - 2, .025) and -0.20398913058443383 = (1404849.3182942362 - 1691423.309335212) / 1404849.3182942362 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

### Docs

[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)
[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)
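The `IsChangePoint` lines describe a window test: a benchmark is flagged when at least one detected change point falls inside the comparison window. Here is a small sketch of that check using the timestamps logged for `Json_ToStream`. The helper names are hypothetical, and treating both window bounds as inclusive is an assumption; the log does not say how a change point equal to the window end is handled.

```python
from datetime import datetime

# Timestamp layout used in the log, e.g. "4/9/2023 4:09:45 AM".
TIMESTAMP_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def parse_timestamp(text):
    return datetime.strptime(text, TIMESTAMP_FORMAT)


def change_points_in_window(candidates, window_start, window_end):
    """Return the candidates inside [window_start, window_end] (inclusive, by assumption)."""
    start, end = parse_timestamp(window_start), parse_timestamp(window_end)
    return [c for c in candidates if start <= parse_timestamp(c) <= end]


candidates = [
    "3/10/2023 8:06:53 PM",
    "3/22/2023 10:50:22 PM",
    "3/23/2023 11:51:53 PM",
    "4/9/2023 4:09:45 AM",
    "4/11/2023 1:13:25 AM",
]
hits = change_points_in_window(candidates, "3/30/2023 4:24:24 AM", "4/11/2023 1:13:25 AM")
is_change_point = len(hits) > 0
print(hits, is_change_point)
```

The three March change points lie before the window and are ignored; the 4/9/2023 point is strictly inside it, so the result does not depend on the inclusive-bound assumption for this benchmark.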
performanceautofiler[bot] commented 1 year ago

Run Information

Name Value
Architecture x64
OS ubuntu 18.04
Queue TigerUbuntu
Baseline 1411364699b5784040e86f76cb3db8200f6a2c8c
Compare 86b48d7c6f081c12dcc9c048fb53de1b78c9966f
Diff Diff
Configs AOT:true, CompilationMode:wasm, RunKind:micro

Regressions in System.Text.Json.Serialization.Tests.WriteJson<Nullable<DateTimeOffset>>

Benchmark Baseline Test Test/Base Test Quality Edge Detector Baseline IR Compare IR IR Ratio Baseline ETL Compare ETL
[SerializeToWriter - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Text.Json.Serialization.Tests.WriteJson(Nullable(DateTimeOffset)).SerializeToWriter(Mode%3a%20SourceGen).html>) 596.49 ns 669.53 ns 1.12 0.20 False
[SerializeToString - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Text.Json.Serialization.Tests.WriteJson(Nullable(DateTimeOffset)).SerializeToString(Mode%3a%20Reflection).html>) 943.74 ns 1.04 μs 1.10 0.25 False
[SerializeToUtf8Bytes - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Text.Json.Serialization.Tests.WriteJson(Nullable(DateTimeOffset)).SerializeToUtf8Bytes(Mode%3a%20SourceGen).html>) 743.14 ns 889.25 ns 1.20 0.20 False
[SerializeToWriter - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Text.Json.Serialization.Tests.WriteJson(Nullable(DateTimeOffset)).SerializeToWriter(Mode%3a%20Reflection).html>) 564.19 ns 660.12 ns 1.17 0.18 False
[SerializeToString - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Text.Json.Serialization.Tests.WriteJson(Nullable(DateTimeOffset)).SerializeToString(Mode%3a%20SourceGen).html>) 889.06 ns 1.05 μs 1.18 0.22 False
[SerializeToUtf8Bytes - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Text.Json.Serialization.Tests.WriteJson(Nullable(DateTimeOffset)).SerializeToUtf8Bytes(Mode%3a%20Reflection).html>) 773.45 ns 843.16 ns 1.09 0.23 False
[SerializeObjectProperty - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Text.Json.Serialization.Tests.WriteJson(Nullable(DateTimeOffset)).SerializeObjectProperty(Mode%3a%20SourceGen).html>) 1.45 μs 1.66 μs 1.14 0.25 False

graph graph graph graph Test Report

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

Payloads

Baseline Compare

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net8.0 --filter 'System.Text.Json.Serialization.Tests.WriteJson<Nullable<DateTimeOffset>>*'
### Payloads

[Baseline]()
[Compare]()

### Histogram

#### System.Text.Json.Serialization.Tests.WriteJson<Nullable<DateTimeOffset>>.SerializeToWriter(Mode: SourceGen)

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 669.527917667613 > 596.9572439283145.
IsChangePoint: Marked as a change because one of 3/10/2023 8:06:53 PM, 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -18.771343205310938 (T) = (0 -667.3348530344057) / Math.Sqrt((254.69592280282856 / (43)) + (173.83269771302704 / (6))) is less than -2.011740513728388 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (43) + (6) - 2, .025) and -0.19927545798075977 = (556.448352706232 - 667.3348530344057) / 556.448352706232 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

#### System.Text.Json.Serialization.Tests.WriteJson<Nullable<DateTimeOffset>>.SerializeToString(Mode: Reflection)

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 1.0409540723651711 > 960.3639619819999.
IsChangePoint: Marked as a change because one of 3/10/2023 8:06:53 PM, 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -14.965959027110165 (T) = (0 -1042.5649215164249) / Math.Sqrt((939.6907994466882 / (43)) + (306.034038676819 / (6))) is less than -2.011740513728388 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (43) + (6) - 2, .025) and -0.13964026898846696 = (914.8193073607296 - 1042.5649215164249) / 914.8193073607296 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

#### System.Text.Json.Serialization.Tests.WriteJson<Nullable<DateTimeOffset>>.SerializeToUtf8Bytes(Mode: SourceGen)

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 889.248148732966 > 784.6956055090557.
IsChangePoint: Marked as a change because one of 3/10/2023 8:06:53 PM, 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -16.178954668021497 (T) = (0 -884.3877778185087) / Math.Sqrt((493.35994200494923 / (43)) + (364.40959673864944 / (7))) is less than -2.010634757623041 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (43) + (7) - 2, .025) and -0.17070740385640967 = (755.4302423519832 - 884.3877778185087) / 755.4302423519832 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

#### System.Text.Json.Serialization.Tests.WriteJson<Nullable<DateTimeOffset>>.SerializeToWriter(Mode: Reflection)

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 660.1159389299919 > 596.0854349810462.
IsChangePoint: Marked as a change because one of 3/13/2023 8:52:40 AM, 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -17.52887284561596 (T) = (0 -665.5075915520725) / Math.Sqrt((349.4172701304029 / (43)) + (201.12870822786022 / (7))) is less than -2.010634757623041 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (43) + (7) - 2, .025) and -0.1903460961364246 = (559.0874735609661 - 665.5075915520725) / 559.0874735609661 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

#### System.Text.Json.Serialization.Tests.WriteJson<Nullable<DateTimeOffset>>.SerializeToString(Mode: SourceGen)

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 1.0456139683557115 > 933.6215003256035.
IsChangePoint: Marked as a change because one of 3/10/2023 8:06:53 PM, 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -10.959092994261272 (T) = (0 -1064.715372952346) / Math.Sqrt((802.894219685747 / (43)) + (1336.0516474424942 / (7))) is less than -2.010634757623041 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (43) + (7) - 2, .025) and -0.17508085541800214 = (906.0783928554458 - 1064.715372952346) / 906.0783928554458 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

#### System.Text.Json.Serialization.Tests.WriteJson<Nullable<DateTimeOffset>>.SerializeToUtf8Bytes(Mode: Reflection)

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 843.1623146075597 > 805.9544086248478.
IsChangePoint: Marked as a change because one of 3/10/2023 8:06:53 PM, 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -8.737477700212265 (T) = (0 -881.8324312037756) / Math.Sqrt((290.62529696233787 / (43)) + (1396.922050362772 / (7))) is less than -2.010634757623041 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (43) + (7) - 2, .025) and -0.16593766722221098 = (756.3289668003418 - 881.8324312037756) / 756.3289668003418 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

#### System.Text.Json.Serialization.Tests.WriteJson<Nullable<DateTimeOffset>>.SerializeObjectProperty(Mode: SourceGen)

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 1.657129068055832 > 1.5696686159442366.
IsChangePoint: Marked as a change because one of 3/10/2023 8:06:53 PM, 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -7.652353630095842 (T) = (0 -1688.1028963244996) / Math.Sqrt((1787.0238810917895 / (42)) + (3001.009135414357 / (7))) is less than -2.011740513728388 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (42) + (7) - 2, .025) and -0.10914851990090095 = (1521.980930448635 - 1688.1028963244996) / 1521.980930448635 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

### Docs

[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)
[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)

Run Information

Name Value
Architecture x64
OS ubuntu 18.04
Queue TigerUbuntu
Baseline 1411364699b5784040e86f76cb3db8200f6a2c8c
Compare 86b48d7c6f081c12dcc9c048fb53de1b78c9966f
Diff Diff
Configs AOT:true, CompilationMode:wasm, RunKind:micro

Regressions in MicroBenchmarks.Serializers.Json_ToString<MyEventsListerViewModel>

Benchmark Baseline Test Test/Base Test Quality Edge Detector Baseline IR Compare IR IR Ratio Baseline ETL Compare ETL
[SystemTextJsonSourceGen - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/MicroBenchmarks.Serializers.Json_ToString(MyEventsListerViewModel).SystemTextJsonSourceGen.html>) 907.61 μs 1.15 ms 1.27 0.08 False
[SystemTextJsonReflection - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/MicroBenchmarks.Serializers.Json_ToString(MyEventsListerViewModel).SystemTextJsonReflection.html>) 1.46 ms 1.78 ms 1.22 0.07 False

graph graph Test Report

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

Payloads

Baseline Compare

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net8.0 --filter 'MicroBenchmarks.Serializers.Json_ToString<MyEventsListerViewModel>*'
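Filters for generic benchmark classes contain `<`, `>`, and `*`, all of which a POSIX shell would interpret if left unquoted. The sketch below builds the repro command as an argument list and lets `shlex` do the quoting. The helper name `benchmark_command` is hypothetical, and the forward-slash script path is an assumption for a Linux shell; only the `-f` and `--filter` flags shown in the repro steps are used.

```python
import shlex


def benchmark_command(filter_pattern,
                      framework="net8.0",
                      script="./performance/scripts/benchmarks_ci.py"):
    """Build the argv list for the repro command; no shell quoting is needed in list form."""
    return ["python3", script, "-f", framework, "--filter", filter_pattern]


argv = benchmark_command(
    "MicroBenchmarks.Serializers.Json_ToString<MyEventsListerViewModel>*"
)

# shlex.join quotes each argument so the line can be pasted into a POSIX shell.
command_line = shlex.join(argv)
print(command_line)

# To run it directly, pass the list itself and skip the shell:
#     import subprocess
#     subprocess.run(argv, check=True)
```

Passing the list to `subprocess.run` avoids quoting questions entirely; the joined string is only useful for copying into a terminal.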
### Payloads

[Baseline]()
[Compare]()

### Histogram

#### MicroBenchmarks.Serializers.Json_ToString<MyEventsListerViewModel>.SystemTextJson_SourceGen_

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 1.1498641028846155 > 951.2720519140626.
IsChangePoint: Marked as a change because one of 2/12/2023 11:53:55 PM, 3/10/2023 8:06:53 PM, 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -32.417678719110675 (T) = (0 -1169989.0603120094) / Math.Sqrt((373949062.6043899 / (43)) + (383176399.50074583 / (7))) is less than -2.010634757623041 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (43) + (7) - 2, .025) and -0.28317408347508105 = (911792.9323692815 - 1169989.0603120094) / 911792.9323692815 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

#### MicroBenchmarks.Serializers.Json_ToString<MyEventsListerViewModel>.SystemTextJson_Reflection_

```log

```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 1.7816976195238092 > 1.527581492286383.
IsChangePoint: Marked as a change because one of 3/10/2023 8:06:53 PM, 3/22/2023 10:50:22 PM, 3/23/2023 11:51:53 PM, 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -17.142555679411476 (T) = (0 -1775003.411711784) / Math.Sqrt((8725171752.854292 / (43)) + (688326514.5085528 / (6))) is less than -2.011740513728388 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (43) + (6) - 2, .025) and -0.20790869212843227 = (1469484.7576467765 - 1775003.411711784) / 1469484.7576467765 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

### Docs

[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)
[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)

Run Information

Name Value
Architecture x64
OS ubuntu 18.04
Queue TigerUbuntu
Baseline 1411364699b5784040e86f76cb3db8200f6a2c8c
Compare 86b48d7c6f081c12dcc9c048fb53de1b78c9966f
Diff Diff
Configs AOT:true, CompilationMode:wasm, RunKind:micro

Regressions in System.Globalization.Tests.Perf_DateTimeCultureInfo

Benchmark Baseline Test Test/Base Test Quality Edge Detector Baseline IR Compare IR IR Ratio Baseline ETL Compare ETL
[ToString - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Globalization.Tests.Perf_DateTimeCultureInfo.ToString(culturestring%3a%20da).html>) 344.73 ns 581.18 ns 1.69 0.12 True
[ToString - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Globalization.Tests.Perf_DateTimeCultureInfo.ToString(culturestring%3a%20).html>) 356.11 ns 565.42 ns 1.59 0.10 True
[ToString - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Globalization.Tests.Perf_DateTimeCultureInfo.ToString(culturestring%3a%20fr).html>) 358.67 ns 580.46 ns 1.62 0.08 True
[ToString - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Globalization.Tests.Perf_DateTimeCultureInfo.ToString(culturestring%3a%20ja).html>) 349.36 ns 558.50 ns 1.60 0.09 True
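The Test/Base column in the table above can be rechecked by hand. The sketch below is only an illustration, not the autofiler's code: it divides each Test value by its Baseline and compares the result with a 1.05 cutoff, which is an assumed reading of the "5% greater than the baseline" wording in the detection logic. The names `rows`, `ratio_of`, and `THRESHOLD` are hypothetical.

```python
# Baseline and Test durations in nanoseconds, copied from the table above.
# Keys are the culturestring arguments; "" is the row with an empty culture.
rows = {
    "da": (344.73, 581.18),
    "": (356.11, 565.42),
    "fr": (358.67, 580.46),
    "ja": (349.36, 558.50),
}

# Assumed cutoff: "compare was 5% greater than the baseline".
THRESHOLD = 1.05


def ratio_of(baseline_ns: float, test_ns: float) -> float:
    """Return the Test/Base ratio for one benchmark row."""
    return test_ns / baseline_ns


ratios = {culture: ratio_of(b, t) for culture, (b, t) in rows.items()}
flagged = {culture: r > THRESHOLD for culture, r in ratios.items()}

for culture, r in ratios.items():
    label = culture or '""'
    print(f"culturestring={label:4s} Test/Base={r:.2f} over_threshold={flagged[culture]}")
```

Rounded to two decimals, the computed ratios agree with the Test/Base column, and every row is well past the assumed cutoff.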


Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

Payloads

Baseline Compare

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net8.0 --filter 'System.Globalization.Tests.Perf_DateTimeCultureInfo*'
### Payloads

[Baseline]() [Compare]()

### Histogram

#### System.Globalization.Tests.Perf_DateTimeCultureInfo.ToString(culturestring: da)

```log
```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 581.1806198994343 > 371.1137097895988.
IsChangePoint: Marked as a change because one of 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -19.421235468280077 (T) = (0 -590.6690485025102) / Math.Sqrt((90.01340326392385 / (43)) + (1043.273125159857 / (7))) is less than -2.010634757623041 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (43) + (7) - 2, .025) and -0.6784550424032257 = (351.9123441380863 - 590.6690485025102) / 351.9123441380863 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked as regression because Edge Detector said so.
```

#### System.Globalization.Tests.Perf_DateTimeCultureInfo.ToString(culturestring: )

```log
```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 565.4241460552668 > 369.19276072376414.
IsChangePoint: Marked as a change because one of 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -33.33524350429814 (T) = (0 -587.8731031250215) / Math.Sqrt((39.81464781494948 / (42)) + (349.33880517638755 / (7))) is less than -2.011740513728388 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (42) + (7) - 2, .025) and -0.6788987793365895 = (350.15398805478753 - 587.8731031250215) / 350.15398805478753 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked as regression because Edge Detector said so.
```

#### System.Globalization.Tests.Perf_DateTimeCultureInfo.ToString(culturestring: fr)

```log
```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 580.455666568876 > 364.66230665573227.
IsChangePoint: Marked as a change because one of 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -58.53689009274283 (T) = (0 -574.4791172274151) / Math.Sqrt((160.86267299196928 / (43)) + (74.31578336912965 / (7))) is less than -2.010634757623041 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (43) + (7) - 2, .025) and -0.6289191957227552 = (352.67502447996964 - 574.4791172274151) / 352.67502447996964 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked as regression because Edge Detector said so.
```

#### System.Globalization.Tests.Perf_DateTimeCultureInfo.ToString(culturestring: ja)

```log
```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 558.4961977774478 > 367.85619242284764.
IsChangePoint: Marked as a change because one of 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -73.64962535247909 (T) = (0 -570.4353045781777) / Math.Sqrt((48.80038574832103 / (43)) + (52.90397976873987 / (7))) is less than -2.010634757623041 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (43) + (7) - 2, .025) and -0.6146262492733631 = (353.292475478392 - 570.4353045781777) / 353.292475478392 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked as regression because Edge Detector said so.
```

### Docs

[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)
[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)
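The IsRegressionStdDev check for `ToString(culturestring: ja)` can be reproduced from the logged numbers. The sketch below is not the autofiler's implementation. It assumes the two values inside `Math.Sqrt` are sample variances and the `(43)` and `(7)` are sample counts. The log prints `0` in the numerator, but the logged T value is only reproduced when the baseline mean is used there, so the sketch uses the baseline mean; that reading is an assumption. The critical value is copied from the log rather than recomputed, and the function name `t_statistic` is hypothetical.

```python
import math

# Values copied from the IsRegressionStdDev line for ToString(culturestring: ja).
# Assumption: the numbers divided by 43 and 7 are sample variances.
baseline_mean, baseline_var, baseline_n = 353.292475478392, 48.80038574832103, 43
compare_mean, compare_var, compare_n = 570.4353045781777, 52.90397976873987, 7

# Critical value exactly as logged for StudentT.InvCDF(0, 1, 43 + 7 - 2, .025).
T_CRITICAL = -2.010634757623041
# Relative-change cutoff as logged ("is less than -0.05").
RELATIVE_THRESHOLD = -0.05


def t_statistic(mean_a, var_a, n_a, mean_b, var_b, n_b):
    """Difference of means divided by the unpooled standard error."""
    return (mean_a - mean_b) / math.sqrt(var_a / n_a + var_b / n_b)


t_stat = t_statistic(baseline_mean, baseline_var, baseline_n,
                     compare_mean, compare_var, compare_n)
relative_change = (baseline_mean - compare_mean) / baseline_mean

# Both conditions from the log line must hold for the check to fire.
is_regression = t_stat < T_CRITICAL and relative_change < RELATIVE_THRESHOLD

print(f"T = {t_stat:.4f}, relative change = {relative_change:.4f}, "
      f"regression = {is_regression}")
```

Both quantities come out far beyond their cutoffs, which is consistent with the log marking this benchmark as a regression.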

Run Information

Name Value
Architecture x64
OS ubuntu 18.04
Queue TigerUbuntu
Baseline 1411364699b5784040e86f76cb3db8200f6a2c8c
Compare 86b48d7c6f081c12dcc9c048fb53de1b78c9966f
Diff Diff
Configs AOT:true, CompilationMode:wasm, RunKind:micro

Regressions in System.Buffers.Text.Tests.Utf8FormatterTests

Benchmark Baseline Test Test/Base Test Quality Edge Detector Baseline IR Compare IR IR Ratio Baseline ETL Compare ETL
[FormatterInt64 - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Buffers.Text.Tests.Utf8FormatterTests.FormatterInt64(value%3a%20-9223372036854775808).html>) 36.90 ns 136.44 ns 3.70 0.40 False
[FormatterInt64 - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Buffers.Text.Tests.Utf8FormatterTests.FormatterInt64(value%3a%209223372036854775807).html>) 33.22 ns 140.47 ns 4.23 0.42 False
[FormatterUInt64 - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Buffers.Text.Tests.Utf8FormatterTests.FormatterUInt64(value%3a%2018446744073709551615).html>) 32.76 ns 147.93 ns 4.52 0.45 False


Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

Payloads

Baseline Compare

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net8.0 --filter 'System.Buffers.Text.Tests.Utf8FormatterTests*'
### Payloads

[Baseline]() [Compare]()

### Histogram

#### System.Buffers.Text.Tests.Utf8FormatterTests.FormatterInt64(value: -9223372036854775808)

```log
```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 136.44015653580198 > 35.792598546829836.
IsChangePoint: Marked as a change because one of 2/12/2023 11:53:55 PM, 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -175.60890137433225 (T) = (0 -138.32274463793962) / Math.Sqrt((3.246279537751629 / (43)) + (1.924934879637 / (7))) is less than -2.010634757623041 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (43) + (7) - 2, .025) and -3.0257993343610163 = (34.35907583801479 - 138.32274463793962) / 34.35907583801479 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

#### System.Buffers.Text.Tests.Utf8FormatterTests.FormatterInt64(value: 9223372036854775807)

```log
```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 140.4692937668984 > 34.36084923962718.
IsChangePoint: Marked as a change because one of 2/12/2023 11:53:55 PM, 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -136.3942956826316 (T) = (0 -140.87140646759366) / Math.Sqrt((5.94395574219608 / (43)) + (3.3382232232870894 / (7))) is less than -2.010634757623041 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (43) + (7) - 2, .025) and -3.155759103320809 = (33.89787592717424 - 140.87140646759366) / 33.89787592717424 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

#### System.Buffers.Text.Tests.Utf8FormatterTests.FormatterUInt64(value: 18446744073709551615)

```log
```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 147.9263102204153 > 35.412342602779404.
IsChangePoint: Marked as a change because one of 2/12/2023 11:53:55 PM, 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -94.97110814822713 (T) = (0 -147.71731017211124) / Math.Sqrt((6.301747962894654 / (42)) + (8.780595259255916 / (7))) is less than -2.011740513728388 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (42) + (7) - 2, .025) and -3.2002088922817444 = (35.16903895983718 - 147.71731017211124) / 35.16903895983718 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

### Docs

[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)
[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)
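The IsChangePoint rule for the Utf8FormatterTests benchmarks reads as an interval test: at least one detected change-point timestamp has to fall inside the comparison window. The sketch below illustrates that reading with the timestamps from the log. Treating both window ends as inclusive is an assumption, since the log does not say, and the helper name `parse_timestamp` is hypothetical.

```python
from datetime import datetime

# Timestamp layout used in the log, e.g. "4/9/2023 4:09:45 AM".
LOG_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def parse_timestamp(text: str) -> datetime:
    """Parse one timestamp in the log's month/day/year 12-hour format."""
    return datetime.strptime(text, LOG_FORMAT)


# Candidate change points and window bounds copied from the IsChangePoint line.
change_points = [
    "2/12/2023 11:53:55 PM",
    "4/9/2023 4:09:45 AM",
    "4/11/2023 1:13:25 AM",
]
window_start = parse_timestamp("3/30/2023 4:24:24 AM")
window_end = parse_timestamp("4/11/2023 1:13:25 AM")

# Assumption: both ends of the window are inclusive.
in_window = [
    ts for ts in change_points
    if window_start <= parse_timestamp(ts) <= window_end
]
is_change_point = bool(in_window)

print("change points inside window:", in_window)
print("IsChangePoint:", is_change_point)
```

The February change point lies outside the window, while the 4/9/2023 one falls inside it and is enough on its own to satisfy the rule.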

Run Information

Name Value
Architecture x64
OS ubuntu 18.04
Queue TigerUbuntu
Baseline 1411364699b5784040e86f76cb3db8200f6a2c8c
Compare 86b48d7c6f081c12dcc9c048fb53de1b78c9966f
Diff Diff
Configs AOT:true, CompilationMode:wasm, RunKind:micro

Regressions in System.Text.Json.Serialization.Tests.WriteJson<IndexViewModel>

Benchmark Baseline Test Test/Base Test Quality Edge Detector Baseline IR Compare IR IR Ratio Baseline ETL Compare ETL
[SerializeToWriter - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Text.Json.Serialization.Tests.WriteJson(IndexViewModel).SerializeToWriter(Mode%3a%20SourceGen).html>) 27.52 μs 32.01 μs 1.16 0.04 False
[SerializeToUtf8Bytes - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu 18.04_AOT=true_CompilationMode=wasm_RunKind=micro/System.Text.Json.Serialization.Tests.WriteJson(IndexViewModel).SerializeToUtf8Bytes(Mode%3a%20SourceGen).html>) 33.74 μs 35.95 μs 1.07 0.03 False


Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

Payloads

Baseline Compare

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f net8.0 --filter 'System.Text.Json.Serialization.Tests.WriteJson<IndexViewModel>*'
### Payloads

[Baseline]() [Compare]()

### Histogram

#### System.Text.Json.Serialization.Tests.WriteJson<IndexViewModel>.SerializeToWriter(Mode: SourceGen)

```log
```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 32.01199827380952 > 29.002805014581433.
IsChangePoint: Marked as a change because one of 2/12/2023 11:53:55 PM, 3/1/2023 3:09:21 PM, 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -21.30952675471099 (T) = (0 -32161.950693074992) / Math.Sqrt((123359.35798565281 / (43)) + (288230.3988306383 / (7))) is less than -2.010634757623041 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (43) + (7) - 2, .025) and -0.16151054533309847 = (27689.762113913115 - 32161.950693074992) / 27689.762113913115 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

#### System.Text.Json.Serialization.Tests.WriteJson<IndexViewModel>.SerializeToUtf8Bytes(Mode: SourceGen)

```log
```

### Description of detection logic

```
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 35.9536878591954 > 33.46590119859738.
IsChangePoint: Marked as a change because one of 2/12/2023 11:53:55 PM, 3/1/2023 3:09:21 PM, 4/9/2023 4:09:45 AM, 4/11/2023 1:13:25 AM falls between 3/30/2023 4:24:24 AM and 4/11/2023 1:13:25 AM.
IsRegressionStdDev: Marked as regression because -20.14335851154159 (T) = (0 -36358.61997125247) / Math.Sqrt((248191.0850875688 / (42)) + (301564.7202241952 / (7))) is less than -2.011740513728388 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (42) + (7) - 2, .025) and -0.13976317934000973 = (31900.153146117827 - 36358.61997125247) / 31900.153146117827 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
```

### Docs

[Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md)
[Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)
lewing commented 1 year ago

looks like https://github.com/dotnet/runtime/pull/84469

cc @stephentoub

stephentoub commented 1 year ago

Same comment/question as here: https://github.com/dotnet/perf-autofiling-issues/issues/15660#issuecomment-1503297512 "This change made a bunch of methods into generic methods, where the type parameter TChar is always either char or byte, and it's now invoking methods like TChar.CreateTruncating in a lot of places. There's a lot of IL in that code path, but it should resolve down to very simple asm assuming everything is properly getting inlined and evaluated for that TChar. Is it possible it's not? If those methods weren't inlined, it could definitely lead to significant regressions in these methods."

kg commented 1 year ago

Looks like the formatting changes knocked interp off a performance cliff. Not obvious to me whether there's a specific problem change from looking over the diff.

EDIT: I can look into what it would take to make CreateTruncating inline.

EDIT 2: Never mind, this was an AOT run? The filter labels changed, I guess.

stephentoub commented 1 year ago

For reference, this is all in support of being able to format all of these primitives to either UTF16 or UTF8, sharing all of the code to do so. We end up then at the leaf operations needing to store a value as a TChar (a char or a byte), and use TChar.CreateTruncating to do so. When everything inlines as expected, it should become a simple cast or possibly evaporate altogether (in the case where the input and output types match).
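As a rough analogy for the design described above, the sketch below shares one digit-formatting routine between UTF-16 and UTF-8 output by passing in the leaf "store one code unit" operation. It is not the BCL code. Python has no static abstract interface members, so a plain callable stands in for `TChar.CreateTruncating`, and the names `format_unsigned`, `to_utf16_unit`, and `to_utf8_unit` are hypothetical.

```python
from typing import Callable, List, TypeVar

TUnit = TypeVar("TUnit")


def format_unsigned(value: int, make_unit: Callable[[int], TUnit]) -> List[TUnit]:
    """Format a non-negative integer as decimal digits, one output unit per digit.

    make_unit plays the role of the leaf conversion: it turns a numeric
    character code into whatever element type the destination stores.
    """
    if value < 0:
        raise ValueError("value must be non-negative")
    units: List[TUnit] = []
    while True:
        value, digit = divmod(value, 10)
        units.append(make_unit(ord("0") + digit))
        if value == 0:
            break
    units.reverse()  # digits were produced least significant first
    return units


def to_utf16_unit(code: int) -> str:
    """Keep the low 16 bits, as a truncating conversion to a 16-bit char would."""
    return chr(code & 0xFFFF)


def to_utf8_unit(code: int) -> int:
    """Keep the low 8 bits, as a truncating conversion to a byte would."""
    return code & 0xFF


# The same routine serves both encodings; only the leaf conversion differs.
as_utf16 = "".join(format_unsigned(18446744073709551615, to_utf16_unit))
as_utf8 = bytes(format_unsigned(18446744073709551615, to_utf8_unit))

print(as_utf16)
print(as_utf8)
```

In this sketch the conversion is an ordinary call made once per digit. That per-digit call is the kind of cost the comment above expects to disappear when the generic code is inlined and specialized for `char` or `byte`.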

kg commented 1 year ago

I think interp would probably need to intrinsify it, looking at the implementation, since there are out params and throws in here.

stephentoub commented 1 year ago

cc: @tannergooding, @EgorBo

EgorBo commented 1 year ago

> cc: @tannergooding, @EgorBo

Is it actionable on our side or just to be aware that such changes could regress mono?

SamMonoRT commented 1 year ago

> > cc: @tannergooding, @EgorBo
>
> Is it actionable on our side or just to be aware that such changes could regress mono?

We should assume at this point that any change in the Libraries code will affect all runtimes in some fashion. As with the measurements done in the PR for CoreCLR, I suggest that any future change include a similar table for Mono and NativeAOT prior to merging. Mono can't keep playing a catch-up game each time a new optimization is introduced; otherwise, PR authors need to be willing to revert the commit until the other runtimes update their code bases and the regressions are fixed.

tannergooding commented 1 year ago

> Mono can't keep playing a catch-up game each time a new optimization is introduced

It's worth noting that nothing here is a "new optimization". This is using fairly standard patterns that have been in use throughout the BCL and general .NET ecosystem for years. The "newest" thing here is utilizing generic math/static abstracts in interfaces; but that was a core .NET 7 feature that previewed in .NET 6 and one that the community is actively using and adopting in various places to simplify their code.

We do need to be mindful of how changes will impact all runtimes, but at the same time we shouldn't find ourselves unable to remove thousands of lines of code, simplify overall maintenance, or expose new functionality because one runtime is missing several widely used pieces of functionality. NAOT and Crossgen do not typically have problems here because they share the same overall underlying codegen and optimizations with RyuJIT.

SamMonoRT commented 1 year ago

"We do need to be mindful of how changes will impact all runtimes" : Are all engineers mindful of changes to impact to all runtimes -- my honest opinion I don't feel that is being done.

In the past few months, Mono has had several size and microbenchmark regressions that were identified only after the PRs were merged and perf lab runs flagged the failures. I think this approach needs to change NOW, and impact on all runtimes identified prior to merging a PR. Knowing up front that a change will introduce regressions is better for us than reverse engineering PRs to figure out the regression afterward.

tannergooding commented 1 year ago

> Are all engineers mindful of changes to impact to all runtimes -- my honest opinion I don't feel that is being done.

There is certainly more that can be done by engineers providing the PRs. Part of this may be due to a lack of integration and documentation on the Mono side. This latter bit is unfortunately something only the Mono team can provide as they are the ones with the context/knowledge to ensure it is correct/up to date.

For RyuJIT, whether you're working on Windows, Linux, or MacOS; you have the same general support and process for getting disassembly, dumps, debugging, running tests, running benchmarks, etc. All of this works in VS or through the remote debuggers. The docs for this are all actively kept up to date and integration with the other relevant repos regularly happens.

For Mono, there is a workflow on Linux/MacOS. However, much of this workflow differs or doesn't exist on Windows. There are no docs or integration for how to get disassembly or for how to run mono tests against dotnet/performance, and there is nothing similar to the SPMI CI leg that allows us to easily see size increases or JIT throughput changes, etc.

Mono is also missing many pieces of functionality around generics, SIMD, and other core pieces of functionality that make this very difficult. A PR that removes thousands of lines of code while simultaneously not changing or significantly improving perf for RyuJIT, NAOT, and Crossgen can easily regress Mono by just as much. This is likewise something that we need to address longer term as maintaining split managed implementations for core functionality between Mono and RyuJIT isn't tenable. It also represents areas where Mono will actively be hurting for real world user code, particularly in several prominent libraries/applications and where those users will not be willing to spend the time to do significant additional testing.

SamMonoRT commented 1 year ago

We have had the same exact discussion over and over many times in past years. We had a good post-mortem discussion post RC1 regressions last year, and need to possibly revisit that with M2s and SteveC at some point and formulate a plan to avoid this in the future irrespective of how large/small the changes are. Mobile, WASM, MAUI are integral pieces of the .NET releases and there are many developers using Mono in the wild. We can't be regressing their experiences. @sblom's team has made the first attempt to improve perf lab documentation. We can really use some engineers outside Mono team to help test that/clean it up. https://github.com/dotnet/performance/pull/2922/files explains some stuff. Do you want to volunteer help test those steps?

tannergooding commented 1 year ago

> We can really use some engineers outside Mono team to help test that/clean it up. https://github.com/dotnet/performance/pull/2922/files explains some stuff. Do you want to volunteer help test those steps?

Happy to do so :). There should likely be a general call to action and request for testing/input across the libraries/runtime teams. Ideally this would go out on Teams and in e-mail with relevant instructions and a general place to provide feedback.

> We can't be regressing their experiences.

My main point is that the experience is already "regressed" because Mono is missing many of the core optimizations that .NET developers rely on in the wild. We can work around these in the BCL and some of our own libraries, but that doesn't fix the issue for major community libraries or applications which are heavily utilizing the same optimizations/patterns or which are relying on NYI functionality such as SIMD. -- I'm happy to share more details of such applications/projects offline

stephentoub commented 1 year ago

I go into meetings for a few hours and come back to this fun discussion :smile:

> I think this approach needs to change NOW, and impact on all runtimes identified prior to merging a PR.

Determining impact on all runtimes prior to merging any PR is simply not feasible. What we did decide last fall when we discussed this is that if a developer thought it likely a change could have a significantly different performance profile on one runtime vs another, more scouting would be done. And as this particular change involved generics and generics has been an area of problems in the past for mono, I did run it through various benchmarks and nothing popped as being too problematic. In case I somehow messed that up, I just did it again... here mono_main is just prior to my PR and mono_pr is with my PR:

// * Summary *

BenchmarkDotNet=v0.13.2.2052-nightly, OS=Windows 11 (10.0.22621.1413)
Intel Core i7-8700 CPU 3.20GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
.NET SDK=8.0.100-preview.4.23210.9
  [Host]     : .NET 8.0.0 (8.0.23.21001), X64 RyuJIT AVX2
  Job-PONFFU : .NET 8.0.0 (42.42.42.42424) using MonoVM, X64 AOT
  Job-QAEWMK : .NET 8.0.0 (42.42.42.42424) using MonoVM, X64 AOT

PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250.0000 ms  MaxIterationCount=20
MinIterationCount=15  WarmupCount=1

|   Method |              Toolchain | format |       Mean |     Error |    StdDev |     Median |        Min |        Max | Ratio | RatioSD |   Gen0 | Allocated | Alloc Ratio |
|--------- |----------------------- |------- |-----------:|----------:|----------:|-----------:|-----------:|-----------:|------:|--------:|-------:|----------:|------------:|
| ToString | \mono_main\corerun.exe |      o |   710.3 ns |  14.01 ns |  13.76 ns |   707.1 ns |   691.1 ns |   736.6 ns |  1.00 |    0.00 | 0.0171 |      80 B |        1.00 |
| ToString |   \mono_pr\corerun.exe |      o |   642.9 ns |  18.77 ns |  20.87 ns |   636.1 ns |   614.4 ns |   691.7 ns |  0.91 |    0.03 | 0.0179 |      80 B |        1.00 |
|          |                        |        |            |           |           |            |            |            |       |         |        |           |             |
| ToString | \mono_main\corerun.exe |      r |   456.3 ns |   7.95 ns |   7.44 ns |   453.7 ns |   447.7 ns |   472.8 ns |  1.00 |    0.00 | 0.0176 |      80 B |        1.00 |
| ToString |   \mono_pr\corerun.exe |      r |   521.7 ns |   4.86 ns |   3.80 ns |   521.4 ns |   516.0 ns |   528.4 ns |  1.14 |    0.02 | 0.0175 |      80 B |        1.00 |
|          |                        |        |            |           |           |            |            |            |       |         |        |           |             |
| ToString | \mono_main\corerun.exe |      s | 2,799.3 ns |  72.62 ns |  83.63 ns | 2,775.2 ns | 2,695.0 ns | 2,963.7 ns |  1.00 |    0.00 | 0.0111 |      64 B |        1.00 |
| ToString |   \mono_pr\corerun.exe |      s | 2,895.4 ns |  62.26 ns |  71.70 ns | 2,880.2 ns | 2,786.4 ns | 3,040.4 ns |  1.04 |    0.04 | 0.0115 |      64 B |        1.00 |

Now, it's quite likely that this isn't the exact configuration that's registering all of these regressions (and, btw, there were also perf-autofiling issues opened for this change with improvements). For example, on my machine in this configuration the "o" test is showing as a 10% improvement with my change, whereas in the regression numbers in this issue it's showing as a 40% regression. But it's not feasible to expect a developer to test every PR on every operating system with every combination of mono vs wasm with AOT vs JIT vs interpreter and whatever other axes are available. We agreed that that's what our perf lab is for, and that when such regressions came about, we'd figure out the right course of action for them. The system flagged this change (great!), and now we need to figure out what to do about it.

So, what can we do about it? For context, today our primitive types only support writing themselves out as UTF16. We've had many requests as well as a requirement from work being done in ASP.NET to support UTF8 as well, and we've chosen to do that by adding a new IUtf8SpanFormattable interface that we're rolling out across the types. The implementation of its TryFormat has the exact same semantics as ISpanFormattable, except that it writes out UTF8 bytes rather than UTF16 chars. How do we enable that? We can either duplicate literally thousands of lines of formatting code, one whole set dedicated to UTF16 and one whole set dedicated to UTF8, or we can take advantage of generics and have the single code path support both (and in doing so, also eliminate the duplication in the existing but limited Utf8Formatter), resulting in a net decrease of code to maintain while also supporting new scenarios rather than a net doubling. This PR was the first change to go in employing this: there have been a few more, and there's another big one currently out for PR that addresses the numeric types. Tanner is also working on doing the same for parsing.
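To make the "single code path" idea concrete, here is a minimal sketch, not the runtime's actual implementation. It assumes .NET 7 or later (static abstract interface members) and uses the public `IBinaryInteger<TChar>` constraint as a stand-in for whatever constraint the real code uses; `TwoDigitFormatter` and its `TryFormat` method are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Numerics;

static class TwoDigitFormatter
{
    // Hypothetical helper: writes `value` (0..99) as two ASCII digits into a
    // span of UTF-16 chars or UTF-8 bytes, using one generic code path.
    public static bool TryFormat<TChar>(int value, Span<TChar> destination, out int written)
        where TChar : unmanaged, IBinaryInteger<TChar>
    {
        if ((uint)value >= 100 || destination.Length < 2)
        {
            written = 0;
            return false;
        }

        // CreateTruncating converts the int code point to char or byte.
        destination[0] = TChar.CreateTruncating('0' + (value / 10));
        destination[1] = TChar.CreateTruncating('0' + (value % 10));
        written = 2;
        return true;
    }
}
```

Calling `TwoDigitFormatter.TryFormat(42, charSpan, out _)` and `TwoDigitFormatter.TryFormat(42, byteSpan, out _)` exercises the same method body for both encodings. How cheaply that body runs depends on how well each runtime specializes and inlines the generic calls.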

I see the following options:

  1. We roll back all the changes and decide not to ship the functionality. I don't believe that's viable; it means we can't ship important new functionality.
  2. We roll back the changes and instead duplicate all the code. I don't believe that's viable; it's not maintainable.
  3. We temporarily roll back the changes knowing that we have a viable solution that will allow us to re-apply the changes in the near future. If we have such a plan, I'm ok with this.
  4. We (e.g. me, Tanner, and someone very familiar with all the various mono backends) investigate what tweaks could be made to the library code to mitigate the regressions. This is viable, assuming there are such tweaks to be made; for example, as a temporary mitigation, maybe we could define some internal helpers that are easier for mono to handle and that address just the needs of the code.
  5. We (e.g. me, Tanner, and someone very familiar with all the various mono backends) investigate what improvements need to be made to mono's backends to address the regressions. This is also viable, and is really the only longer-term viable answer. As Tanner outlines, these code patterns are not new; the only new thing here is that we're using them on code paths measured by our existing microbenchmarks.

Are there other options you had in mind, Sam?

Thanks!

cc: @jeffhandley, who drove the discussion in the fall about all of this

tannergooding commented 1 year ago

We (e.g. me, Tanner, and someone very familiar with all the various mono backends) investigate what tweaks could be made to the library code to mitigate the regressions. This is viable, assuming there are such tweaks to be made; for example, as a temporary mitigation, maybe we could define some internal helpers that are easier for mono to handle and that address just the needs of the code.

If the issue is just or primarily with CreateTruncating, then exposing a non-generic static abstract method on the IBinaryIntegerParseAndFormatInfo interface being introduced in https://github.com/dotnet/runtime/pull/84582/files#diff-cf99b7ffb1692b4c65f25cd8a5ab29709b9e04251535dd2058b93b6a6335da72 (one that is effectively CreateTruncating(int other)) would be a viable alternative and would help avoid the complex typeof and other checks that may be causing issues atm.
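The shape of that alternative can be sketched as follows. Everything here is hypothetical: `ICodeUnit<TSelf>`, `FromInt32`, and the two wrapper structs are illustrative names, not the interface from the linked PR. The wrappers exist only because user code cannot add interfaces to `char` and `byte`. The point is that each implementation is a plain cast, with no generic conversion and no `typeof` checks to fold away.

```csharp
using System;

// Hypothetical interface: a non-generic static abstract factory.
interface ICodeUnit<TSelf> where TSelf : struct, ICodeUnit<TSelf>
{
    static abstract TSelf FromInt32(int value);
}

readonly struct Utf16Unit : ICodeUnit<Utf16Unit>
{
    public readonly char Value;
    private Utf16Unit(char value) => Value = value;
    public static Utf16Unit FromInt32(int value) => new Utf16Unit((char)value);
}

readonly struct Utf8Unit : ICodeUnit<Utf8Unit>
{
    public readonly byte Value;
    private Utf8Unit(byte value) => Value = value;
    public static Utf8Unit FromInt32(int value) => new Utf8Unit((byte)value);
}

static class Digits
{
    // Shared code path; the only per-type work is the trivial FromInt32 cast.
    public static void WriteTwoDigits<TUnit>(int value, Span<TUnit> destination)
        where TUnit : struct, ICodeUnit<TUnit>
    {
        destination[0] = TUnit.FromInt32('0' + (value / 10));
        destination[1] = TUnit.FromInt32('0' + (value % 10));
    }
}
```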

jeffhandley commented 1 year ago

Echoing @stephentoub's comments and referring back to the playbook we wrote up after last year's SpanHelpers perf regression:

  1. We need to continually increase libraries/coreclr engineers' ability to test mono/wasm scenarios.
  2. We cannot overly hinder libraries/coreclr work by trying to preemptively run all mono/wasm perf configurations ahead of merge; instead we must rely on post-merge runs with auto-filing of issues (as was done here and caught swiftly! 🎉).
  3. When a regression is detected, "stop the presses" (hold up any subsequent PRs that would make mitigation or reverting trickier).
  4. We then need to determine the most appropriate way to address the regression.

Stephen laid out some options for us. I concur that options 1-2 (rolling back the changes indefinitely) are not good. Between options 3-5, it seems the first step is to root cause how the regression surfaces in the mono side of things. From that root cause, we can choose between option 4 and 5 (work around the limitation in the Libraries code, or address the underlying issue on the mono side).

Whether or not we need to temporarily roll back the changes (option 3) depends on knowing that root cause and knowing what options 4 and 5 look like, how long they would take, etc. The other factor for whether or not we need to temporarily roll this back is what it does to .NET 8 Preview 4. I believe the new functionality is valuable to ASP.NET scenarios that are to be highlighted in Preview 4 (and BUILD), but I don't know how adversely this affects mono/wasm scenarios that are under the spotlight for Preview 4 (or BUILD); I'd need @SamMonoRT or @lewing's input on that.

stephentoub commented 1 year ago

If the issue is just or primarily with CreateTruncating

I was able to repro at least some of the mono regressions with AOT, and at least locally it appears to be a bunch of things, presumably based on their nature all related to inlining though I'm not sure how to tell for sure on mono. I've been iterating on a commit that should hopefully address most things, plus I saw Zoltan put up a few PRs that seem to be related.

vargaz commented 1 year ago

Adding a fastpath like this to DateTimeFormat:FormatDigits() might help:

            if (len == 2 && value < 100)
            {
                outputBuffer.Append(TChar.CreateTruncating((value / 10) + '0'));
                outputBuffer.Append(TChar.CreateTruncating((value % 10) + '0'));
                return;
            }
            else if (len == 4 && value < 10000)
            {
                int d;

                d = value / 1000;
                outputBuffer.Append(TChar.CreateTruncating(d + '0'));
                value -= d * 1000;

                d = value / 100;
                outputBuffer.Append(TChar.CreateTruncating(d + '0'));
                value -= d * 100;

                d = value / 10;
                outputBuffer.Append(TChar.CreateTruncating(d + '0'));
                value -= d * 10;

                d = value;
                outputBuffer.Append(TChar.CreateTruncating(d + '0'));
                return;
            }

stephentoub commented 1 year ago

@SamMonoRT, @jeffhandley, as for what we could do for the future, I've mentioned this before, but it'd also be really helpful if we could run benchmarks on mono on PRs via the /benchmark CI command or similar. It'd then be trivial to validate a particular change before it's merged against any number of desired targets. Right now to my knowledge all flavors of mono and wasm are missing from that.

stephentoub commented 1 year ago

Adding a fastpath like this to DateTimeFormat:FormatDigits() might help:

Thanks. I've been experimenting with a bunch of changes locally. I'll explore this as well. It's all on top of my pending numerics PR (to try to get a complete picture), so I'll aim to push up an additional commit to that today.

SamMonoRT commented 1 year ago

Thank you @stephentoub for running some numbers for Mono, unfortunately as we learnt, that wasn't enough.

  1. As for current changes, I don't think there is a need to revert existing changes, but rather to really understand the limitations, document findings, and ensure subsequent PRs have gone through validation on the two configs below. Options 4 and 5 seem like the options we should try to iterate on for already-merged PRs.
  2. I suggest we concentrate on 2 configs for future validation.
    1. Mono Interpreter
    2. Mono AOT-WASM (llvm is enabled by default in this configuration)
  3. Start adding @vargaz @lambdageek @lateralusX on your PRs for understanding Mono limitations/impact. We might possibly avoid an unknown regression.
  4. Running Microbenchmarks locally and measuring app sizes: I'll provide further documentation we have been using.
  5. In the past we have followed https://gist.github.com/naricc/8b6f19bad9f711cf312033c2f26a4529 to kick off runs against PR changes on perf lab machines. I'll have someone validate that the steps still work and circle back. It is a little convoluted, as it needs some help from Drew or others on Scott's team to get the final report. Again, this is really not an option I would suggest or agree on as a long-term plan for all contributors.

stephentoub commented 1 year ago

unfortunately as we learnt, that wasn't enough

Nothing would ever be "enough". That's the whole point of having our whole perf lab in place for catching and acting quickly upon regressions.

I suggest we concentrate on 2 configs for future validation.

I'm fairly confident the numbers I shared were the mono interpreter.

Start adding @vargaz @lambdageek @lateralusX on your PRs for understanding Mono limitations/impact. We might possibly avoid an unknown regression.

On which PRs?

Running Microbenchmarks locally and measuring app sizes: I'll provide further documentation we have been using.

This is not feasible on every PR. The time it takes to get the necessary builds locally and get everything set up to do such comparisons across multiple platforms is prohibitive. For one-offs where there is an expected impact, sure, totally valid. For everything else, I don't believe it's a reasonable request. There are always going to be regressions, and we react. This isn't limited to mono; similar regressions happen with coreclr and nativeaot, and we react.

jandupej commented 1 year ago

This is not feasible on every PR.

Can we take a sampling approach and do a small subset of benchmarks? One that would take a few tens of minutes at most, as opposed to the several hours that the full set would take.

stephentoub commented 1 year ago

Can we take a sampling approach and do a small subset of benchmarks? One that would take a few tens of minutes at most, as opposed to the several hours that the full set would take.

It's not just about running the benchmarks. It's also about getting all the relevant builds created to do the before/after comparison, getting the environment set up appropriately for each (on all the relevant operating systems), and then running the tests on each before/after build for each configuration, and doing so enough times to have confidence in the results.

If this is something we care about, we need to invest in the automation that enables it to be done automatically and without requiring physical resources or time investment on the part of the developer. This would be trivial if I could issue a command like /benchmark microbenchmarks runtime mono-wasm --variable filter="System.Text.Json.Tests*" on a PR and let the machines do what the machines are good at doing.

SamMonoRT commented 1 year ago

On which PRs?

I would certainly add them on all PRs involving changes you believe will impact more than one runtime, or when it's not clear what implications the changes have on other runtimes.

This is not feasible on every PR. The time it takes to get the necessary builds locally and get everything set up to do such comparisons across multiple platforms is prohibitive. For one-offs where there is an expected impact, sure, totally valid. For everything else, I don't believe it's a reasonable request. There are always going to be regressions, and we react.

I agree it isn't possible on all PRs, but at the same time, for expected-impact PRs, we need to start adopting this practice. It will be time consuming and not smooth at first, but we need to start somewhere.

This isn't limited to mono; similar regressions happen with coreclr and nativeaot, and we react.

From my observations a regression in Mono caused by a Libraries/shared framework change isn't treated as equally as a regression in coreclr or nativeaot. We really need the same level of investigation, passion and collaboration to address them in time for a Preview snap. If we (Mono and PR authors working together) fail to fix the regression due to any other priorities from either side, we adopt a policy to revert a PR prior to Preview/RS snaps, till we have a good plan/proposal for a long-term fix and move ahead in the subsequent Preview.

tannergooding commented 1 year ago

I would certainly add them on all PRs involving changes you believe will impact more than one runtime, or when not clear what implications the changes have on other runtimes.

This is essentially any libraries change, particularly for types in S.P.Corelib. This comes down to practically every PR that some areas do. Those area owners rely heavily on CI and automation to help catch regressions outside the "core" targets that are easily tested locally (which is often just the box being developed on).

I agree it isn't possible on all PRs, but at the same time, for expected-impact PRs, we need to start adopting this practice

This isn't possible to do locally. We simply do not have the hardware, time, or resources per developer. If we want this done, it needs to be part of CI. At the very least this needs to be available via an explicit trigger, but more ideally automatable.

We're never going to end up in a world where any developer on the team can get and test the coverage of x86, x64, Arm32, Arm64, and WASM across the range of Windows, Linux, macOS, Android, iOS, etc., particularly not faster or more reliably than what CI can do, and especially not with the run-to-run stability that allows easily comparable results.

From my observations a regression in Mono caused by a Libraries/shared framework change isn't treated as equally as a regression in coreclr or nativeaot. We really need the same level of investigation, passion and collaboration to address them in time for a Preview snap.

Mono changes, historically, were not part of the normal results seen for perf triage, didn't have documentation covering valid workflows across the range of devices (so it was difficult to even get working locally in the first place), didn't have integration with much of the core tooling used across the team, etc.

A lot of this has been actively improved, but there is still a lot missing, improvements needed, and general effort required (particularly on behalf of developers) to ensure that it is treated equivalently.

Us doing mono perf triage as part of the normal weekly triage has helped with this a lot. Ensuring that there is first class support for testing and validating everything in Windows or WSL would be another huge boon, as would ensuring that coreclr/mono can be built SxS and trivially tested end-to-end.

If we (Mono and PR authors working together) fail to fix the regression due to any other priorities from either side, we adopt a policy to revert a PR prior to Preview/RS snaps, till we have a good plan/proposal for a long-term fix and move ahead in the subsequent Preview.

We don't really do this for any other runtimes. We revert bugs, but perf regressions get filed, get investigated, and then the area owners make the decision around whether something is impactful enough to be reverted or if another fix can be done instead. Most often, an alternative fix goes in instead.

The single most impactful thing to help avoid these regressions would be ensuring that generic specialization for value types and SIMD support are on for all core platforms Mono supports. Beyond that, there is high reliance on the RyuJIT inlining heuristics and a few other key patterns (such as box elision and typeof(T) == typeof(...) checks being constant folded).
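As an illustration of the `typeof(T) == typeof(...)` pattern, here is a minimal sketch; `CodeUnitInfo` and `SizeInBytes` are hypothetical names, not runtime APIs. When a runtime compiles a separate body per value-type `T`, each comparison is a constant and only one branch survives. Without that specialization, the checks are evaluated on every call.

```csharp
using System;

static class CodeUnitInfo
{
    // With per-type specialization, SizeInBytes<byte>() can reduce to "return 1"
    // and SizeInBytes<char>() to "return 2"; the other branches are dead code.
    public static int SizeInBytes<T>() where T : unmanaged
    {
        if (typeof(T) == typeof(byte)) return 1;
        if (typeof(T) == typeof(char)) return 2;
        throw new NotSupportedException(typeof(T).Name);
    }
}
```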

Having up for grabs issues that give some details on how this work can be contributed to or assisted with would be massively helpful. I've done several larger contributions on the Mono SIMD side where possible, but there is still a lot more to be done.

stephentoub commented 1 year ago

From my observations a regression in Mono caused by a Libraries/shared framework change isn't treated as equally as a regression in coreclr or nativeaot. We really need the same level of investigation, passion and collaboration to address them in time for a Preview snap.

Preview 4 snap is two weeks from now; the regressions were flagged only yesterday and fixes are already in the works. With all due respect, what about the involvement in this issue suggests a lack of "investigation, passion, and collaboration"? I was exploring root causes and possible fixes on this until midnight last night and it's the only thing other than meetings I've done today. As I write this I'm taking a break from stumbling through creating the various builds to validate the changes I have on various flavors of mono. I really hope your perspective here isn't based on this issue. From my perspective, the right things have already happened here. I'm not sure why this is the one that's causing red flags to be raised.

Frankly, I think we should be celebrating (to some extent) when regressions like this happen, as they help to shine a light on real patterns real developers employ and that we’re deficient in handling. We investigate, we evaluate, we make fixes that address more than just the particular code path originally reported. All boats rise.

It will be time consuming

Too time consuming. I'm sorry, it's simply not practical, as Tanner outlined. If all regressions must be caught prior to merge, what's the point of our regression system at all? Things invariably slip through to being merged; there are too many operating systems and subsystems and backends and whatnot involved for that to not be the case, and we have zero pre-merge perf automation tooling in place. So when an issue arises, we address it. From my perspective, we're following exactly the playbook we all agreed to just a few months ago. If there's an action to be taken, it's on improving our ability to automate all of this.

I'd also like to call out that not all regressions are created equally. One of the worst regressions on mono reported in this set, for example, was a particular use of DateTime.ToString/TryFormat regressing 2x by ~300ns. For the server workloads that coreclr often enables, serving millions of requests per second and formatting millions of DateTimes per second, that kind of regression could actually make a noticeable difference. For the client workloads that mono is generally used for today, in what scenario is an extra 300ns when calling DateTime.ToString prohibitive? If we really care about that level of throughput on these microoperations on mono, then we should also care about the gap between coreclr and mono... if you look at the numbers I just posted in https://github.com/dotnet/runtime/pull/84587#issuecomment-1505850063, which were all done on the same machine on the same OS, there's a much larger gap between coreclr and mono than any of the regressions here represent. I'm in no way throwing shade at mono (it does some really cool things coreclr doesn't today, like support wasm); rather, I'm highlighting that they're focused on different things, and we need to use that difference as a lens through which we evaluate the severity of these kinds of issues.

SamMonoRT commented 1 year ago

All the right things are happening in this case, and I agree we should celebrate that the systems in place are catching issues and that we are actively investigating. My perspective is not built on this issue but on past occurrences. I will admit the mindset is improving for a certain set of individuals. This issue had 2-3 other follow-up PRs in the pipeline, and we want to avoid a scenario where more regressions slide in and we don't have time to fix them prior to the snap.

Taking too long to get results isn't an acceptable reason to not attempt to run those. I can check if some engineers outside Mono can get their hands on MacBooks, build on those, and run the microbenchmarks locally to get numbers prior to check-in.

The gap between coreclr and mono is huge; we all agree on that. We have been trying to narrow the gap over the past couple of years, but introducing more regressions in favor of other runtimes is only making it worse.

jeffhandley commented 1 year ago

At risk of being the stereotypical manager type that swoops in and simply repeats what others have already said... We are following our playbook here, and that's indeed worthy of celebration. Capturing here the details of that playbook and noting where it's working:

  1. ✅ Modify the perf v-team issue triage routines
    a. ✅ Mono team members will be included in the perf v-team issue triage routines and conversations, with details to be determined by the v-team and Larry
    b. ✅ Auto-filed issues will be assigned to the team member(s) who need to investigate (instead of tagging them through comments)
    c. ✅ The v-team will follow up with assignees to ensure they've acknowledged the assignment
  2. Improve auto-filed issue tooling
    a. ✅ GitHub labels will be updated to align better across the repositories
    b. Auto-filed issue generation will be updated with sorting, emphasis, and the ‘blocking’ label to highlight serious regressions
    c. (Semi-)automated commit bisect tooling will be reevaluated for potential engineering investment
  3. ✅ Invest in education about different optimization modes
    a. ✅ Documentation will be augmented to improve the team’s ability to understand, test, and profile more configurations
      • A couple pieces of documentation have been shared in this issue
    b. ✅ Libraries work that exercises new runtime features or patterns will be tested in more scenarios during implementation
      • As was noted, on the changes that caused this regression, we did test more scenarios, but more will never be enough
    c. ✅ Regressions will sneak through, and quick reverts are expected to be followed by updated implementations
      • We have halted further changes from compounding the regression
      • We are actively investigating the best ways to mitigate the regressions

I think we're demonstrating that our learnings last fall have been valuable. Concrete action items to help us go even further:

  1. [ ] @SamMonoRT Further improve documentation for measuring mono configurations
    • Is there content in that gist that needs to be promoted out into the docs?
    • It sounds like you're seeking review/refinement from someone outside the mono team on the docs that have been added; right?
  2. [ ] Get mono scenarios integrated into our automation such that /benchmark can be invoked on a PR for that coverage
    • @sblom I'm not sure who needs to be involved in that, can you advise?

sblom commented 1 year ago

Get mono scenarios integrated into our automation such that /benchmark can be invoked on a PR for that coverage

@cincuranet and @LoopedBard3 are where to start exploring our options for that.

lewing commented 1 year ago

It has been another long day, so I'm not going to go point by point through the history, but I think it is clear everyone here wants to improve the state of things and end up in a better place. I also assume we all know it isn't great when you have to wait for the once-a-week report on Tuesday morning to understand if your schedule is still intact. Doubly so when there are often long gaps between changes and reports due to infrastructure.

Text serialization is an ongoing concern for Wasm, and due to size constraints we have some very brute-force methods to decide what to AOT. It's not because we don't want to improve things. It would be great if big text serialization changes were tested on Wasm, with and without AOT, prior to landing them, even just as a heads-up. I haven't seen any feedback on the wasm docs. If they don't work for you, please file an issue. I'm not at all happy with how hard it is to make those measurements, but that is also not entirely within my control. That said, if the documentation or procedures are incorrect or too onerous, please file an issue so that we can work on that.

I'm optimistic we'll improve the performance of these changes soon but it doesn't always feel like we're working on solving this as a team.