dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15k stars 4.67k forks source link

Perf_Timer.AsynchronousContention and Perf_Timer.SynchronousContention benchmark issue on Mono Interpreter #78209

Open cincuranet opened 1 year ago

cincuranet commented 1 year ago

Benchmarks Perf_Timer.AsynchronousContention and Perf_Timer.SynchronousContention are running very slow in dotnet-runtime-perf-slow on arm64/Mono/Interpreter pipeline (Performance Linux arm64 release mono Interpreter micro_mono perfampere NoJS False).

Trying to reproduce it on similar machine (VM in Azure - Ubuntu 20.04+arm64) was not successful. I and @BrzVlad were not able to build the runtime and run the benchmark on Helix machine.

This is recent run:

[2022/11/06 09:20:40][INFO] // Found 1 benchmarks:
[2022/11/06 09:20:40][INFO] //   Perf_Timer.AsynchronousContention: Job-WRTDBJ(PowerPlanMode=00000000-0000-0000-0000-000000000000, Toolchain=CoreRun, IterationTime=250.0000 ms, MaxIterationCount=20, MinIterationCount=15, WarmupCount=1)
[2022/11/06 09:20:40][INFO] 
[2022/11/06 09:20:40][INFO] // **************************
[2022/11/06 09:20:40][INFO] // Benchmark: Perf_Timer.AsynchronousContention: Job-WRTDBJ(PowerPlanMode=00000000-0000-0000-0000-000000000000, Toolchain=CoreRun, IterationTime=250.0000 ms, MaxIterationCount=20, MinIterationCount=15, WarmupCount=1)
[2022/11/06 09:20:40][INFO] // *** Execute ***
[2022/11/06 09:20:40][INFO] // Launch: 1 / 1
[2022/11/06 09:20:40][INFO] // Execute: /home/helixbot/work/ABE0092A/p/dotnet-mono/shared/Microsoft.NETCore.App/e2174923-60b2-43dd-b6a9-8441df07c287/corerun b5a5684a-08f5-4df4-b63f-edc7eea54efc.dll --anonymousPipes 541 542 --benchmarkName System.Threading.Tests.Perf_Timer.AsynchronousContention --job "PowerPlanMode=00000000-0000-0000-0000-000000000000, Toolchain=CoreRun, IterationTime=250.0000 ms, MaxIterationCount=20, MinIterationCount=15, WarmupCount=1" --benchmarkId 149 in /home/helixbot/work/ABE0092A/w/C6760A99/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/b5a5684a-08f5-4df4-b63f-edc7eea54efc/bin/Release/net8.0/publish
[2022/11/06 09:20:40][INFO] Failed to set up high priority. Make sure you have the right permissions. Message: Permission denied
[2022/11/06 09:20:40][INFO] // BeforeAnythingElse
[2022/11/06 09:20:40][INFO] 
[2022/11/06 09:20:40][INFO] // Benchmark Process Environment Information:
[2022/11/06 09:20:40][INFO] // Runtime=.NET 8.0.0 (42.42.42.42424), Arm64 AOT
[2022/11/06 09:20:40][INFO] // GC=Non-concurrent Workstation
[2022/11/06 09:20:40][INFO] // HardwareIntrinsics=
[2022/11/06 09:20:40][INFO] // Job: Job-QZTFHS(PowerPlanMode=00000000-0000-0000-0000-000000000000, IterationTime=250.0000 ms, MaxIterationCount=20, MinIterationCount=15, WarmupCount=1)
[2022/11/06 09:20:40][INFO] 
[2022/11/06 09:20:40][INFO] OverheadJitting  1: 1 op, 35480.00 ns, 35.4800 us/op
[2022/11/06 09:28:22][INFO] WorkloadJitting  1: 1 op, 461892819541.00 ns, 461.8928 s/op
[2022/11/06 09:28:22][INFO] 
[2022/11/06 09:35:53][INFO] WorkloadWarmup   1: 1 op, 450740774849.00 ns, 450.7408 s/op
[2022/11/06 09:35:53][INFO] 
[2022/11/06 09:35:53][INFO] // BeforeActualRun
[2022/11/06 09:43:23][INFO] WorkloadActual   1: 1 op, 450043300789.00 ns, 450.0433 s/op
[2022/11/06 09:50:51][INFO] WorkloadActual   2: 1 op, 447471732038.00 ns, 447.4717 s/op
[2022/11/06 09:58:18][INFO] WorkloadActual   3: 1 op, 447013733387.00 ns, 447.0137 s/op
[2022/11/06 10:05:46][INFO] WorkloadActual   4: 1 op, 448038099850.00 ns, 448.0381 s/op
[2022/11/06 10:13:14][INFO] WorkloadActual   5: 1 op, 448591081197.00 ns, 448.5911 s/op
[2022/11/06 10:20:41][INFO] WorkloadActual   6: 1 op, 447030699880.00 ns, 447.0307 s/op
[2022/11/06 10:28:08][INFO] WorkloadActual   7: 1 op, 447022293085.00 ns, 447.0223 s/op
[2022/11/06 10:35:37][INFO] WorkloadActual   8: 1 op, 448186147351.00 ns, 448.1861 s/op
[2022/11/06 10:43:04][INFO] WorkloadActual   9: 1 op, 447640623738.00 ns, 447.6406 s/op
[2022/11/06 10:50:33][INFO] WorkloadActual  10: 1 op, 448467970186.00 ns, 448.4680 s/op
[2022/11/06 10:58:01][INFO] WorkloadActual  11: 1 op, 447736490717.00 ns, 447.7365 s/op
[2022/11/06 11:05:29][INFO] WorkloadActual  12: 1 op, 448190031399.00 ns, 448.1900 s/op
[2022/11/06 11:12:59][INFO] WorkloadActual  13: 1 op, 449815026432.00 ns, 449.8150 s/op
Killed

It's eventually killed, because pipeline runs out of time.

Below is one from September that finished. But the times are still in 167s.

[2022/09/23 21:20:32][INFO] // Found 1 benchmarks:
[2022/09/23 21:20:32][INFO] //   Perf_Timer.AsynchronousContention: Job-DUILMZ(PowerPlanMode=00000000-0000-0000-0000-000000000000, Toolchain=CoreRun, IterationTime=250.0000 ms, MaxIterationCount=20, MinIterationCount=15, WarmupCount=1)
[2022/09/23 21:20:32][INFO] 
[2022/09/23 21:20:32][INFO] // **************************
[2022/09/23 21:20:32][INFO] // Benchmark: Perf_Timer.AsynchronousContention: Job-DUILMZ(PowerPlanMode=00000000-0000-0000-0000-000000000000, Toolchain=CoreRun, IterationTime=250.0000 ms, MaxIterationCount=20, MinIterationCount=15, WarmupCount=1)
[2022/09/23 21:20:32][INFO] // *** Execute ***
[2022/09/23 21:20:32][INFO] // Launch: 1 / 1
[2022/09/23 21:20:32][INFO] // Execute: /home/helixbot/work/B2030979/p/dotnet-mono/shared/Microsoft.NETCore.App/7b7a15c3-9555-44f0-a930-533a9f2547dc/corerun 3b989c30-9124-438b-b4b7-cd4fd717963a.dll --benchmarkName System.Threading.Tests.Perf_Timer.AsynchronousContention --job "PowerPlanMode=00000000-0000-0000-0000-000000000000, Toolchain=CoreRun, IterationTime=250.0000 ms, MaxIterationCount=20, MinIterationCount=15, WarmupCount=1" --benchmarkId 150 in /home/helixbot/work/B2030979/w/BDB30A21/e/performance/artifacts/bin/MicroBenchmarks/Release/net7.0/3b989c30-9124-438b-b4b7-cd4fd717963a/bin/Release/net7.0/publish
[2022/09/23 21:20:32][INFO] Failed to set up high priority. Make sure you have the right permissions. Message: Permission denied
[2022/09/23 21:20:32][INFO] // BeforeAnythingElse
[2022/09/23 21:20:32][INFO] 
[2022/09/23 21:20:32][INFO] // Benchmark Process Environment Information:
[2022/09/23 21:20:32][INFO] // Runtime=.NET 8.0.0 (42.42.42.42424), Arm64 AOT
[2022/09/23 21:20:32][INFO] // GC=Non-concurrent Workstation
[2022/09/23 21:20:32][INFO] // HardwareIntrinsics=
[2022/09/23 21:20:33][INFO] // Job: Job-GJKJNQ(PowerPlanMode=00000000-0000-0000-0000-000000000000, IterationTime=250.0000 ms, MaxIterationCount=20, MinIterationCount=15, WarmupCount=1)
[2022/09/23 21:20:33][INFO] 
[2022/09/23 21:20:33][INFO] OverheadJitting  1: 1 op, 77444.00 ns, 77.4440 us/op
[2022/09/23 21:23:32][INFO] WorkloadJitting  1: 1 op, 179874620266.00 ns, 179.8746 s/op
[2022/09/23 21:23:32][INFO] 
[2022/09/23 21:26:20][INFO] WorkloadWarmup   1: 1 op, 167329988104.00 ns, 167.3300 s/op
[2022/09/23 21:26:20][INFO] 
[2022/09/23 21:26:20][INFO] // BeforeActualRun
[2022/09/23 21:29:07][INFO] WorkloadActual   1: 1 op, 167549719942.00 ns, 167.5497 s/op
[2022/09/23 21:31:55][INFO] WorkloadActual   2: 1 op, 167296143533.00 ns, 167.2961 s/op
[2022/09/23 21:34:42][INFO] WorkloadActual   3: 1 op, 167278742826.00 ns, 167.2787 s/op
[2022/09/23 21:37:29][INFO] WorkloadActual   4: 1 op, 167452413521.00 ns, 167.4524 s/op
[2022/09/23 21:40:17][INFO] WorkloadActual   5: 1 op, 167320729619.00 ns, 167.3207 s/op
[2022/09/23 21:43:04][INFO] WorkloadActual   6: 1 op, 167412709896.00 ns, 167.4127 s/op
[2022/09/23 21:45:52][INFO] WorkloadActual   7: 1 op, 167387006109.00 ns, 167.3870 s/op
[2022/09/23 21:48:39][INFO] WorkloadActual   8: 1 op, 167426540496.00 ns, 167.4265 s/op
[2022/09/23 21:51:26][INFO] WorkloadActual   9: 1 op, 167294036599.00 ns, 167.2940 s/op
[2022/09/23 21:54:14][INFO] WorkloadActual  10: 1 op, 167317449832.00 ns, 167.3174 s/op
[2022/09/23 21:57:01][INFO] WorkloadActual  11: 1 op, 167463292190.00 ns, 167.4633 s/op
[2022/09/23 21:59:49][INFO] WorkloadActual  12: 1 op, 167331924572.00 ns, 167.3319 s/op
[2022/09/23 22:02:36][INFO] WorkloadActual  13: 1 op, 167407704498.00 ns, 167.4077 s/op
[2022/09/23 22:05:23][INFO] WorkloadActual  14: 1 op, 167382575539.00 ns, 167.3826 s/op
[2022/09/23 22:08:11][INFO] WorkloadActual  15: 1 op, 167165982642.00 ns, 167.1660 s/op
[2022/09/23 22:08:11][INFO] 
[2022/09/23 22:08:11][INFO] // AfterActualRun
[2022/09/23 22:10:58][INFO] WorkloadResult   1: 1 op, 167549719942.00 ns, 167.5497 s/op
[2022/09/23 22:10:58][INFO] WorkloadResult   2: 1 op, 167296143533.00 ns, 167.2961 s/op
[2022/09/23 22:10:58][INFO] WorkloadResult   3: 1 op, 167278742826.00 ns, 167.2787 s/op
[2022/09/23 22:10:58][INFO] WorkloadResult   4: 1 op, 167452413521.00 ns, 167.4524 s/op
[2022/09/23 22:10:58][INFO] WorkloadResult   5: 1 op, 167320729619.00 ns, 167.3207 s/op
[2022/09/23 22:10:58][INFO] WorkloadResult   6: 1 op, 167412709896.00 ns, 167.4127 s/op
[2022/09/23 22:10:58][INFO] WorkloadResult   7: 1 op, 167387006109.00 ns, 167.3870 s/op
[2022/09/23 22:10:58][INFO] WorkloadResult   8: 1 op, 167426540496.00 ns, 167.4265 s/op
[2022/09/23 22:10:58][INFO] WorkloadResult   9: 1 op, 167294036599.00 ns, 167.2940 s/op
[2022/09/23 22:10:58][INFO] WorkloadResult  10: 1 op, 167317449832.00 ns, 167.3174 s/op
[2022/09/23 22:10:58][INFO] WorkloadResult  11: 1 op, 167463292190.00 ns, 167.4633 s/op
[2022/09/23 22:10:58][INFO] WorkloadResult  12: 1 op, 167331924572.00 ns, 167.3319 s/op
[2022/09/23 22:10:58][INFO] WorkloadResult  13: 1 op, 167407704498.00 ns, 167.4077 s/op
[2022/09/23 22:10:58][INFO] WorkloadResult  14: 1 op, 167382575539.00 ns, 167.3826 s/op
[2022/09/23 22:10:58][INFO] WorkloadResult  15: 1 op, 167165982642.00 ns, 167.1660 s/op
[2022/09/23 22:10:58][INFO] GC:  3569 1 1 14720017296 1
[2022/09/23 22:10:58][INFO] Threading:  46000046 961262 1
[2022/09/23 22:10:58][INFO] 
[2022/09/23 22:10:58][INFO] // AfterAll
[2022/09/23 22:10:58][INFO] // Benchmark Process 3194 has exited with code 0.
[2022/09/23 22:10:58][INFO] 
[2022/09/23 22:10:58][INFO] Mean = 167.366 s, StdErr = 0.024 s (0.01%), N = 15, StdDev = 0.093 s
[2022/09/23 22:10:58][INFO] Min = 167.166 s, Q1 = 167.307 s, Median = 167.383 s, Q3 = 167.420 s, Max = 167.550 s
[2022/09/23 22:10:58][INFO] IQR = 0.113 s, LowerFence = 167.138 s, UpperFence = 167.589 s
[2022/09/23 22:10:58][INFO] ConfidenceInterval = [167.266 s; 167.466 s] (CI 99.9%), Margin = 0.100 s (0.06% of Mean)
[2022/09/23 22:10:58][INFO] Skewness = -0.11, Kurtosis = 2.65, MValue = 2

Similar behavior is for Perf_Timer.SynchronousContention.

dotnet-issue-labeler[bot] commented 1 year ago

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

ghost commented 1 year ago

Tagging subscribers to this area: @brzvlad See info in area-owners.md if you want to be subscribed.

Issue Details
Benchmarks `Perf_Timer.AsynchronousContention` and `Perf_Timer.SynchronousContention` are running very slow in _dotnet-runtime-perf-slow_ on arm64/Mono/Interpreter pipeline (`Performance Linux arm64 release mono Interpreter micro_mono perfampere NoJS False`). Trying to reproduce it on similar machine (VM in Azure - Ubuntu 20.04+arm64) was not successful. I and @BrzVlad were not able to build the runtime and run the benchmark on Helix machine. [This](https://dev.azure.com/dnceng/internal/_build/results?buildId=2038592&view=results) is recent run: ```text [2022/11/06 09:20:40][INFO] // Found 1 benchmarks: [2022/11/06 09:20:40][INFO] // Perf_Timer.AsynchronousContention: Job-WRTDBJ(PowerPlanMode=00000000-0000-0000-0000-000000000000, Toolchain=CoreRun, IterationTime=250.0000 ms, MaxIterationCount=20, MinIterationCount=15, WarmupCount=1) [2022/11/06 09:20:40][INFO] [2022/11/06 09:20:40][INFO] // ************************** [2022/11/06 09:20:40][INFO] // Benchmark: Perf_Timer.AsynchronousContention: Job-WRTDBJ(PowerPlanMode=00000000-0000-0000-0000-000000000000, Toolchain=CoreRun, IterationTime=250.0000 ms, MaxIterationCount=20, MinIterationCount=15, WarmupCount=1) [2022/11/06 09:20:40][INFO] // *** Execute *** [2022/11/06 09:20:40][INFO] // Launch: 1 / 1 [2022/11/06 09:20:40][INFO] // Execute: /home/helixbot/work/ABE0092A/p/dotnet-mono/shared/Microsoft.NETCore.App/e2174923-60b2-43dd-b6a9-8441df07c287/corerun b5a5684a-08f5-4df4-b63f-edc7eea54efc.dll --anonymousPipes 541 542 --benchmarkName System.Threading.Tests.Perf_Timer.AsynchronousContention --job "PowerPlanMode=00000000-0000-0000-0000-000000000000, Toolchain=CoreRun, IterationTime=250.0000 ms, MaxIterationCount=20, MinIterationCount=15, WarmupCount=1" --benchmarkId 149 in /home/helixbot/work/ABE0092A/w/C6760A99/e/performance/artifacts/bin/MicroBenchmarks/Release/net8.0/b5a5684a-08f5-4df4-b63f-edc7eea54efc/bin/Release/net8.0/publish [2022/11/06 09:20:40][INFO] Failed to set up high priority. Make sure you have the right permissions. Message: Permission denied [2022/11/06 09:20:40][INFO] // BeforeAnythingElse [2022/11/06 09:20:40][INFO] [2022/11/06 09:20:40][INFO] // Benchmark Process Environment Information: [2022/11/06 09:20:40][INFO] // Runtime=.NET 8.0.0 (42.42.42.42424), Arm64 AOT [2022/11/06 09:20:40][INFO] // GC=Non-concurrent Workstation [2022/11/06 09:20:40][INFO] // HardwareIntrinsics= [2022/11/06 09:20:40][INFO] // Job: Job-QZTFHS(PowerPlanMode=00000000-0000-0000-0000-000000000000, IterationTime=250.0000 ms, MaxIterationCount=20, MinIterationCount=15, WarmupCount=1) [2022/11/06 09:20:40][INFO] [2022/11/06 09:20:40][INFO] OverheadJitting 1: 1 op, 35480.00 ns, 35.4800 us/op [2022/11/06 09:28:22][INFO] WorkloadJitting 1: 1 op, 461892819541.00 ns, 461.8928 s/op [2022/11/06 09:28:22][INFO] [2022/11/06 09:35:53][INFO] WorkloadWarmup 1: 1 op, 450740774849.00 ns, 450.7408 s/op [2022/11/06 09:35:53][INFO] [2022/11/06 09:35:53][INFO] // BeforeActualRun [2022/11/06 09:43:23][INFO] WorkloadActual 1: 1 op, 450043300789.00 ns, 450.0433 s/op [2022/11/06 09:50:51][INFO] WorkloadActual 2: 1 op, 447471732038.00 ns, 447.4717 s/op [2022/11/06 09:58:18][INFO] WorkloadActual 3: 1 op, 447013733387.00 ns, 447.0137 s/op [2022/11/06 10:05:46][INFO] WorkloadActual 4: 1 op, 448038099850.00 ns, 448.0381 s/op [2022/11/06 10:13:14][INFO] WorkloadActual 5: 1 op, 448591081197.00 ns, 448.5911 s/op [2022/11/06 10:20:41][INFO] WorkloadActual 6: 1 op, 447030699880.00 ns, 447.0307 s/op [2022/11/06 10:28:08][INFO] WorkloadActual 7: 1 op, 447022293085.00 ns, 447.0223 s/op [2022/11/06 10:35:37][INFO] WorkloadActual 8: 1 op, 448186147351.00 ns, 448.1861 s/op [2022/11/06 10:43:04][INFO] WorkloadActual 9: 1 op, 447640623738.00 ns, 447.6406 s/op [2022/11/06 10:50:33][INFO] WorkloadActual 10: 1 op, 448467970186.00 ns, 448.4680 s/op [2022/11/06 10:58:01][INFO] WorkloadActual 11: 1 op, 447736490717.00 ns, 447.7365 s/op [2022/11/06 11:05:29][INFO] WorkloadActual 12: 1 op, 448190031399.00 ns, 448.1900 s/op [2022/11/06 11:12:59][INFO] WorkloadActual 13: 1 op, 449815026432.00 ns, 449.8150 s/op Killed ``` It's eventually killed, because pipeline runs out of time. Below is one from September that finished. But the times are still in _167s_. ```text [2022/09/23 21:20:32][INFO] // Found 1 benchmarks: [2022/09/23 21:20:32][INFO] // Perf_Timer.AsynchronousContention: Job-DUILMZ(PowerPlanMode=00000000-0000-0000-0000-000000000000, Toolchain=CoreRun, IterationTime=250.0000 ms, MaxIterationCount=20, MinIterationCount=15, WarmupCount=1) [2022/09/23 21:20:32][INFO] [2022/09/23 21:20:32][INFO] // ************************** [2022/09/23 21:20:32][INFO] // Benchmark: Perf_Timer.AsynchronousContention: Job-DUILMZ(PowerPlanMode=00000000-0000-0000-0000-000000000000, Toolchain=CoreRun, IterationTime=250.0000 ms, MaxIterationCount=20, MinIterationCount=15, WarmupCount=1) [2022/09/23 21:20:32][INFO] // *** Execute *** [2022/09/23 21:20:32][INFO] // Launch: 1 / 1 [2022/09/23 21:20:32][INFO] // Execute: /home/helixbot/work/B2030979/p/dotnet-mono/shared/Microsoft.NETCore.App/7b7a15c3-9555-44f0-a930-533a9f2547dc/corerun 3b989c30-9124-438b-b4b7-cd4fd717963a.dll --benchmarkName System.Threading.Tests.Perf_Timer.AsynchronousContention --job "PowerPlanMode=00000000-0000-0000-0000-000000000000, Toolchain=CoreRun, IterationTime=250.0000 ms, MaxIterationCount=20, MinIterationCount=15, WarmupCount=1" --benchmarkId 150 in /home/helixbot/work/B2030979/w/BDB30A21/e/performance/artifacts/bin/MicroBenchmarks/Release/net7.0/3b989c30-9124-438b-b4b7-cd4fd717963a/bin/Release/net7.0/publish [2022/09/23 21:20:32][INFO] Failed to set up high priority. Make sure you have the right permissions. Message: Permission denied [2022/09/23 21:20:32][INFO] // BeforeAnythingElse [2022/09/23 21:20:32][INFO] [2022/09/23 21:20:32][INFO] // Benchmark Process Environment Information: [2022/09/23 21:20:32][INFO] // Runtime=.NET 8.0.0 (42.42.42.42424), Arm64 AOT [2022/09/23 21:20:32][INFO] // GC=Non-concurrent Workstation [2022/09/23 21:20:32][INFO] // HardwareIntrinsics= [2022/09/23 21:20:33][INFO] // Job: Job-GJKJNQ(PowerPlanMode=00000000-0000-0000-0000-000000000000, IterationTime=250.0000 ms, MaxIterationCount=20, MinIterationCount=15, WarmupCount=1) [2022/09/23 21:20:33][INFO] [2022/09/23 21:20:33][INFO] OverheadJitting 1: 1 op, 77444.00 ns, 77.4440 us/op [2022/09/23 21:23:32][INFO] WorkloadJitting 1: 1 op, 179874620266.00 ns, 179.8746 s/op [2022/09/23 21:23:32][INFO] [2022/09/23 21:26:20][INFO] WorkloadWarmup 1: 1 op, 167329988104.00 ns, 167.3300 s/op [2022/09/23 21:26:20][INFO] [2022/09/23 21:26:20][INFO] // BeforeActualRun [2022/09/23 21:29:07][INFO] WorkloadActual 1: 1 op, 167549719942.00 ns, 167.5497 s/op [2022/09/23 21:31:55][INFO] WorkloadActual 2: 1 op, 167296143533.00 ns, 167.2961 s/op [2022/09/23 21:34:42][INFO] WorkloadActual 3: 1 op, 167278742826.00 ns, 167.2787 s/op [2022/09/23 21:37:29][INFO] WorkloadActual 4: 1 op, 167452413521.00 ns, 167.4524 s/op [2022/09/23 21:40:17][INFO] WorkloadActual 5: 1 op, 167320729619.00 ns, 167.3207 s/op [2022/09/23 21:43:04][INFO] WorkloadActual 6: 1 op, 167412709896.00 ns, 167.4127 s/op [2022/09/23 21:45:52][INFO] WorkloadActual 7: 1 op, 167387006109.00 ns, 167.3870 s/op [2022/09/23 21:48:39][INFO] WorkloadActual 8: 1 op, 167426540496.00 ns, 167.4265 s/op [2022/09/23 21:51:26][INFO] WorkloadActual 9: 1 op, 167294036599.00 ns, 167.2940 s/op [2022/09/23 21:54:14][INFO] WorkloadActual 10: 1 op, 167317449832.00 ns, 167.3174 s/op [2022/09/23 21:57:01][INFO] WorkloadActual 11: 1 op, 167463292190.00 ns, 167.4633 s/op [2022/09/23 21:59:49][INFO] WorkloadActual 12: 1 op, 167331924572.00 ns, 167.3319 s/op [2022/09/23 22:02:36][INFO] WorkloadActual 13: 1 op, 167407704498.00 ns, 167.4077 s/op [2022/09/23 22:05:23][INFO] WorkloadActual 14: 1 op, 167382575539.00 ns, 167.3826 s/op [2022/09/23 22:08:11][INFO] WorkloadActual 15: 1 op, 167165982642.00 ns, 167.1660 s/op [2022/09/23 22:08:11][INFO] [2022/09/23 22:08:11][INFO] // AfterActualRun [2022/09/23 22:10:58][INFO] WorkloadResult 1: 1 op, 167549719942.00 ns, 167.5497 s/op [2022/09/23 22:10:58][INFO] WorkloadResult 2: 1 op, 167296143533.00 ns, 167.2961 s/op [2022/09/23 22:10:58][INFO] WorkloadResult 3: 1 op, 167278742826.00 ns, 167.2787 s/op [2022/09/23 22:10:58][INFO] WorkloadResult 4: 1 op, 167452413521.00 ns, 167.4524 s/op [2022/09/23 22:10:58][INFO] WorkloadResult 5: 1 op, 167320729619.00 ns, 167.3207 s/op [2022/09/23 22:10:58][INFO] WorkloadResult 6: 1 op, 167412709896.00 ns, 167.4127 s/op [2022/09/23 22:10:58][INFO] WorkloadResult 7: 1 op, 167387006109.00 ns, 167.3870 s/op [2022/09/23 22:10:58][INFO] WorkloadResult 8: 1 op, 167426540496.00 ns, 167.4265 s/op [2022/09/23 22:10:58][INFO] WorkloadResult 9: 1 op, 167294036599.00 ns, 167.2940 s/op [2022/09/23 22:10:58][INFO] WorkloadResult 10: 1 op, 167317449832.00 ns, 167.3174 s/op [2022/09/23 22:10:58][INFO] WorkloadResult 11: 1 op, 167463292190.00 ns, 167.4633 s/op [2022/09/23 22:10:58][INFO] WorkloadResult 12: 1 op, 167331924572.00 ns, 167.3319 s/op [2022/09/23 22:10:58][INFO] WorkloadResult 13: 1 op, 167407704498.00 ns, 167.4077 s/op [2022/09/23 22:10:58][INFO] WorkloadResult 14: 1 op, 167382575539.00 ns, 167.3826 s/op [2022/09/23 22:10:58][INFO] WorkloadResult 15: 1 op, 167165982642.00 ns, 167.1660 s/op [2022/09/23 22:10:58][INFO] GC: 3569 1 1 14720017296 1 [2022/09/23 22:10:58][INFO] Threading: 46000046 961262 1 [2022/09/23 22:10:58][INFO] [2022/09/23 22:10:58][INFO] // AfterAll [2022/09/23 22:10:58][INFO] // Benchmark Process 3194 has exited with code 0. [2022/09/23 22:10:58][INFO] [2022/09/23 22:10:58][INFO] Mean = 167.366 s, StdErr = 0.024 s (0.01%), N = 15, StdDev = 0.093 s [2022/09/23 22:10:58][INFO] Min = 167.166 s, Q1 = 167.307 s, Median = 167.383 s, Q3 = 167.420 s, Max = 167.550 s [2022/09/23 22:10:58][INFO] IQR = 0.113 s, LowerFence = 167.138 s, UpperFence = 167.589 s [2022/09/23 22:10:58][INFO] ConfidenceInterval = [167.266 s; 167.466 s] (CI 99.9%), Margin = 0.100 s (0.06% of Mean) [2022/09/23 22:10:58][INFO] Skewness = -0.11, Kurtosis = 2.65, MValue = 2 ``` Similar behavior is for `Perf_Timer.SynchronousContention`.
Author: cincuranet
Assignees: -
Labels: `tenet-performance`, `tenet-performance-benchmarks`, `untriaged`, `area-Codegen-Interpreter-mono`
Milestone: -
kotlarmilos commented 8 months ago

Is this issue still relevant?

cincuranet commented 8 months ago

It is still very slow. But we also ramped up timeouts so it finishes nowadays.

Recent run:

[2024/01/02 22:12:56][INFO] // Benchmark: Perf_Timer.SynchronousContention: Job-CLJGNG(PowerPlanMode=00000000-0000-0000-0000-000000000000, Toolchain=CoreRun, IterationTime=250.0000 ms, MaxIterationCount=20, MinIterationCount=15, WarmupCount=1)
[2024/01/02 22:12:56][INFO] // *** Execute ***
[2024/01/02 22:12:56][INFO] // Launch: 1 / 1
[2024/01/02 22:12:56][INFO] // Execute: /home/helixbot/work/A33B0927/p/dotnet-mono/shared/Microsoft.NETCore.App/9c7003c2-5923-4365-808a-511761d019fa/corerun Job-CLJGNG.dll --anonymousPipes 243 244 --benchmarkName System.Threading.Tests.Perf_Timer.SynchronousContention --job "PowerPlanMode=00000000-0000-0000-0000-000000000000, Toolchain=CoreRun, IterationTime=250.0000 ms, MaxIterationCount=20, MinIterationCount=15, WarmupCount=1" --benchmarkId 197 in /home/helixbot/work/A33B0927/w/AB87093D/e/performance/artifacts/bin/MicroBenchmarks/Release/net9.0/Job-CLJGNG/bin/Release/net9.0/publish
[2024/01/02 22:12:56][INFO] // Failed to set up high priority (Permission denied). In order to run benchmarks with high priority, make sure you have the right permissions.
[2024/01/02 22:12:56][INFO] // BeforeAnythingElse
[2024/01/02 22:12:56][INFO] 
[2024/01/02 22:12:56][INFO] // Benchmark Process Environment Information:
[2024/01/02 22:12:56][INFO] // BenchmarkDotNet v0.13.11-nightly.20231126.107
[2024/01/02 22:12:56][INFO] // Runtime=.NET 9.0.0 (42.42.42.42424) using MonoVM, Arm64 AOT
[2024/01/02 22:12:56][INFO] // GC=Non-concurrent Workstation
[2024/01/02 22:12:56][INFO] // HardwareIntrinsics=
[2024/01/02 22:12:56][INFO] // Job: Job-DFDKRN(PowerPlanMode=00000000-0000-0000-0000-000000000000, IterationTime=250.0000 ms, MaxIterationCount=20, MinIterationCount=15, WarmupCount=1)
[2024/01/02 22:12:56][INFO] 
[2024/01/02 22:12:56][INFO] OverheadJitting  1: 1 op, 34560.00 ns, 34.5600 us/op
[2024/01/02 22:20:16][INFO] WorkloadJitting  1: 1 op, 440022687380.00 ns, 440.0227 s/op
[2024/01/02 22:20:16][INFO] 
[2024/01/02 22:27:31][INFO] WorkloadWarmup   1: 1 op, 435051974977.00 ns, 435.0520 s/op
[2024/01/02 22:27:31][INFO] 
[2024/01/02 22:27:31][INFO] // BeforeActualRun
[2024/01/02 22:34:47][INFO] WorkloadActual   1: 1 op, 435324910483.00 ns, 435.3249 s/op
[2024/01/02 22:42:02][INFO] WorkloadActual   2: 1 op, 435911093850.00 ns, 435.9111 s/op
[2024/01/02 22:49:18][INFO] WorkloadActual   3: 1 op, 435123083406.00 ns, 435.1231 s/op
[2024/01/02 22:56:34][INFO] WorkloadActual   4: 1 op, 436278624259.00 ns, 436.2786 s/op
[2024/01/02 23:03:50][INFO] WorkloadActual   5: 1 op, 436325228561.00 ns, 436.3252 s/op
[2024/01/02 23:11:07][INFO] WorkloadActual   6: 1 op, 436417129879.00 ns, 436.4171 s/op
[2024/01/02 23:18:22][INFO] WorkloadActual   7: 1 op, 435634211369.00 ns, 435.6342 s/op
[2024/01/02 23:25:38][INFO] WorkloadActual   8: 1 op, 436053297849.00 ns, 436.0533 s/op
[2024/01/02 23:32:53][INFO] WorkloadActual   9: 1 op, 435081302030.00 ns, 435.0813 s/op
[2024/01/02 23:40:09][INFO] WorkloadActual  10: 1 op, 435165339859.00 ns, 435.1653 s/op
[2024/01/02 23:47:25][INFO] WorkloadActual  11: 1 op, 436471928449.00 ns, 436.4719 s/op
[2024/01/02 23:54:41][INFO] WorkloadActual  12: 1 op, 435597729415.00 ns, 435.5977 s/op
[2024/01/03 00:01:57][INFO] WorkloadActual  13: 1 op, 436273359150.00 ns, 436.2734 s/op
[2024/01/03 00:09:14][INFO] WorkloadActual  14: 1 op, 437006510767.00 ns, 437.0065 s/op
[2024/01/03 00:16:31][INFO] WorkloadActual  15: 1 op, 436689159219.00 ns, 436.6892 s/op
[2024/01/03 00:16:31][INFO] 
[2024/01/03 00:16:31][INFO] // AfterActualRun
[2024/01/03 00:23:47][INFO] WorkloadResult   1: 1 op, 435324910483.00 ns, 435.3249 s/op
[2024/01/03 00:23:47][INFO] WorkloadResult   2: 1 op, 435911093850.00 ns, 435.9111 s/op
[2024/01/03 00:23:47][INFO] WorkloadResult   3: 1 op, 435123083406.00 ns, 435.1231 s/op
[2024/01/03 00:23:47][INFO] WorkloadResult   4: 1 op, 436278624259.00 ns, 436.2786 s/op
[2024/01/03 00:23:47][INFO] WorkloadResult   5: 1 op, 436325228561.00 ns, 436.3252 s/op
[2024/01/03 00:23:47][INFO] WorkloadResult   6: 1 op, 436417129879.00 ns, 436.4171 s/op
[2024/01/03 00:23:47][INFO] WorkloadResult   7: 1 op, 435634211369.00 ns, 435.6342 s/op
[2024/01/03 00:23:47][INFO] WorkloadResult   8: 1 op, 436053297849.00 ns, 436.0533 s/op
[2024/01/03 00:23:47][INFO] WorkloadResult   9: 1 op, 435081302030.00 ns, 435.0813 s/op
[2024/01/03 00:23:47][INFO] WorkloadResult  10: 1 op, 435165339859.00 ns, 435.1653 s/op
[2024/01/03 00:23:47][INFO] WorkloadResult  11: 1 op, 436471928449.00 ns, 436.4719 s/op
[2024/01/03 00:23:47][INFO] WorkloadResult  12: 1 op, 435597729415.00 ns, 435.5977 s/op
[2024/01/03 00:23:47][INFO] WorkloadResult  13: 1 op, 436273359150.00 ns, 436.2734 s/op
[2024/01/03 00:23:47][INFO] WorkloadResult  14: 1 op, 437006510767.00 ns, 437.0065 s/op
[2024/01/03 00:23:47][INFO] WorkloadResult  15: 1 op, 436689159219.00 ns, 436.6892 s/op
[2024/01/03 00:23:47][INFO] // GC:  3011 1 1 12160027808 1
[2024/01/03 00:23:47][INFO] // Threading:  80 18 1
[2024/01/03 00:23:47][INFO] 
[2024/01/03 00:23:47][INFO] // AfterAll
[2024/01/03 00:23:47][INFO] // Benchmark Process 1134683 has exited with code 0.
[2024/01/03 00:23:47][INFO] 
[2024/01/03 00:23:47][INFO] Mean = 435.957 s, StdErr = 0.157 s (0.04%), N = 15, StdDev = 0.609 s
[2024/01/03 00:23:47][INFO] Min = 435.081 s, Q1 = 435.461 s, Median = 436.053 s, Q3 = 436.371 s, Max = 437.007 s
[2024/01/03 00:23:47][INFO] IQR = 0.910 s, LowerFence = 434.097 s, UpperFence = 437.736 s
[2024/01/03 00:23:47][INFO] ConfidenceInterval = [435.306 s; 436.608 s] (CI 99.9%), Margin = 0.651 s (0.15% of Mean)
[2024/01/03 00:23:47][INFO] Skewness = -0.05, Kurtosis = 1.61, MValue = 2
kotlarmilos commented 8 months ago

Ok, thanks. I will try to reproduce it again.

kotlarmilos commented 1 month ago

@cincuranet Is this still relevant?

cincuranet commented 1 month ago

@sblom @LoopedBard3 @caaavik-msft @DrewScoggins PTAL

LoopedBard3 commented 1 month ago
This does still seem to be a Scenario with much higher run times. This is the average result of the last 30 days for the AsynchronousContention test sliced by runconfig and queue, and sorted slowest to fastest: RunConfiguration Queue Result (sec)
{"CompilationMode":"tiered","RunKind":"micro_mono","LLVM":"false","MonoInterpreter":"true","MonoAOT":"false"} Ubuntu.2204.Arm64.Perf 442.71767679941189
{"CompilationMode":"Tiered","RunKind":"micro","PGOType":"nodynamicpgo"} Windows.Server.Arm64.Perf 331.45373260975771
{"CompilationMode":"tiered","RunKind":"micro","R2RType":"nor2r"} Ubuntu.2204.Arm64.Perf 329.68414371355146
{"CompilationMode":"tiered","RunKind":"micro","PGOType":"nodynamicpgo"} Ubuntu.2204.Arm64.Perf 329.42461407338243
{"CompilationMode":"tiered","RunKind":"micro"} Ubuntu.2204.Arm64.Perf 329.14762180539196
{"CompilationMode":"Tiered","RunKind":"micro","R2RType":"nor2r"} Windows.Server.Arm64.Perf 327.97486255683566
{"CompilationMode":"Tiered","RunKind":"micro"} Windows.Server.Arm64.Perf 326.56740018672349
{"CompilationMode":"Tiered","RunKind":"micro","ExperimentName":"jitoptrepeat"} Windows.11.Amd64.Viper.Perf 20.132713217645968
{"CompilationMode":"Tiered","RunKind":"micro"} Windows.11.Amd64.Viper.Perf 19.218769350140207
{"CompilationMode":"tiered","RunKind":"micro","ExperimentName":"jitoptrepeat"} Ubuntu.2204.Amd64.Viper.Perf 18.377636735868517
{"CompilationMode":"tiered","RunKind":"micro"} Ubuntu.2204.Amd64.Viper.Perf 18.336005699945044
{"CompilationMode":"tiered","RunKind":"micro_mono","LLVM":"false","MonoInterpreter":"true","MonoAOT":"false"} Ubuntu.2204.Amd64.Tiger.Perf 10.494778979082861
{"CompilationMode":"tiered","RunKind":"micro_mono","LLVM":"false","MonoInterpreter":"false","MonoAOT":"false"} Ubuntu.2204.Amd64.Tiger.Perf 8.361800168675801
{"CompilationMode":"Tiered","RunKind":"micro","ExperimentName":"rlcse"} Windows.11.Amd64.Owl.Perf 5.4697047239147043
{"CompilationMode":"Tiered","RunKind":"micro"} Windows.11.Amd64.Owl.Perf 5.4363181450833817
{"CompilationMode":"tiered","RunKind":"micro"} Ubuntu.2204.Amd64.Owl.Perf 4.7835471147354074
{"CompilationMode":"tiered","RunKind":"micro","ExperimentName":"rlcse"} Ubuntu.2204.Amd64.Owl.Perf 4.664173323838698
{"CompilationMode":"Tiered","RunKind":"micro","R2RType":"nor2r"} Windows.11.Amd64.Tiger.Perf 3.351061666715089
{"CompilationMode":"Tiered","RunKind":"micro"} Windows.11.Amd64.Tiger.Perf 3.281750910996307
{"CompilationMode":"Tiered","RunKind":"micro","PGOType":"nodynamicpgo"} Windows.11.Amd64.Tiger.Perf 3.1858667317420855
{"CompilationMode":"Tiered","RunKind":"micro"} Windows.11.Arm64.Surf.Perf 2.0885703081562714
{"CompilationMode":"tiered","RunKind":"micro","R2RType":"nor2r"} Ubuntu.2204.Amd64.Tiger.Perf 1.9943783126421664
{"CompilationMode":"tiered","RunKind":"micro"} Ubuntu.2204.Amd64.Tiger.Perf 1.9633307239576763
{"CompilationMode":"tiered","RunKind":"micro"} alpine.amd64.tiger.perf 1.9575610596830739
{"CompilationMode":"tiered","RunKind":"micro","PGOType":"nodynamicpgo"} Ubuntu.2204.Amd64.Tiger.Perf 1.9388397557867425
{"CompilationMode":"tiered","RunKind":"micro"} Ubuntu.2204.Amd64 0.60148117751757435

I think it is clear that Arm64 is still substantially slower than other architectures resulting in longer test runtime.