Open adamsitnik opened 2 years ago
Tagging subscribers to this area: @dotnet/area-system-runtime See info in area-owners.md if you want to be subscribed.
Author: | adamsitnik |
---|---|
Assignees: | - |
Labels: | `area-System.Runtime`, `os-mac-os-x`, `tenet-performance` |
Milestone: | - |
System.Tests.Perf_GC<Byte>.NewOperator_Array(length: 1000)
and a few similar benchmarks have regressed as well.
Tagging subscribers to this area: @dotnet/gc See info in area-owners.md if you want to be subscribed.
Author: | adamsitnik |
---|---|
Assignees: | - |
Labels: | `os-mac-os-x`, `tenet-performance`, `area-GC-coreclr`, `untriaged` |
Milestone: | - |
The issue still persists with preview2; some benchmarks are up to 5 times slower. It's specific to macOS x64.
Type | Method | Job | Runtime | Toolchain | Size | Mean | Error | StdDev | Median | Min | Max | Ratio | RatioSD | Gen 0 | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
CtorGivenSize<Int32> | Array | Job-YIPOBQ | .NET 6.0 | net6.0 | 512 | 196.4 ns | 4.55 ns | 5.24 ns | 193.7 ns | 190.9 ns | 209.7 ns | 1.00 | 0.00 | 0.9899 | 2.02 KB | 1.00 |
CtorGivenSize<Int32> | Array | Job-FCGZAV | .NET 7.0 | net7.0 | 512 | 1,062.0 ns | 71.56 ns | 82.40 ns | 1,005.3 ns | 992.6 ns | 1,224.5 ns | 5.40 | 0.34 | 12.6558 | 2.02 KB | 1.00 |
CtorGivenSize<String> | Array | Job-YIPOBQ | .NET 6.0 | net6.0 | 512 | 414.2 ns | 8.11 ns | 8.32 ns | 415.9 ns | 382.5 ns | 419.2 ns | 1.00 | 0.00 | 1.9677 | 4.02 KB | 1.00 |
CtorGivenSize<String> | Array | Job-FCGZAV | .NET 7.0 | net7.0 | 512 | 2,074.3 ns | 7.78 ns | 6.07 ns | 2,075.9 ns | 2,066.8 ns | 2,086.6 ns | 5.02 | 0.14 | 25.0000 | 4.02 KB | 1.00 |
Thanks for reporting! @mokosan will be investigating this.
Was able to repro this on my Mac after upgrading to Monterey. The pattern I am observing is that we have regressed for cases where we are not pinning and have improved for cases where we are pinning. Will be investigating this further. As a note, this issue regressed before we enabled Regions and is seemingly an OS-specific regression.
BenchmarkDotNet=v0.13.1.1823-nightly, OS=macOS Monterey 12.3 (21E230) [Darwin 21.4.0]
Intel Core i5-8210Y CPU 1.60GHz (Amber Lake Y), 1 CPU, 4 logical and 2 physical cores
.NET SDK=7.0.100-rc.1.22409.23
  [Host]     : .NET 6.0.8 (6.0.822.36306), X64 RyuJIT
  Job-WCHAYM : .NET 6.0.8 (6.0.822.36306), X64 RyuJIT
PowerPlanMode=00000000-0000-0000-0000-000000000000 IterationTime=250.0000 ms MaxIterationCount=20
MinIterationCount=15 WarmupCount=1
Type | Method | length | pinned | Mean | Error | StdDev | Median | Min | Max | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Perf_GC | AllocateUninitializedArray | 1000 | False | 116.9 ns | 16.03 ns | 17.15 ns | 109.6 ns | 98.29 ns | 148.8 ns | 0.4892 | - | - | 1 KB |
Perf_GC | AllocateUninitializedArray | 1000 | False | 191.5 ns | 5.27 ns | 5.64 ns | 191.1 ns | 181.34 ns | 204.6 ns | 0.9671 | - | - | 1.98 KB |
Perf_GC | AllocateUninitializedArray | 1000 | True | 524.4 ns | 68.27 ns | 75.88 ns | 519.6 ns | 401.84 ns | 672.0 ns | 1.9287 | 1.9287 | 0.3236 | 1 KB |
Perf_GC | AllocateUninitializedArray | 1000 | True | 724.6 ns | 31.27 ns | 36.01 ns | 736.8 ns | 604.18 ns | 762.8 ns | 3.8248 | 3.8248 | 0.6398 | 1.98 KB |
Perf_GC | AllocateUninitializedArray | 10000 | False | 595.0 ns | 23.05 ns | 25.62 ns | 591.3 ns | 541.17 ns | 638.5 ns | 4.7601 | - | - | 9.79 KB |
Perf_GC | AllocateUninitializedArray | 10000 | False | 1,048.9 ns | 37.94 ns | 43.69 ns | 1,047.5 ns | 926.95 ns | 1,106.3 ns | 9.5235 | - | - | 19.55 KB |
Perf_GC | AllocateUninitializedArray | 10000 | True | 2,625.9 ns | 267.35 ns | 286.07 ns | 2,653.5 ns | 1,862.02 ns | 2,958.7 ns | 18.7467 | 18.7467 | 3.1244 | 9.79 KB |
Perf_GC | AllocateUninitializedArray | 10000 | True | 5,790.2 ns | 185.26 ns | 205.92 ns | 5,788.5 ns | 5,389.56 ns | 6,071.2 ns | 36.6209 | 36.6209 | 6.1206 | 19.56 KB |
BenchmarkDotNet=v0.13.1.1823-nightly, OS=macOS Monterey 12.3 (21E230) [Darwin 21.4.0]
Intel Core i5-8210Y CPU 1.60GHz (Amber Lake Y), 1 CPU, 4 logical and 2 physical cores
.NET SDK=7.0.100-rc.1.22409.23
  [Host]     : .NET 7.0.0 (7.0.22.40308), X64 RyuJIT
  Job-LJYITB : .NET 7.0.0 (7.0.22.40308), X64 RyuJIT
PowerPlanMode=00000000-0000-0000-0000-000000000000 IterationTime=250.0000 ms MaxIterationCount=20
MinIterationCount=15 WarmupCount=1
Type | Method | length | pinned | Mean | Error | StdDev | Median | Min | Max | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Perf_GC | AllocateUninitializedArray | 1000 | False | 583.9 ns | 21.98 ns | 23.52 ns | 578.2 ns | 560.3 ns | 641.4 ns | 6.2490 | - | - | 1 KB |
Perf_GC | AllocateUninitializedArray | 1000 | False | 1,189.7 ns | 23.07 ns | 21.58 ns | 1,194.1 ns | 1,153.3 ns | 1,238.0 ns | 12.3433 | - | - | 1.98 KB |
Perf_GC | AllocateUninitializedArray | 1000 | True | 356.9 ns | 4.14 ns | 3.87 ns | 356.9 ns | 349.5 ns | 362.9 ns | 1.9396 | 1.9396 | 0.3244 | 1 KB |
Perf_GC | AllocateUninitializedArray | 1000 | True | 535.5 ns | 17.27 ns | 19.89 ns | 533.8 ns | 505.7 ns | 575.0 ns | 3.8257 | 3.8257 | 0.6394 | 1.98 KB |
Perf_GC | AllocateUninitializedArray | 10000 | False | 5,174.0 ns | 100.51 ns | 98.72 ns | 5,149.5 ns | 5,060.3 ns | 5,379.8 ns | 57.1254 | 2.0353 | - | 9.79 KB |
Perf_GC | AllocateUninitializedArray | 10000 | False | 5,411.1 ns | 135.97 ns | 151.14 ns | 5,351.5 ns | 5,211.1 ns | 5,692.5 ns | 57.2404 | 3.8174 | - | 19.55 KB |
Perf_GC | AllocateUninitializedArray | 10000 | True | 2,052.9 ns | 45.34 ns | 52.22 ns | 2,058.8 ns | 1,974.3 ns | 2,177.5 ns | 18.7315 | 18.7315 | 3.1286 | 9.79 KB |
Perf_GC | AllocateUninitializedArray | 10000 | True | 3,935.5 ns | 63.14 ns | 59.06 ns | 3,921.6 ns | 3,871.0 ns | 4,079.2 ns | 36.6549 | 36.6549 | 6.1220 | 19.56 KB |
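For anyone who wants to compare the two runs above without eyeballing them, here is a rough sketch that divides the .NET 7.0 mean by the .NET 6.0 mean for each row. The numbers are copied from the `Mean` columns of the two tables; the dictionary names, the `ratios` helper, and the use of the `Allocated` value as a key to tell otherwise identical rows apart are my own choices for illustration, not part of the benchmark suite.

```python
# Mean times in ns, copied from the Mean column of the two tables above.
# Key: (length, pinned, allocated). The "allocated" string is only used here
# to tell apart rows that share the same length/pinned values (an assumption
# made for this sketch, not something BenchmarkDotNet does).
NET6_MEAN_NS = {
    (1000, False, "1 KB"): 116.9,
    (1000, False, "1.98 KB"): 191.5,
    (1000, True, "1 KB"): 524.4,
    (1000, True, "1.98 KB"): 724.6,
    (10000, False, "9.79 KB"): 595.0,
    (10000, False, "19.55 KB"): 1048.9,
    (10000, True, "9.79 KB"): 2625.9,
    (10000, True, "19.56 KB"): 5790.2,
}

NET7_MEAN_NS = {
    (1000, False, "1 KB"): 583.9,
    (1000, False, "1.98 KB"): 1189.7,
    (1000, True, "1 KB"): 356.9,
    (1000, True, "1.98 KB"): 535.5,
    (10000, False, "9.79 KB"): 5174.0,
    (10000, False, "19.55 KB"): 5411.1,
    (10000, True, "9.79 KB"): 2052.9,
    (10000, True, "19.56 KB"): 3935.5,
}


def ratios(baseline, candidate):
    """Return candidate/baseline per row; a value above 1.0 means slower."""
    return {key: candidate[key] / baseline[key] for key in baseline}


if __name__ == "__main__":
    for (length, pinned, allocated), ratio in ratios(NET6_MEAN_NS, NET7_MEAN_NS).items():
        verdict = "slower" if ratio > 1.0 else "faster"
        print(f"length={length:<6} pinned={pinned!s:<5} allocated={allocated:<9} "
              f"7.0/6.0 = {ratio:.2f} ({verdict})")
```

On these numbers every unpinned row has a ratio above 1.0 (roughly 5x to 9x slower) and every pinned row has a ratio below 1.0, which matches the pinned vs. not pinned pattern described above. This is a single run on one machine, so treat the exact ratios as indicative only.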
Here are the results from the 6.0 vs 7.0-rc2 report; we will tag this as "Under investigation" in the report.
Here are the results from the 7.0 vs 6.0 report confirming the regression.
Closing in favor of https://github.com/dotnet/runtime/issues/73592
Reopening to test this thoroughly on macOS.
`GC.AllocateUninitializedArray` benchmarks have regressed by 15% for the "smaller size" (1000 elements) only on macOS (x64; I don't have arm64 data). Other Unixes are not affected, and larger sizes (10000 elements) are not affected. Repro:
cc @VSadov