The goal of this study was simple: ensure that we ship .NET 6 without any performance regressions, especially in the configs not covered by .NET Performance Lab.
Windows: 7, 8.1, 10, 11, Server 2022, Server 2022 Core

| Operating System | Architecture | Processor Name | Comment |
| ----------------------- | ----- | ----------------------------------------------- |------------------------------|
| Windows 10.0.19043.1165 | X64 | AMD Ryzen Threadripper PRO 3945WX 12-Cores | |
| Windows 10.0.20348 | X64 | AMD EPYC 7452 | Windows Server 2022, VM |
| Windows 10.0.20348 | X64 | AMD EPYC 7452 | Windows Server 2022 Core, VM |
| Windows 10.0.18363.1621 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz | |
| Windows 8.1 | X64 | Intel Core i7-3610QM CPU 2.30GHz (Ivy Bridge) | |
| Windows 10.0.19042.685 | X64 | Intel Core i7-5557U CPU 3.10GHz (Broadwell) | |
| Windows 10.0.19043.1165 | X64 | Intel Core i7-6700 CPU 3.40GHz (Skylake) | |
| Windows 10.0.22454 | X64 | Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R) | |
| Windows 10.0.22451 | X64 | Intel Core i7-8700 CPU 3.20GHz (Coffee Lake) | |
| Windows 10.0.19042.1165 | X64 | Intel Core i9-9900T CPU 2.10GHz | |
| Windows 7 SP1 | X64 | Intel Core2 Duo CPU T9600 2.80GHz | ancient hardware |
| centos 8 | X64 | AMD EPYC 7452 | VM |
| debian 10 | X64 | AMD EPYC 7452 | VM |
| rhel 7 | X64 | AMD EPYC 7452 | VM |
| sles 15 | X64 | AMD EPYC 7452 | VM |
| opensuse-leap 15.3 | X64 | AMD EPYC 7452 | VM |
| ubuntu 18.04 | X64 | Intel Xeon CPU E5-1650 v4 3.60GHz | |
| ubuntu 18.04 | X64 | Intel Core i7-2720QM CPU 2.20GHz (Sandy Bridge) | |
| alpine 3.13 | X64 | Intel Core i7-7700 CPU 3.60GHz (Kaby Lake) | |
| ubuntu 16.04 | Arm64 | Qualcomm Centriq | |
| Windows 10.0.19043.1165 | Arm64 | Microsoft SQ1 3.0 GHz | |
| Windows 10.0.22000 | Arm64 | Microsoft SQ1 3.0 GHz | |
| Windows 10.0.19043.1165 | X86 | AMD Ryzen Threadripper PRO 3945WX 12-Cores | |
| Windows 10.0.18363.1621 | X86 | Intel Xeon CPU E5-1650 v4 3.60GHz | |
| Windows 10.0.19043.1165 | Arm | Microsoft SQ1 3.0 GHz | |
| macOS Big Sur 11.5.2 | X64 | Intel Core i5-4278U CPU 2.60GHz (Haswell) | |
| macOS Big Sur 11.5.2 | X64 | Intel Core i7-4870HQ CPU 2.50GHz (Haswell) | |
| macOS Big Sur 11.4 | X64 | Intel Core i7-5557U CPU 3.10GHz (Broadwell) | |
Most of the benchmarks were run on bare-metal machines, but some were executed on Azure VMs.
This would not have been possible without the help of @AndyAyersMS @BruceForstall @bwadswor @carlossanlop @danmoseley @jeffhandley @michaelgsharp @sharwell @smitpatel @vatsan-madhavan @wfurt, who contributed their results and time.
Everyone interested can download the data from here and here (GitHub does not support files larger than 100 MB, so I had to split the .NET 5 and 6 results into two separate archives). The full report generated by the tool is available here. The full report also contains improvements, so if you read it from the end you can see the biggest perf improvements.
Moreover, the full historical data, which again turned out to be extremely useful, is available here.
We have not changed the methodology since last year; for details, please read https://github.com/dotnet/runtime/issues/41871.
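The linked issue is the authoritative description of the methodology. As a rough illustration only of what separating a regression from noise can look like when two sets of timings are compared, here is a minimal Python sketch. The function name `classify`, the 5% ratio threshold, and the 0.3 ns noise floor are all hypothetical choices made for this example and are not taken from the study or its tooling.

```python
from statistics import median


def classify(baseline_ns, current_ns, threshold=0.05, noise_ns=0.3):
    """Label a benchmark by comparing the median times of two runs.

    threshold and noise_ns are illustrative assumptions, not values
    used by the study.
    """
    base = median(baseline_ns)
    curr = median(current_ns)

    # Differences smaller than the noise floor are not worth reporting.
    if abs(curr - base) < noise_ns:
        return "same"

    ratio = curr / base
    if ratio > 1 + threshold:
        return "slower"
    if ratio < 1 - threshold:
        return "faster"
    return "same"


if __name__ == "__main__":
    # Hypothetical timings in nanoseconds, not data from the study.
    print(classify([100, 101, 99], [120, 121, 119]))
```

A median comparison like this can still be misled by a multimodal benchmark, where repeated runs settle into different modes, which is one reason flagged results need manual review.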
Data
This year, we have covered more configs than ever! They are listed in the table above.
Regressions

By design

- `System.Memory.Span<Byte>.IndexOfAnyFourValues(Size: 512)`, `System.Memory.Span<Int32>.IndexOfAnyFourValues(Size: 512)`
- `System.Linq.Tests.Perf_Enumerable.TakeLastHalf(input: List)`
- `System.Collections.Concurrent.AddRemoveFromSameThreads*`
- `System.Tests.Perf_Random.NextDouble`, `System.Tests.Perf_Random.Next_int`, `System.Tests.Perf_Random.Next_int_int`, `System.Tests.Perf_Random.NextBytes_span`
- `System.IO.Tests.BinaryWriterExtendedTests.WriteAsciiCharArray(StringLengthInChars: 32)`
- `System.IO.Tests.Perf_FileStream.ReadAsync(fileSize: 1024, userBufferSize: 1024, options: Asynchronous)`: `FileStream` being 100% async now

Investigation in progress

- `System.Numerics.Tests.Perf_Matrix3x2.IsIdentityBenchmark`
- `System.Numerics.Tests.Perf_Vector3.DistanceBenchmark`, `System.Numerics.Tests.Perf_Vector2.DistanceBenchmark`
- `System.Globalization.Tests.StringEquality.Compare_Same_Upper(Count: 1024, Options: (en-US, OrdinalIgnoreCase))`, `System.Globalization.Tests.StringEquality.Compare_DifferentFirstChar(Count: 1024, Options: (en-US, Ordinal))`, `System.Buffers.Text.Tests.Utf8FormatterTests.FormatterInt64(value: 12345)`, `System.Tests.Perf_Int32.ToStringHex(value: 2147483647)`, `System.Globalization.Tests.StringEquality.Compare_Same_Upper(Count: 1024, Options: (en-US, Ordinal))`
- `System.Collections.ContainsKeyFalse<Int32, Int32>.SortedList(Size: 512)`
- `System.Collections.ContainsKeyFalse<Int32, Int32>.ConcurrentDictionary(Size: 512)`
- `System.Text.Json.Serialization.Tests.ReadJson<Int32>.DeserializeFromStream`
- `System.Text.Tests.Perf_StringBuilder.ctor_capacity(length: 100000)`, `System.Text.Tests.Perf_StringBuilder.ToString_MultipleSegments(length: 100000)` and `System.Text.Tests.Perf_StringBuilder.ctor_string(length: 100000)`
- `System.Collections.CtorDefaultSize<Int32>.ConcurrentBag`
- `System.Text.Perf_Utf8Encoding.GetBytes(Input: Cyrillic)`
- `System.Collections.Sort<IntClass>.Array(Size: 512)`
- `System.Threading.Tests.Perf_Timer.ShortScheduleAndDisposeWithFiringTimers`
- `PerfLabTests.DelegatePerf.DelegateInvoke`
- https://github.com/dotnet/runtime/issues/59152, macOS-specific
- `Microsoft.Extensions.Logging.ScopesOverheadBenchmark.FilteredByLevel_InsideScope(HasISupportLoggingScopeLogger: False, CaptureScopes: True)`

None of the regressions reported above is critical, but in my opinion, we should have a good understanding of https://github.com/dotnet/runtime/issues/59145 before we ship .NET 6.
Noise, flaky or multimodal
The following benchmarks showed up in the report generated by the tool, but were not actual regressions:

- `System.Collections.CopyTo<Int32>*`
- `System.Net.Primitives.Tests.IPAddressPerformanceTests.Ctor_Span(address: [16, 32, 48, 64, 80, ...])`
- `System.Buffers.Tests.ReadOnlySequenceTests<Byte>.SliceTenSegments`
- `System.Numerics.Tests.Perf_Matrix4x4.CreateReflectionBenchmark`
- `System.Memory.Span<Int32>.EndsWith(Size: 512)`
- `System.Memory.Span<Int32>.BinarySearch(Size: 512)` (https://github.com/dotnet/runtime/issues/56402)
- `PerfLabTests.CastingPerf.CheckArrayIsVariantGenericInterfaceNo`
- `System.Memory.ReadOnlySpan.IndexOfString(input: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAXAAAAAAAAAAAAAAAAAAAAAAAAAAAAA", value: "x", comparisonType: InvariantCultureIgnoreCase)`
- `System.Net.Security.Tests.SslStreamTests.ConcurrentReadWrite`
- `PerfLabTests.DelegatePerf.MulticastDelegateInvoke(length: 1000)`
- `System.Numerics.Tests.Perf_BitOperations.PopCount_uint`, `System.Numerics.Tests.Perf_BitOperations.LeadingZeroCount_uint`: memory alignment

Big thanks to everyone involved!