Closed wilbit closed 4 months ago
Here are some results of #3487
Interesting observations

- `foreach` under net8.0 is faster by itself, as promised by the dotnet team.

BenchmarkDotNet v0.13.12, Windows 11 (10.0.22631.3085/23H2/2023Update/SunValley3)
AMD Ryzen 7 7840HS w/ Radeon 780M Graphics, 1 CPU, 16 logical and 8 physical cores
.NET SDK 8.0.101

- [Host]: .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
- .NET 6.0: .NET 6.0.26 (6.0.2623.60508), X64 RyuJIT AVX2
- .NET 8.0: .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
- .NET Framework 4.6.1: .NET Framework 4.8.1 (4.8.9181.0), X64 RyuJIT VectorSize=256
- .NET Framework 4.8: .NET Framework 4.8.1 (4.8.9181.0), X64 RyuJIT VectorSize=256

| Method    | Runtime              | Count | Mean           | Ratio | Gen0    | Allocated | Alloc Ratio |
|---------- |--------------------- |------ |---------------:|------:|--------:|----------:|------------:|
| Original  | .NET 6.0             | 0     | 9.3141 ns      | 1.00  | 0.0048  | 40 B      | 1.00        |
| Optimized | .NET 6.0             | 0     | 0.4081 ns      | 0.04  | -       | -         | 0.00        |
| Original  | .NET 8.0             | 0     | 5.7639 ns      | 1.00  | 0.0048  | 40 B      | 1.00        |
| Optimized | .NET 8.0             | 0     | 1.0110 ns      | 0.18  | -       | -         | 0.00        |
| Original  | .NET Framework 4.6.1 | 0     | 9.5384 ns      | 1.00  | 0.0064  | 40 B      | 1.00        |
| Optimized | .NET Framework 4.6.1 | 0     | 4.1706 ns      | 0.44  | -       | -         | 0.00        |
| Original  | .NET Framework 4.8   | 0     | 9.2218 ns      | 1.00  | 0.0064  | 40 B      | 1.00        |
| Optimized | .NET Framework 4.8   | 0     | 4.1541 ns      | 0.45  | -       | -         | 0.00        |
| Original  | .NET 6.0             | 1     | 17.0112 ns     | 1.00  | 0.0076  | 64 B      | 1.00        |
| Optimized | .NET 6.0             | 1     | 0.4041 ns      | 0.02  | -       | -         | 0.00        |
| Original  | .NET 8.0             | 1     | 10.1169 ns     | 1.00  | 0.0076  | 64 B      | 1.00        |
| Optimized | .NET 8.0             | 1     | 0.8015 ns      | 0.08  | -       | -         | 0.00        |
| Original  | .NET Framework 4.6.1 | 1     | 16.5842 ns     | 1.00  | 0.0102  | 64 B      | 1.00        |
| Optimized | .NET Framework 4.6.1 | 1     | 9.2648 ns      | 0.56  | -       | -         | 0.00        |
| Original  | .NET Framework 4.8   | 1     | 16.0324 ns     | 1.00  | 0.0102  | 64 B      | 1.00        |
| Optimized | .NET Framework 4.8   | 1     | 9.2660 ns      | 0.58  | -       | -         | 0.00        |
| Original  | .NET 6.0             | 100   | 863.0613 ns    | 1.00  | 0.2909  | 2440 B    | 1.00        |
| Optimized | .NET 6.0             | 100   | 94.1143 ns     | 0.11  | -       | -         | 0.00        |
| Original  | .NET 8.0             | 100   | 358.8725 ns    | 1.00  | 0.2913  | 2440 B    | 1.00        |
| Optimized | .NET 8.0             | 100   | 87.2346 ns     | 0.24  | -       | -         | 0.00        |
| Original  | .NET Framework 4.6.1 | 100   | 860.7073 ns    | 1.00  | 0.3881  | 2447 B    | 1.00        |
| Optimized | .NET Framework 4.6.1 | 100   | 670.9248 ns    | 0.78  | -       | -         | 0.00        |
| Original  | .NET Framework 4.8   | 100   | 858.8265 ns    | 1.00  | 0.3881  | 2447 B    | 1.00        |
| Optimized | .NET Framework 4.8   | 100   | 673.0063 ns    | 0.78  | -       | -         | 0.00        |
| Original  | .NET 6.0             | 10000 | 87,576.0531 ns | 1.00  | 28.6865 | 240040 B  | 1.00        |
| Optimized | .NET 6.0             | 10000 | 12,056.4313 ns | 0.14  | -       | -         | 0.00        |
| Original  | .NET 8.0             | 10000 | 47,547.0036 ns | 1.00  | 28.6865 | 240040 B  | 1.00        |
| Optimized | .NET 8.0             | 10000 | 11,860.9009 ns | 0.25  | -       | -         | 0.00        |
| Original  | .NET Framework 4.6.1 | 10000 | 86,953.4481 ns | 1.00  | 38.2080 | 240747 B  | 1.00        |
| Optimized | .NET Framework 4.6.1 | 10000 | 67,876.3753 ns | 0.78  | -       | -         | 0.00        |
| Original  | .NET Framework 4.8   | 10000 | 87,112.9826 ns | 1.00  | 38.2080 | 240747 B  | 1.00        |
| Optimized | .NET Framework 4.8   | 10000 | 67,801.7188 ns | 0.78  | -       | -         | 0.00        |

Legends:

- Count: Value of the 'Count' parameter
- Mean: Arithmetic mean of all measurements
- Error: Half of 99.9% confidence interval
- StdDev: Standard deviation of all measurements
- Ratio: Mean of the ratio distribution ([Current]/[Baseline])
- Gen0: GC Generation 0 collects per 1000 operations
- Allocated: Allocated memory per single operation (managed only, inclusive, 1KB = 1024B)
- Alloc Ratio: Allocated memory ratio distribution ([Current]/[Baseline])
- 1 ns: 1 Nanosecond (0.000000001 sec)
BenchmarkDotNet v0.13.12, Windows 11 (10.0.22631.3085/23H2/2023Update/SunValley3)
AMD Ryzen 7 7840HS w/ Radeon 780M Graphics, 1 CPU, 16 logical and 8 physical cores
.NET SDK 8.0.101

- [Host]: .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
- .NET 6.0: .NET 6.0.26 (6.0.2623.60508), X64 RyuJIT AVX2
- .NET 8.0: .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
- .NET Framework 4.6.1: .NET Framework 4.8.1 (4.8.9181.0), X64 RyuJIT VectorSize=256
- .NET Framework 4.8: .NET Framework 4.8.1 (4.8.9181.0), X64 RyuJIT VectorSize=256

| Method    | Runtime              | Count | Mean           | Ratio | RatioSD | Gen0   | Allocated | Alloc Ratio |
|---------- |--------------------- |------ |---------------:|------:|--------:|-------:|----------:|------------:|
| Original  | .NET 6.0             | 0     | 8.0176 ns      | 1.00  | 0.00    | 0.0048 | 40 B      | 1.00        |
| Optimized | .NET 6.0             | 0     | 0.2195 ns      | 0.03  | 0.00    | -      | -         | 0.00        |
| Original  | .NET 8.0             | 0     | 4.3865 ns      | 1.00  | 0.00    | 0.0048 | 40 B      | 1.00        |
| Optimized | .NET 8.0             | 0     | 1.0815 ns      | 0.25  | 0.01    | -      | -         | 0.00        |
| Original  | .NET Framework 4.6.1 | 0     | 8.5412 ns      | 1.00  | 0.00    | 0.0064 | 40 B      | 1.00        |
| Optimized | .NET Framework 4.6.1 | 0     | 4.1479 ns      | 0.49  | 0.01    | -      | -         | 0.00        |
| Original  | .NET Framework 4.8   | 0     | 8.3359 ns      | 1.00  | 0.00    | 0.0064 | 40 B      | 1.00        |
| Optimized | .NET Framework 4.8   | 0     | 4.1463 ns      | 0.50  | 0.01    | -      | -         | 0.00        |
| Original  | .NET 6.0             | 1     | 12.2678 ns     | 1.00  | 0.00    | 0.0048 | 40 B      | 1.00        |
| Optimized | .NET 6.0             | 1     | 0.3986 ns      | 0.03  | 0.00    | -      | -         | 0.00        |
| Original  | .NET 8.0             | 1     | 6.1827 ns      | 1.00  | 0.00    | 0.0048 | 40 B      | 1.00        |
| Optimized | .NET 8.0             | 1     | 0.7901 ns      | 0.13  | 0.00    | -      | -         | 0.00        |
| Original  | .NET Framework 4.6.1 | 1     | 12.1471 ns     | 1.00  | 0.00    | 0.0064 | 40 B      | 1.00        |
| Optimized | .NET Framework 4.6.1 | 1     | 6.5324 ns      | 0.54  | 0.01    | -      | -         | 0.00        |
| Original  | .NET Framework 4.8   | 1     | 12.1513 ns     | 1.00  | 0.00    | 0.0064 | 40 B      | 1.00        |
| Optimized | .NET Framework 4.8   | 1     | 6.3340 ns      | 0.52  | 0.01    | -      | -         | 0.00        |
| Original  | .NET 6.0             | 100   | 409.0528 ns    | 1.00  | 0.00    | 0.0048 | 40 B      | 1.00        |
| Optimized | .NET 6.0             | 100   | 88.2895 ns     | 0.22  | 0.00    | -      | -         | 0.00        |
| Original  | .NET 8.0             | 100   | 131.1248 ns    | 1.00  | 0.00    | 0.0048 | 40 B      | 1.00        |
| Optimized | .NET 8.0             | 100   | 106.0501 ns    | 0.81  | 0.00    | -      | -         | 0.00        |
| Original  | .NET Framework 4.6.1 | 100   | 359.8695 ns    | 1.00  | 0.00    | 0.0062 | 40 B      | 1.00        |
| Optimized | .NET Framework 4.6.1 | 100   | 368.7890 ns    | 1.02  | 0.02    | -      | -         | 0.00        |
| Original  | .NET Framework 4.8   | 100   | 395.9206 ns    | 1.00  | 0.00    | 0.0062 | 40 B      | 1.00        |
| Optimized | .NET Framework 4.8   | 100   | 328.0021 ns    | 0.83  | 0.00    | -      | -         | 0.00        |
| Original  | .NET 6.0             | 10000 | 39,363.4018 ns | 1.00  | 0.00    | -      | 40 B      | 1.00        |
| Optimized | .NET 6.0             | 10000 | 11,430.6216 ns | 0.30  | 0.01    | -      | -         | 0.00        |
| Original  | .NET 8.0             | 10000 | 13,431.0359 ns | 1.00  | 0.00    | -      | 40 B      | 1.00        |
| Optimized | .NET 8.0             | 10000 | 12,180.5906 ns | 0.91  | 0.01    | -      | -         | 0.00        |
| Original  | .NET Framework 4.6.1 | 10000 | 35,915.5304 ns | 1.00  | 0.00    | -      | 40 B      | 1.00        |
| Optimized | .NET Framework 4.6.1 | 10000 | 37,104.0580 ns | 1.03  | 0.02    | -      | -         | 0.00        |
| Original  | .NET Framework 4.8   | 10000 | 38,373.6949 ns | 1.00  | 0.00    | -      | 40 B      | 1.00        |
| Optimized | .NET Framework 4.8   | 10000 | 33,278.6272 ns | 0.87  | 0.02    | -      | -         | 0.00        |

Legends:

- Count: Value of the 'Count' parameter
- Mean: Arithmetic mean of all measurements
- Error: Half of 99.9% confidence interval
- StdDev: Standard deviation of all measurements
- Ratio: Mean of the ratio distribution ([Current]/[Baseline])
- RatioSD: Standard deviation of the ratio distribution ([Current]/[Baseline])
- Gen0: GC Generation 0 collects per 1000 operations
- Allocated: Allocated memory per single operation (managed only, inclusive, 1KB = 1024B)
- Alloc Ratio: Allocated memory ratio distribution ([Current]/[Baseline])
- 1 ns: 1 Nanosecond (0.000000001 sec)
BenchmarkDotNet v0.13.12, Windows 11 (10.0.22631.3085/23H2/2023Update/SunValley3)
AMD Ryzen 7 7840HS w/ Radeon 780M Graphics, 1 CPU, 16 logical and 8 physical cores
.NET SDK 8.0.101

- [Host]: .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
- .NET 6.0: .NET 6.0.26 (6.0.2623.60508), X64 RyuJIT AVX2
- .NET 8.0: .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
- .NET Framework 4.6.1: .NET Framework 4.8.1 (4.8.9181.0), X64 RyuJIT VectorSize=256
- .NET Framework 4.8: .NET Framework 4.8.1 (4.8.9181.0), X64 RyuJIT VectorSize=256

| Method    | Runtime              | Count | Mean           | Ratio | RatioSD | Gen0   | Allocated | Alloc Ratio |
|---------- |--------------------- |------ |---------------:|------:|--------:|-------:|----------:|------------:|
| Original  | .NET 6.0             | 0     | 7.9846 ns      | 1.00  | 0.00    | 0.0048 | 40 B      | 1.00        |
| Optimized | .NET 6.0             | 0     | 0.2054 ns      | 0.03  | 0.00    | -      | -         | 0.00        |
| Original  | .NET 8.0             | 0     | 4.1603 ns      | 1.00  | 0.00    | 0.0048 | 40 B      | 1.00        |
| Optimized | .NET 8.0             | 0     | 1.0892 ns      | 0.26  | 0.00    | -      | -         | 0.00        |
| Original  | .NET Framework 4.6.1 | 0     | 8.1432 ns      | 1.00  | 0.00    | 0.0064 | 40 B      | 1.00        |
| Optimized | .NET Framework 4.6.1 | 0     | 4.1538 ns      | 0.51  | 0.00    | -      | -         | 0.00        |
| Original  | .NET Framework 4.8   | 0     | 8.1652 ns      | 1.00  | 0.00    | 0.0064 | 40 B      | 1.00        |
| Optimized | .NET Framework 4.8   | 0     | 4.1680 ns      | 0.51  | 0.00    | -      | -         | 0.00        |
| Original  | .NET 6.0             | 1     | 12.8088 ns     | 1.00  | 0.00    | 0.0048 | 40 B      | 1.00        |
| Optimized | .NET 6.0             | 1     | 0.4319 ns      | 0.03  | 0.00    | -      | -         | 0.00        |
| Original  | .NET 8.0             | 1     | 6.1281 ns      | 1.00  | 0.00    | 0.0048 | 40 B      | 1.00        |
| Optimized | .NET 8.0             | 1     | 0.7996 ns      | 0.13  | 0.00    | -      | -         | 0.00        |
| Original  | .NET Framework 4.6.1 | 1     | 12.2347 ns     | 1.00  | 0.00    | 0.0064 | 40 B      | 1.00        |
| Optimized | .NET Framework 4.6.1 | 1     | 6.4721 ns      | 0.53  | 0.01    | -      | -         | 0.00        |
| Original  | .NET Framework 4.8   | 1     | 12.1104 ns     | 1.00  | 0.00    | 0.0064 | 40 B      | 1.00        |
| Optimized | .NET Framework 4.8   | 1     | 6.3979 ns      | 0.53  | 0.01    | -      | -         | 0.00        |
| Original  | .NET 6.0             | 100   | 420.2075 ns    | 1.00  | 0.00    | 0.0048 | 40 B      | 1.00        |
| Optimized | .NET 6.0             | 100   | 88.1461 ns     | 0.21  | 0.01    | -      | -         | 0.00        |
| Original  | .NET 8.0             | 100   | 130.6645 ns    | 1.00  | 0.00    | 0.0048 | 40 B      | 1.00        |
| Optimized | .NET 8.0             | 100   | 106.1793 ns    | 0.81  | 0.01    | -      | -         | 0.00        |
| Original  | .NET Framework 4.6.1 | 100   | 363.0120 ns    | 1.00  | 0.00    | 0.0062 | 40 B      | 1.00        |
| Optimized | .NET Framework 4.6.1 | 100   | 367.5376 ns    | 1.01  | 0.02    | -      | -         | 0.00        |
| Original  | .NET Framework 4.8   | 100   | 407.9413 ns    | 1.00  | 0.00    | 0.0062 | 40 B      | 1.00        |
| Optimized | .NET Framework 4.8   | 100   | 366.7885 ns    | 0.90  | 0.02    | -      | -         | 0.00        |
| Original  | .NET 6.0             | 10000 | 39,881.8502 ns | 1.00  | 0.00    | -      | 40 B      | 1.00        |
| Optimized | .NET 6.0             | 10000 | 11,197.6319 ns | 0.28  | 0.00    | -      | -         | 0.00        |
| Original  | .NET 8.0             | 10000 | 14,252.0129 ns | 1.00  | 0.00    | -      | 40 B      | 1.00        |
| Optimized | .NET 8.0             | 10000 | 12,354.0055 ns | 0.87  | 0.01    | -      | -         | 0.00        |
| Original  | .NET Framework 4.6.1 | 10000 | 36,516.3600 ns | 1.00  | 0.00    | -      | 40 B      | 1.00        |
| Optimized | .NET Framework 4.6.1 | 10000 | 37,105.5062 ns | 1.02  | 0.03    | -      | -         | 0.00        |
| Original  | .NET Framework 4.8   | 10000 | 39,317.8849 ns | 1.00  | 0.00    | -      | 40 B      | 1.00        |
| Optimized | .NET Framework 4.8   | 10000 | 37,042.1739 ns | 0.94  | 0.00    | -      | -         | 0.00        |

Legends:

- Count: Value of the 'Count' parameter
- Mean: Arithmetic mean of all measurements
- Error: Half of 99.9% confidence interval
- StdDev: Standard deviation of all measurements
- Ratio: Mean of the ratio distribution ([Current]/[Baseline])
- RatioSD: Standard deviation of the ratio distribution ([Current]/[Baseline])
- Gen0: GC Generation 0 collects per 1000 operations
- Allocated: Allocated memory per single operation (managed only, inclusive, 1KB = 1024B)
- Alloc Ratio: Allocated memory ratio distribution ([Current]/[Baseline])
- 1 ns: 1 Nanosecond (0.000000001 sec)
Because of an inefficient implementation of the enumerable pattern, it creates significant pressure on the GC.
Most of the traffic comes from boxed `KeyValuePair<CollectionEntry, IPersistentCollection>` values, but the enumerator could also be made a `struct` to remove allocations further.
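The boxing effect described above can be reproduced in isolation. The sketch below is not NHibernate code: it uses a plain `Dictionary<int, int>` as a stand-in, and `BoxingDemo`, `SumBoxed`, `SumUnboxed`, `Build`, and `AllocatedBy` are hypothetical names chosen for illustration. Enumerating through the non-generic `IEnumerable` returns each `KeyValuePair` as `object`, so every element is boxed; enumerating the concrete dictionary binds `foreach` to its `struct` enumerator and should avoid those allocations. It assumes a runtime that provides `GC.GetAllocatedBytesForCurrentThread` (.NET Core, or .NET Framework 4.8).

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class BoxingDemo
{
    // Non-generic enumeration: IEnumerator.Current is typed as object,
    // so each KeyValuePair (a struct) is boxed on every iteration.
    public static int SumBoxed(IEnumerable source)
    {
        int sum = 0;
        foreach (object item in source)
        {
            var pair = (KeyValuePair<int, int>)item; // unbox the element
            sum += pair.Value;
        }
        return sum;
    }

    // Concrete enumeration: foreach binds to the struct
    // Dictionary<int, int>.Enumerator, so elements are not boxed.
    public static int SumUnboxed(Dictionary<int, int> source)
    {
        int sum = 0;
        foreach (KeyValuePair<int, int> pair in source)
        {
            sum += pair.Value;
        }
        return sum;
    }

    // Builds a stand-in map with entries i -> i (illustrative data only).
    public static Dictionary<int, int> Build(int count)
    {
        var map = new Dictionary<int, int>(count);
        for (int i = 0; i < count; i++)
        {
            map[i] = i;
        }
        return map;
    }

    // Managed bytes allocated on the current thread while running the action.
    public static long AllocatedBy(Action action)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static void Main()
    {
        Dictionary<int, int> map = Build(100);

        // Warm up so that first-call costs do not distort the comparison.
        SumBoxed(map);
        SumUnboxed(map);

        Console.WriteLine("boxed:   " + AllocatedBy(() => SumBoxed(map)) + " B");
        Console.WriteLine("unboxed: " + AllocatedBy(() => SumUnboxed(map)) + " B");
    }
}
```

The exact byte counts depend on the runtime, so none are quoted here; the point is only that the boxed path allocates per element while the other path does not.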
During startup, our application caches a lot of data, so the image below shows our case.

![image](https://github.com/nhibernate/nhibernate-core/assets/687620/c5050e42-caac-4f21-9b9f-8e526f04594f)
I am working on a fix and checking my solution against the original using BenchmarkDotNet.
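For readers who want to set up a comparison of this shape, here is a minimal BenchmarkDotNet sketch. It is an assumption about how such a benchmark could be laid out, not the benchmark actually used: the class name `EnumerationBenchmarks`, the `Dictionary<int, int>` stand-in, and both method bodies are placeholders. Only the method names (`Original`, `Optimized`), the `Count` values, the four runtimes, and the memory columns are taken from the tables above. It requires the BenchmarkDotNet NuGet package and a project that multi-targets the listed frameworks.

```csharp
using System.Collections;
using System.Collections.Generic;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Running;

[MemoryDiagnoser]                     // adds the Gen0 / Allocated columns
[SimpleJob(RuntimeMoniker.Net60)]
[SimpleJob(RuntimeMoniker.Net80)]
[SimpleJob(RuntimeMoniker.Net461)]
[SimpleJob(RuntimeMoniker.Net48)]
public class EnumerationBenchmarks
{
    // Placeholder collection; the real benchmark would use the NHibernate type.
    private Dictionary<int, int> _map = new Dictionary<int, int>();

    [Params(0, 1, 100, 10000)]
    public int Count { get; set; }

    [GlobalSetup]
    public void Setup()
    {
        _map = new Dictionary<int, int>(Count);
        for (int i = 0; i < Count; i++)
        {
            _map[i] = i;
        }
    }

    // Baseline: placeholder for the existing code path. It enumerates via
    // the non-generic interface, which boxes every element.
    [Benchmark(Baseline = true)]
    public int Original()
    {
        int sum = 0;
        foreach (object item in (IEnumerable)_map)
        {
            sum += ((KeyValuePair<int, int>)item).Value;
        }
        return sum;
    }

    // Candidate: placeholder for the fix. It enumerates the concrete type
    // through its struct enumerator.
    [Benchmark]
    public int Optimized()
    {
        int sum = 0;
        foreach (KeyValuePair<int, int> pair in _map)
        {
            sum += pair.Value;
        }
        return sum;
    }
}

public static class BenchmarkProgram
{
    public static void Main(string[] args)
    {
        BenchmarkRunner.Run<EnumerationBenchmarks>();
    }
}
```

Marking `Original` as the baseline is what produces the `Ratio` and `Alloc Ratio` columns, and returning the sum from each method keeps the JIT from eliminating the loop as dead code.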