Open mgravell opened 6 years ago
I figured the best thing to do would be to spike it and measure the impact; so I whipped something together based on List<T>
and ran it through BenchmarkDotNet
; testing Add
in particular (respecting all the same features like version-tracking for enumerator breaking), and making sure to check for ObjectDisposedException
(rather than failing in more obscure ways) - over a range of final element sizes, for both "presized" (given the right length as a hint) and allowed to size itself automatically (no size hint).
Here's the gist, and here's the results (wow this took a long time to run!).
Conclusion: the only time it is even in doubt is for very small lists - pretty sure this could be optimized a little, too.
Method | Job | Runtime | Presized | FillCount | Mean | Error | StdDev | Median | Scaled | ScaledSD | Gen 0 | Gen 1 | Allocated |
------------------ |----- |-------- |--------- |---------- |------------:|------------:|------------:|------------:|-------:|---------:|-------:|-------:|----------:|
PopulateList | Clr | Clr | False | 0 | 11.49 ns | 0.2290 ns | 0.4520 ns | 11.42 ns | 1.00 | 0.00 | 0.0063 | - | 40 B |
PopulateArrayList | Clr | Clr | False | 0 | 20.24 ns | 0.4019 ns | 0.5634 ns | 20.18 ns | 1.76 | 0.08 | 0.0051 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Core | Core | False | 0 | 10.27 ns | 0.2044 ns | 0.3301 ns | 10.23 ns | 1.00 | 0.00 | 0.0063 | - | 40 B |
PopulateArrayList | Core | Core | False | 0 | 20.14 ns | 0.3893 ns | 0.7019 ns | 20.24 ns | 1.96 | 0.09 | 0.0051 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Clr | Clr | False | 5 | 77.81 ns | 1.5428 ns | 3.3865 ns | 77.59 ns | 1.00 | 0.00 | 0.0292 | - | 184 B |
PopulateArrayList | Clr | Clr | False | 5 | 108.42 ns | 2.1168 ns | 3.3574 ns | 107.95 ns | 1.40 | 0.07 | 0.0050 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Core | Core | False | 5 | 69.44 ns | 1.3769 ns | 2.8126 ns | 70.29 ns | 1.00 | 0.00 | 0.0292 | - | 184 B |
PopulateArrayList | Core | Core | False | 5 | 96.79 ns | 1.9044 ns | 3.2338 ns | 96.70 ns | 1.40 | 0.07 | 0.0050 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Clr | Clr | False | 10 | 137.39 ns | 2.7091 ns | 5.2838 ns | 137.05 ns | 1.00 | 0.00 | 0.0532 | - | 336 B |
PopulateArrayList | Clr | Clr | False | 10 | 113.24 ns | 2.1469 ns | 2.6366 ns | 113.39 ns | 0.83 | 0.04 | 0.0050 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Core | Core | False | 10 | 121.26 ns | 2.4120 ns | 4.4709 ns | 120.94 ns | 1.00 | 0.00 | 0.0533 | - | 336 B |
PopulateArrayList | Core | Core | False | 10 | 105.56 ns | 2.0212 ns | 2.7667 ns | 106.13 ns | 0.87 | 0.04 | 0.0050 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Clr | Clr | False | 15 | 157.02 ns | 3.1135 ns | 5.8479 ns | 157.13 ns | 1.00 | 0.00 | 0.0532 | - | 336 B |
PopulateArrayList | Clr | Clr | False | 15 | 125.79 ns | 2.4945 ns | 5.3160 ns | 125.00 ns | 0.80 | 0.04 | 0.0049 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Core | Core | False | 15 | 129.42 ns | 2.5314 ns | 4.9373 ns | 128.99 ns | 1.00 | 0.00 | 0.0532 | - | 336 B |
PopulateArrayList | Core | Core | False | 15 | 117.39 ns | 2.3384 ns | 5.7800 ns | 117.12 ns | 0.91 | 0.06 | 0.0049 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Clr | Clr | False | 20 | 234.69 ns | 4.6780 ns | 11.3869 ns | 233.36 ns | 1.00 | 0.00 | 0.0977 | - | 616 B |
PopulateArrayList | Clr | Clr | False | 20 | 243.57 ns | 4.8170 ns | 8.8082 ns | 244.07 ns | 1.04 | 0.06 | 0.0049 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Core | Core | False | 20 | 189.83 ns | 3.7274 ns | 6.5283 ns | 190.19 ns | 1.00 | 0.00 | 0.0977 | - | 616 B |
PopulateArrayList | Core | Core | False | 20 | 233.20 ns | 4.3046 ns | 3.8159 ns | 232.82 ns | 1.23 | 0.05 | 0.0049 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Clr | Clr | False | 50 | 433.13 ns | 8.6500 ns | 13.4671 ns | 434.84 ns | 1.00 | 0.00 | 0.1826 | - | 1152 B |
PopulateArrayList | Clr | Clr | False | 50 | 439.50 ns | 8.7330 ns | 9.7067 ns | 436.90 ns | 1.02 | 0.04 | 0.0049 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Core | Core | False | 50 | 308.38 ns | 6.1094 ns | 12.8868 ns | 309.76 ns | 1.00 | 0.00 | 0.1826 | - | 1152 B |
PopulateArrayList | Core | Core | False | 50 | 385.38 ns | 7.4876 ns | 10.2491 ns | 388.35 ns | 1.25 | 0.06 | 0.0049 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Clr | Clr | False | 100 | 773.00 ns | 15.2645 ns | 19.8481 ns | 775.79 ns | 1.00 | 0.00 | 0.3486 | 0.0010 | 2200 B |
PopulateArrayList | Clr | Clr | False | 100 | 744.31 ns | 14.7775 ns | 30.1865 ns | 741.30 ns | 0.96 | 0.05 | 0.0039 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Core | Core | False | 100 | 558.16 ns | 10.9074 ns | 18.5216 ns | 557.59 ns | 1.00 | 0.00 | 0.3486 | 0.0010 | 2200 B |
PopulateArrayList | Core | Core | False | 100 | 625.35 ns | 12.3380 ns | 23.4743 ns | 627.57 ns | 1.12 | 0.06 | 0.0039 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Clr | Clr | False | 500 | 2,973.45 ns | 63.2658 ns | 109.1304 ns | 2,965.33 ns | 1.00 | 0.00 | 1.3320 | 0.0195 | 8394 B |
PopulateArrayList | Clr | Clr | False | 500 | 2,324.59 ns | 46.3487 ns | 81.1760 ns | 2,345.36 ns | 0.78 | 0.04 | 0.0039 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Core | Core | False | 500 | 2,016.40 ns | 40.1470 ns | 99.9801 ns | 2,021.56 ns | 1.00 | 0.00 | 1.3320 | 0.0195 | 8392 B |
PopulateArrayList | Core | Core | False | 500 | 1,826.25 ns | 36.4874 ns | 80.8536 ns | 1,827.81 ns | 0.91 | 0.06 | 0.0039 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Clr | Clr | False | 1000 | 5,697.71 ns | 122.1700 ns | 220.2977 ns | 5,719.88 ns | 1.00 | 0.00 | 2.6328 | 0.0781 | 16611 B |
PopulateArrayList | Clr | Clr | False | 1000 | 4,255.77 ns | 84.8432 ns | 175.2159 ns | 4,266.03 ns | 0.75 | 0.04 | - | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Core | Core | False | 1000 | 3,794.28 ns | 75.4404 ns | 115.2054 ns | 3,808.80 ns | 1.00 | 0.00 | 2.6367 | 0.0781 | 16608 B |
PopulateArrayList | Core | Core | False | 1000 | 3,053.26 ns | 60.0266 ns | 101.9297 ns | 3,064.73 ns | 0.81 | 0.04 | 0.0039 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Clr | Clr | True | 0 | 15.18 ns | 0.2990 ns | 0.5078 ns | 15.16 ns | 1.00 | 0.00 | 0.0063 | - | 40 B |
PopulateArrayList | Clr | Clr | True | 0 | 20.31 ns | 0.4209 ns | 0.7373 ns | 20.20 ns | 1.34 | 0.07 | 0.0051 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Core | Core | True | 0 | 13.05 ns | 0.2575 ns | 0.4085 ns | 13.04 ns | 1.00 | 0.00 | 0.0063 | - | 40 B |
PopulateArrayList | Core | Core | True | 0 | 19.90 ns | 0.3955 ns | 0.7619 ns | 19.87 ns | 1.53 | 0.07 | 0.0051 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Clr | Clr | True | 5 | 32.60 ns | 0.6469 ns | 1.5624 ns | 32.47 ns | 1.00 | 0.00 | 0.0165 | - | 104 B |
PopulateArrayList | Clr | Clr | True | 5 | 85.18 ns | 1.6938 ns | 3.3828 ns | 84.22 ns | 2.62 | 0.16 | 0.0050 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Core | Core | True | 5 | 22.44 ns | 0.4725 ns | 0.7216 ns | 22.47 ns | 1.00 | 0.00 | 0.0165 | - | 104 B |
PopulateArrayList | Core | Core | True | 5 | 70.97 ns | 1.4177 ns | 3.3966 ns | 70.70 ns | 3.17 | 0.18 | 0.0050 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Clr | Clr | True | 10 | 64.68 ns | 1.2901 ns | 2.8854 ns | 65.24 ns | 1.00 | 0.00 | 0.0228 | - | 144 B |
PopulateArrayList | Clr | Clr | True | 10 | 100.36 ns | 2.0010 ns | 3.5568 ns | 100.31 ns | 1.55 | 0.09 | 0.0050 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Core | Core | True | 10 | 33.19 ns | 0.6607 ns | 1.6818 ns | 33.63 ns | 1.00 | 0.00 | 0.0228 | - | 144 B |
PopulateArrayList | Core | Core | True | 10 | 79.82 ns | 1.7064 ns | 1.6759 ns | 79.04 ns | 2.41 | 0.13 | 0.0050 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Clr | Clr | True | 15 | 81.61 ns | 1.6135 ns | 2.9094 ns | 82.87 ns | 1.00 | 0.00 | 0.0292 | - | 184 B |
PopulateArrayList | Clr | Clr | True | 15 | 115.60 ns | 2.2030 ns | 2.1637 ns | 116.51 ns | 1.42 | 0.06 | 0.0050 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Core | Core | True | 15 | 44.70 ns | 0.8868 ns | 1.8314 ns | 44.71 ns | 1.00 | 0.00 | 0.0292 | - | 184 B |
PopulateArrayList | Core | Core | True | 15 | 92.77 ns | 1.8146 ns | 3.3634 ns | 93.75 ns | 2.08 | 0.11 | 0.0050 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Clr | Clr | True | 20 | 101.95 ns | 2.0082 ns | 3.2429 ns | 102.16 ns | 1.00 | 0.00 | 0.0355 | - | 224 B |
PopulateArrayList | Clr | Clr | True | 20 | 120.46 ns | 2.3821 ns | 3.2606 ns | 121.03 ns | 1.18 | 0.05 | 0.0050 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Core | Core | True | 20 | 55.95 ns | 1.1067 ns | 2.1057 ns | 55.90 ns | 1.00 | 0.00 | 0.0356 | - | 224 B |
PopulateArrayList | Core | Core | True | 20 | 103.42 ns | 2.0113 ns | 3.3604 ns | 103.01 ns | 1.85 | 0.09 | 0.0050 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Clr | Clr | True | 50 | 226.26 ns | 4.4596 ns | 8.1546 ns | 227.85 ns | 1.00 | 0.00 | 0.0735 | - | 464 B |
PopulateArrayList | Clr | Clr | True | 50 | 188.48 ns | 3.7679 ns | 6.5992 ns | 187.16 ns | 0.83 | 0.04 | 0.0049 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Core | Core | True | 50 | 134.46 ns | 2.6851 ns | 6.8829 ns | 133.43 ns | 1.00 | 0.00 | 0.0735 | - | 464 B |
PopulateArrayList | Core | Core | True | 50 | 164.99 ns | 3.2741 ns | 5.4703 ns | 167.31 ns | 1.23 | 0.07 | 0.0049 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Clr | Clr | True | 100 | 431.16 ns | 8.6142 ns | 13.1547 ns | 434.25 ns | 1.00 | 0.00 | 0.1367 | - | 864 B |
PopulateArrayList | Clr | Clr | True | 100 | 294.53 ns | 5.7878 ns | 10.1369 ns | 297.26 ns | 0.68 | 0.03 | 0.0049 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Core | Core | True | 100 | 257.91 ns | 5.1444 ns | 12.0249 ns | 260.03 ns | 1.00 | 0.00 | 0.1367 | - | 864 B |
PopulateArrayList | Core | Core | True | 100 | 273.17 ns | 5.4402 ns | 9.0894 ns | 274.71 ns | 1.06 | 0.06 | 0.0049 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Clr | Clr | True | 500 | 2,140.37 ns | 42.6634 ns | 97.1661 ns | 2,141.59 ns | 1.00 | 0.00 | 0.6445 | 0.0078 | 4064 B |
PopulateArrayList | Clr | Clr | True | 500 | 1,131.09 ns | 21.2628 ns | 41.9707 ns | 1,120.32 ns | 0.53 | 0.03 | 0.0039 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Core | Core | True | 500 | 1,259.14 ns | 25.0907 ns | 51.2537 ns | 1,253.71 ns | 1.00 | 0.00 | 0.6445 | 0.0078 | 4064 B |
PopulateArrayList | Core | Core | True | 500 | 1,067.29 ns | 20.9673 ns | 32.0193 ns | 1,060.97 ns | 0.85 | 0.04 | 0.0039 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Clr | Clr | True | 1000 | 4,209.81 ns | 81.9353 ns | 141.3344 ns | 4,171.91 ns | 1.00 | 0.00 | 1.2734 | 0.0313 | 8066 B |
PopulateArrayList | Clr | Clr | True | 1000 | 2,373.09 ns | 47.1051 ns | 92.9810 ns | 2,371.61 ns | 0.56 | 0.03 | 0.0039 | - | 32 B |
| | | | | | | | | | | | | |
PopulateList | Core | Core | True | 1000 | 2,440.95 ns | 48.1094 ns | 77.6878 ns | 2,414.70 ns | 1.00 | 0.00 | 1.2773 | 0.0313 | 8064 B |
PopulateArrayList | Core | Core | True | 1000 | 2,318.97 ns | 46.3388 ns | 79.9322 ns | 2,315.37 ns | 0.95 | 0.04 | 0.0039 | - | 32 B |
BTW:
Controversial ultra radical version: make
List<T>
do this,
In this proposal (binary) serialization has to be taken into account. So this would be a radical breaking change.
@gfoidl there would be no field change in that event (it would still be T[] _items
) - I don't think there would actually be any serialization impact there, even for BinaryFormatter
.
@mgravell it is strange that you have such significant difference for Presized
benchmarks.
The difference between PopulateList
and PopulateArrayList
in Presized
scenarios shouldn't be significant and List<T>
should be if not faster but at least not slower than ArrayPoolList<T>
.
I've run them (only for Core using your gist) on my localhost and results confirm my initial assumption.
Can you check your results once again?
Benchmark
BenchmarkDotNet=v0.10.14, OS=Windows 10.0.16299.547 (1709/FallCreatorsUpdate/Redstone3) Intel Core i7-7700 CPU 3.60GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores Frequency=3515624 Hz, Resolution=284.4445 ns, Timer=TSC .NET Core SDK=2.1.302 [Host] : .NET Core 2.1.2 (CoreCLR 4.6.26628.05, CoreFX 4.6.26629.01), 64bit RyuJIT Core : .NET Core 2.1.2 (CoreCLR 4.6.26628.05, CoreFX 4.6.26629.01), 64bit RyuJIT
Job=Core Runtime=Core
Method | Presized | FillCount | Mean | Error | StdDev | Scaled | ScaledSD | Gen 0 | Allocated |
---|---|---|---|---|---|---|---|---|---|
PopulateList | True | 100 | 180.2 ns | 1.3876 ns | 1.2300 ns | 1.00 | 0.00 | 205.8105 | 843.75 KB |
PopulateArrayList | True | 100 | 244.4 ns | 0.3632 ns | 0.3398 ns | 1.36 | 0.01 | 7.5684 | 31.25 KB |
PopulateList | True | 500 | 880.1 ns | 16.2105 ns | 14.3702 ns | 1.00 | 0.00 | 967.7734 | 3968.75 KB |
PopulateArrayList | True | 500 | 983.4 ns | 3.4452 ns | 3.0541 ns | 1.12 | 0.02 | 5.8594 | 31.25 KB |
PopulateList | True | 1000 | 1,710.9 ns | 3.4644 ns | 3.2406 ns | 1.00 | 0.00 | 1917.9688 | 7875 KB |
PopulateArrayList | True | 1000 | 1,893.5 ns | 5.6394 ns | 5.2751 ns | 1.11 | 0.00 | 5.8594 | 31.25 KB |
@OlegKarasik
The difference between
PopulateList
andPopulateArrayList
inPresized
scenarios shouldn't be significant andList<T>
should be if not faster but at least not slower thanArrayPoolList<T>
.
For large arrays, it could be significant if it is causing it to GC, but: I can re-run later
@OlegKarasik here's another run, just 100/500/1000 on CoreCLR this time
(note: edited due to runtime snafu)
BenchmarkDotNet=v0.11.0, OS=Windows 10.0.17134.165 (1803/April2018Update/Redstone4) Intel Core i7-5930K CPU 3.50GHz (Broadwell), 1 CPU, 12 logical and 6 physical cores Frequency=3416974 Hz, Resolution=292.6566 ns, Timer=TSC .NET Core SDK=2.1.300 [Host] : .NET Core 2.1.2 (CoreCLR 4.6.26628.05, CoreFX 4.6.26629.01), 64bit RyuJIT Core : .NET Core 2.1.2 (CoreCLR 4.6.26628.05, CoreFX 4.6.26629.01), 64bit RyuJIT
Job=Core Runtime=Core
Method | Presized | FillCount | Mean | Error | StdDev | Scaled | ScaledSD | Gen 0 | Gen 1 | Allocated |
---|---|---|---|---|---|---|---|---|---|---|
PopulateList | False | 100 | 512.5 ns | 5.173 ns | 4.586 ns | 1.00 | 0.00 | 0.3486 | 0.0010 | 2200 B |
PopulateArrayList | False | 100 | 639.2 ns | 12.434 ns | 13.821 ns | 1.25 | 0.03 | 0.0039 | - | 32 B |
PopulateList | False | 500 | 1,916.4 ns | 38.179 ns | 37.497 ns | 1.00 | 0.00 | 1.3320 | 0.0215 | 8392 B |
PopulateArrayList | False | 500 | 1,775.8 ns | 34.282 ns | 32.067 ns | 0.93 | 0.02 | 0.0039 | - | 32 B |
PopulateList | False | 1000 | 3,831.2 ns | 76.621 ns | 136.193 ns | 1.00 | 0.00 | 2.6367 | 0.0781 | 16608 B |
PopulateArrayList | False | 1000 | 2,955.5 ns | 28.408 ns | 22.179 ns | 0.77 | 0.03 | 0.0039 | - | 32 B |
PopulateList | True | 100 | 249.9 ns | 4.984 ns | 5.540 ns | 1.00 | 0.00 | 0.1367 | - | 864 B |
PopulateArrayList | True | 100 | 275.8 ns | 5.756 ns | 7.279 ns | 1.10 | 0.04 | 0.0049 | - | 32 B |
PopulateList | True | 500 | 1,182.4 ns | 9.496 ns | 8.883 ns | 1.00 | 0.00 | 0.6445 | 0.0078 | 4064 B |
PopulateArrayList | True | 500 | 1,021.4 ns | 20.329 ns | 19.965 ns | 0.86 | 0.02 | 0.0039 | - | 32 B |
PopulateList | True | 1000 | 2,457.3 ns | 48.330 ns | 70.842 ns | 1.00 | 0.00 | 1.2773 | 0.0313 | 8064 B |
PopulateArrayList | True | 1000 | 1,954.1 ns | 10.112 ns | 8.964 ns | 0.80 | 0.02 | 0.0039 | - | 32 B |
@mgravell
This is very strange. Initially when I saw you second run I thought that the difference can be caused by different versions of BenchmarkDotNet and different measurements mechanics (in the first run I used 0.10.14) but after upgrade I am still getting really strange results - List<T>
is faster in all scenarios (even when Presized=false
).
There are also a few interesting points:
PopulateArrayList
are comparable despite we have different hardware.List<T>
isn't Presized
.Benchmarks (100/500/1000 on CoreCLR with Presized: true, false)
BenchmarkDotNet=v0.11.0, OS=Windows 10.0.16299.547 (1709/FallCreatorsUpdate/Redstone3)
Intel Core i7-7700 CPU 3.60GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores
Frequency=3515624 Hz, Resolution=284.4445 ns, Timer=TSC
.NET Core SDK=2.1.302
[Host] : .NET Core 2.1.2 (CoreCLR 4.6.26628.05, CoreFX 4.6.26629.01), 64bit RyuJIT
Core : .NET Core 2.1.2 (CoreCLR 4.6.26628.05, CoreFX 4.6.26629.01), 64bit RyuJIT
Job=Core Runtime=Core
Method | Presized | FillCount | Mean | Error | StdDev | Scaled | ScaledSD | Gen 0 | Allocated |
---|---|---|---|---|---|---|---|---|---|
PopulateList | False | 100 | 379.7 ns | 1.9369 ns | 1.8118 ns | 1.00 | 0.00 | 0.5244 | 2200 B |
PopulateArrayList | False | 100 | 503.6 ns | 1.7554 ns | 1.6420 ns | 1.33 | 0.01 | 0.0068 | 32 B |
PopulateList | False | 500 | 1,355.6 ns | 39.2060 ns | 36.6734 ns | 1.00 | 0.00 | 1.9980 | 8392 B |
PopulateArrayList | False | 500 | 1,549.4 ns | 5.0870 ns | 4.7584 ns | 1.14 | 0.03 | 0.0059 | 32 B |
PopulateList | False | 1000 | 2,556.7 ns | 17.5119 ns | 14.6233 ns | 1.00 | 0.00 | 3.9492 | 16608 B |
PopulateArrayList | False | 1000 | 2,813.7 ns | 52.7974 ns | 49.3867 ns | 1.10 | 0.02 | 0.0039 | 32 B |
PopulateList | True | 100 | 181.2 ns | 0.5938 ns | 0.5555 ns | 1.00 | 0.00 | 0.2058 | 864 B |
PopulateArrayList | True | 100 | 263.5 ns | 0.6528 ns | 0.5787 ns | 1.45 | 0.01 | 0.0073 | 32 B |
PopulateList | True | 500 | 875.2 ns | 3.0362 ns | 2.6915 ns | 1.00 | 0.00 | 0.9678 | 4064 B |
PopulateArrayList | True | 500 | 1,104.5 ns | 3.2272 ns | 2.8608 ns | 1.26 | 0.00 | 0.0059 | 32 B |
PopulateList | True | 1000 | 1,712.4 ns | 10.2909 ns | 9.6261 ns | 1.00 | 0.00 | 1.9180 | 8064 B |
PopulateArrayList | True | 1000 | 1,984.3 ns | 3.5040 ns | 2.9260 ns | 1.16 | 0.01 | 0.0039 | 32 B |
I agree that is unexpected. All I know is: what I get is what I get, and what you get is what you get.
Probably from different amounts of RAM in the two machines; raised issue to add it to output https://github.com/dotnet/BenchmarkDotNet/issues/846
The additional costs from the ArrayPoolList are from renting and returning the arrays; and in the List from GC; otherwise the results should be mostly comparable.
quite possibly; I've got 64GiB here (I really should see about that 128GiB upgrade...), but SQL Server and about a dozen redis servers in WSL will have been chewing up a lot of that. Plus... you know, two instances of devenv
which are probably 10GiB each - and Chrome, so there goes another 20GiB... :)
@mgravell @benaadams I have 16 gigs of RAM and only ~5 gigs available.
@benaadams
The additional costs from the ArrayPoolList are from renting and returning the arrays
This can be the reason. I have checked the source of Rent and Return and looks like they do lot of work with Thread Local Storage and maintains storage on per core basis.
Do you know whether BenchmarkDotNet maintains Thread Affinity during the benchmarks? Can Thread migration be the reason?
@OlegKarasik renting from the arraypool isn't faster than allocating; it reduces the pressure on the GC and has more predicable sustained performance than GCs and reduces the GC effects on the rest of the application that is allocating. Also adding a disposal check per Add
is going to be slower in that function than List which doesn't have one.
@mgravell Feedback, AsSpan
should probably be in MemoryMarshal rather than directly on the class as it is unsafe?
public class MemoryMarshal
{
public Span<T> AsSpan<T>(this ArrayPoolList<T> list) {...}
}
@benaadams I just never thought that the speed difference is significant. Thanks for pointing this out.
@danmosemsft would it fit into your "PowerCollections" plan?
@karelz Do you think there is a possibility for this feature request to get a higher priority?
I do think something's got to happen sooner or later RE: collections pooling if not as part of the FW then 3-rd party. There is (at least) one community - based project already: https://github.com/jtmueller/Collections.Pooled that looks good but haven't seen any recent updates to its 2.0 "preview" release. Going to try pinging them there.
@GrabYourPitchforks any thoughts about this idea?
I see this as akin to the ValueStringBuilder
feature request. We're missing semantics in the language that would allow these sorts of constructs to be safe by design. There's some discussion in that linked issue about what those semantics might look like. Once that's addressed I imagine a natural design for an ArrayPoolList<T>
feature would fall out naturally.
It doesn't look there's been any mention of this in this thread, but there is a library out there Collections.Pooled that looks to support a lot of scenarios related to this and posts some very impressive benchmarks. Might be worth looking into?
We've started leveraging this library on our large calculation engine that ends up throwing a good amount on the LOH (arrays w/ multi-million counts of reference types)
I'm 0% associated with the library in question, but thought I would throw it out there!
@cmeyertons Collections.Pooled has been mentioned in this comment: https://github.com/dotnet/runtime/issues/27023#issuecomment-567935382
The project seemed quiet when I last looked at it. Issues are opened and then closed without being answered.
List<T>
clears array elements that become unused, to prevent references in them from keeping objects alive longer than needed: https://github.com/dotnet/runtime/blob/1f4d0db2339c37d75723d063827fd2a4c6e2ecef/src/libraries/System.Private.CoreLib/src/System/Collections/Generic/List.cs#L880-L883
Should ArrayPoolList<T>.Dispose()
likewise clear the array that it returns to the pool, if the elements contain references?
ArrayPoolList<T>.Dispose()
itself clears the array, but the pool is already full, then the effort is wasted.ArrayPoolList<T>.Dispose()
calls ArrayPool<T>.Return(T[] array, bool clearArray)
with clearArray: true
, but the list has used only half of the capacity, then some effort is wasted.ArrayPool<T>.Return(T[] array, int numberOfElementsToClear)
could speed this up, but it would make the API more complex, especially for developers who implement their own classes derived from ArrayPool<T>
and have to override these methods.Perhaps it is best to use clearArray: RuntimeHelpers.IsReferenceOrContainsReferences<T>()
initially and then file a separate proposal for ArrayPool<T>.Return(T[] array, int numberOfElementsToClear)
.
Area:
System.Buffers
Currently, POCOs/DTOs often default to
List<T>
; a good solid implementation, but not very pool friendly, despite the fact POCOs etc often have very predictable life cycles. Additionally, when lists need to resize (growth), arrays are currently just dropped on the floor.To address these issues, and to make for effective memory usage for many scenarios, consider:
behaviour:
List<T>
Dispose()
returns array to poolObjectDisposedException
It would presumably need a custom
GetEnumerator()
, which for best performance should talk directly to the array; so to reduce the risk of this accidentally causing access to a returned array, it might be considered to be aref struct
enumerator, such that it cannot be stored anywhere. TheIEnumerable<T>
enumerator (explicit interface implementation) would presumably go via the indexer (so: localforeach
is fast;IEnumerable<T>
is safe but slower).Open questions:
~should it actually be a method from~ (edit: this was a silly idea; the customArrayPool<T>
, with the type itself hidden? i.e onArrayPool<T>
:public IList<T> CreateList(int initialCapacity = 0)
GetEnumerator()
won't work if we don't have apublic
type)Order : IDisposable
, there's a simple operation to do that, presumably after nulling array the field in the list but before returning it to the pool, to prevent recursion problems; perhaps an extension method on<T>(this ArrayPoolList<T> list) where T : class, IDisposable
that calls back into aninternal
method on the list?)Example usage:
Controversial ultra radical version: make
List<T>
do this, or add a newList<T>
subclass that does this. Disadvantage: can't deploy as easily to older frameworks like net4*