dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.27k stars 4.73k forks source link

Proposal: pool-based list, similar to List<T>, but based on ArrayPool<T> #27023

Open mgravell opened 6 years ago

mgravell commented 6 years ago

Area: System.Buffers

Currently, POCOs/DTOs often default to List<T>; a good solid implementation, but not very pool friendly, despite the fact POCOs etc often have very predictable life cycles. Additionally, when lists need to resize (growth), arrays are currently just dropped on the floor.

To address these issues, and to make for effective memory usage for many scenarios, consider:

public class ArrayPoolList<T> : IList<T>, IDisposable {
    ... 
    public ref struct Enumerator {...}
    public Enumerator GetEnumerator() {...}

    public Span<T> AsSpan() {...} // for current size; undefined behaviour if they Remove() after
    // (or a better name that conveys "at own risk"?)
    // note: no Memory<T>, due to risk of storage and accessing recycled memory
}

behaviour:

It would presumably need a custom GetEnumerator(), which for best performance should talk directly to the array; so to reduce the risk of this accidentally causing access to a returned array, it might be considered to be a ref struct enumerator, such that it cannot be stored anywhere. The IEnumerable<T> enumerator (explicit interface implementation) would presumably go via the indexer (so: local foreach is fast; IEnumerable<T> is safe but slower).

Open questions:

Example usage:

class Customer : IDisposable {
    public int Id {get;set;}
    public string Name {get;set;}
    public IList<Order> Orders {get;} = new ArrayPoolList<Order>();
    public void Dispose() => Orders.Dispose();
}

Controversial ultra radical version: make List<T> do this, or add a new List<T> subclass that does this. Disadvantage: can't deploy as easily to older frameworks like net4*

mgravell commented 6 years ago

I figured the best thing to do would be to spike it and measure the impact; so I whipped something together based on List<T> and ran it through BenchmarkDotNet; testing Add in particular (respecting all the same features like version-tracking for enumerator breaking), and making sure to check for ObjectDisposedException (rather than failing in more obscure ways) - over a range of final element sizes, for both "presized" (given the right length as a hint) and allowed to size itself automatically (no size hint).

Here's the gist, and here's the results (wow this took a long time to run!).

Conclusion: the only time it is even in doubt is for very small lists - pretty sure this could be optimized a little, too.

            Method |  Job | Runtime | Presized | FillCount |        Mean |       Error |      StdDev |      Median | Scaled | ScaledSD |  Gen 0 |  Gen 1 | Allocated |
------------------ |----- |-------- |--------- |---------- |------------:|------------:|------------:|------------:|-------:|---------:|-------:|-------:|----------:|
      PopulateList |  Clr |     Clr |    False |         0 |    11.49 ns |   0.2290 ns |   0.4520 ns |    11.42 ns |   1.00 |     0.00 | 0.0063 |      - |      40 B |
 PopulateArrayList |  Clr |     Clr |    False |         0 |    20.24 ns |   0.4019 ns |   0.5634 ns |    20.18 ns |   1.76 |     0.08 | 0.0051 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList | Core |    Core |    False |         0 |    10.27 ns |   0.2044 ns |   0.3301 ns |    10.23 ns |   1.00 |     0.00 | 0.0063 |      - |      40 B |
 PopulateArrayList | Core |    Core |    False |         0 |    20.14 ns |   0.3893 ns |   0.7019 ns |    20.24 ns |   1.96 |     0.09 | 0.0051 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList |  Clr |     Clr |    False |         5 |    77.81 ns |   1.5428 ns |   3.3865 ns |    77.59 ns |   1.00 |     0.00 | 0.0292 |      - |     184 B |
 PopulateArrayList |  Clr |     Clr |    False |         5 |   108.42 ns |   2.1168 ns |   3.3574 ns |   107.95 ns |   1.40 |     0.07 | 0.0050 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList | Core |    Core |    False |         5 |    69.44 ns |   1.3769 ns |   2.8126 ns |    70.29 ns |   1.00 |     0.00 | 0.0292 |      - |     184 B |
 PopulateArrayList | Core |    Core |    False |         5 |    96.79 ns |   1.9044 ns |   3.2338 ns |    96.70 ns |   1.40 |     0.07 | 0.0050 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList |  Clr |     Clr |    False |        10 |   137.39 ns |   2.7091 ns |   5.2838 ns |   137.05 ns |   1.00 |     0.00 | 0.0532 |      - |     336 B |
 PopulateArrayList |  Clr |     Clr |    False |        10 |   113.24 ns |   2.1469 ns |   2.6366 ns |   113.39 ns |   0.83 |     0.04 | 0.0050 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList | Core |    Core |    False |        10 |   121.26 ns |   2.4120 ns |   4.4709 ns |   120.94 ns |   1.00 |     0.00 | 0.0533 |      - |     336 B |
 PopulateArrayList | Core |    Core |    False |        10 |   105.56 ns |   2.0212 ns |   2.7667 ns |   106.13 ns |   0.87 |     0.04 | 0.0050 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList |  Clr |     Clr |    False |        15 |   157.02 ns |   3.1135 ns |   5.8479 ns |   157.13 ns |   1.00 |     0.00 | 0.0532 |      - |     336 B |
 PopulateArrayList |  Clr |     Clr |    False |        15 |   125.79 ns |   2.4945 ns |   5.3160 ns |   125.00 ns |   0.80 |     0.04 | 0.0049 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList | Core |    Core |    False |        15 |   129.42 ns |   2.5314 ns |   4.9373 ns |   128.99 ns |   1.00 |     0.00 | 0.0532 |      - |     336 B |
 PopulateArrayList | Core |    Core |    False |        15 |   117.39 ns |   2.3384 ns |   5.7800 ns |   117.12 ns |   0.91 |     0.06 | 0.0049 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList |  Clr |     Clr |    False |        20 |   234.69 ns |   4.6780 ns |  11.3869 ns |   233.36 ns |   1.00 |     0.00 | 0.0977 |      - |     616 B |
 PopulateArrayList |  Clr |     Clr |    False |        20 |   243.57 ns |   4.8170 ns |   8.8082 ns |   244.07 ns |   1.04 |     0.06 | 0.0049 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList | Core |    Core |    False |        20 |   189.83 ns |   3.7274 ns |   6.5283 ns |   190.19 ns |   1.00 |     0.00 | 0.0977 |      - |     616 B |
 PopulateArrayList | Core |    Core |    False |        20 |   233.20 ns |   4.3046 ns |   3.8159 ns |   232.82 ns |   1.23 |     0.05 | 0.0049 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList |  Clr |     Clr |    False |        50 |   433.13 ns |   8.6500 ns |  13.4671 ns |   434.84 ns |   1.00 |     0.00 | 0.1826 |      - |    1152 B |
 PopulateArrayList |  Clr |     Clr |    False |        50 |   439.50 ns |   8.7330 ns |   9.7067 ns |   436.90 ns |   1.02 |     0.04 | 0.0049 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList | Core |    Core |    False |        50 |   308.38 ns |   6.1094 ns |  12.8868 ns |   309.76 ns |   1.00 |     0.00 | 0.1826 |      - |    1152 B |
 PopulateArrayList | Core |    Core |    False |        50 |   385.38 ns |   7.4876 ns |  10.2491 ns |   388.35 ns |   1.25 |     0.06 | 0.0049 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList |  Clr |     Clr |    False |       100 |   773.00 ns |  15.2645 ns |  19.8481 ns |   775.79 ns |   1.00 |     0.00 | 0.3486 | 0.0010 |    2200 B |
 PopulateArrayList |  Clr |     Clr |    False |       100 |   744.31 ns |  14.7775 ns |  30.1865 ns |   741.30 ns |   0.96 |     0.05 | 0.0039 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList | Core |    Core |    False |       100 |   558.16 ns |  10.9074 ns |  18.5216 ns |   557.59 ns |   1.00 |     0.00 | 0.3486 | 0.0010 |    2200 B |
 PopulateArrayList | Core |    Core |    False |       100 |   625.35 ns |  12.3380 ns |  23.4743 ns |   627.57 ns |   1.12 |     0.06 | 0.0039 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList |  Clr |     Clr |    False |       500 | 2,973.45 ns |  63.2658 ns | 109.1304 ns | 2,965.33 ns |   1.00 |     0.00 | 1.3320 | 0.0195 |    8394 B |
 PopulateArrayList |  Clr |     Clr |    False |       500 | 2,324.59 ns |  46.3487 ns |  81.1760 ns | 2,345.36 ns |   0.78 |     0.04 | 0.0039 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList | Core |    Core |    False |       500 | 2,016.40 ns |  40.1470 ns |  99.9801 ns | 2,021.56 ns |   1.00 |     0.00 | 1.3320 | 0.0195 |    8392 B |
 PopulateArrayList | Core |    Core |    False |       500 | 1,826.25 ns |  36.4874 ns |  80.8536 ns | 1,827.81 ns |   0.91 |     0.06 | 0.0039 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList |  Clr |     Clr |    False |      1000 | 5,697.71 ns | 122.1700 ns | 220.2977 ns | 5,719.88 ns |   1.00 |     0.00 | 2.6328 | 0.0781 |   16611 B |
 PopulateArrayList |  Clr |     Clr |    False |      1000 | 4,255.77 ns |  84.8432 ns | 175.2159 ns | 4,266.03 ns |   0.75 |     0.04 |      - |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList | Core |    Core |    False |      1000 | 3,794.28 ns |  75.4404 ns | 115.2054 ns | 3,808.80 ns |   1.00 |     0.00 | 2.6367 | 0.0781 |   16608 B |
 PopulateArrayList | Core |    Core |    False |      1000 | 3,053.26 ns |  60.0266 ns | 101.9297 ns | 3,064.73 ns |   0.81 |     0.04 | 0.0039 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList |  Clr |     Clr |     True |         0 |    15.18 ns |   0.2990 ns |   0.5078 ns |    15.16 ns |   1.00 |     0.00 | 0.0063 |      - |      40 B |
 PopulateArrayList |  Clr |     Clr |     True |         0 |    20.31 ns |   0.4209 ns |   0.7373 ns |    20.20 ns |   1.34 |     0.07 | 0.0051 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList | Core |    Core |     True |         0 |    13.05 ns |   0.2575 ns |   0.4085 ns |    13.04 ns |   1.00 |     0.00 | 0.0063 |      - |      40 B |
 PopulateArrayList | Core |    Core |     True |         0 |    19.90 ns |   0.3955 ns |   0.7619 ns |    19.87 ns |   1.53 |     0.07 | 0.0051 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList |  Clr |     Clr |     True |         5 |    32.60 ns |   0.6469 ns |   1.5624 ns |    32.47 ns |   1.00 |     0.00 | 0.0165 |      - |     104 B |
 PopulateArrayList |  Clr |     Clr |     True |         5 |    85.18 ns |   1.6938 ns |   3.3828 ns |    84.22 ns |   2.62 |     0.16 | 0.0050 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList | Core |    Core |     True |         5 |    22.44 ns |   0.4725 ns |   0.7216 ns |    22.47 ns |   1.00 |     0.00 | 0.0165 |      - |     104 B |
 PopulateArrayList | Core |    Core |     True |         5 |    70.97 ns |   1.4177 ns |   3.3966 ns |    70.70 ns |   3.17 |     0.18 | 0.0050 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList |  Clr |     Clr |     True |        10 |    64.68 ns |   1.2901 ns |   2.8854 ns |    65.24 ns |   1.00 |     0.00 | 0.0228 |      - |     144 B |
 PopulateArrayList |  Clr |     Clr |     True |        10 |   100.36 ns |   2.0010 ns |   3.5568 ns |   100.31 ns |   1.55 |     0.09 | 0.0050 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList | Core |    Core |     True |        10 |    33.19 ns |   0.6607 ns |   1.6818 ns |    33.63 ns |   1.00 |     0.00 | 0.0228 |      - |     144 B |
 PopulateArrayList | Core |    Core |     True |        10 |    79.82 ns |   1.7064 ns |   1.6759 ns |    79.04 ns |   2.41 |     0.13 | 0.0050 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList |  Clr |     Clr |     True |        15 |    81.61 ns |   1.6135 ns |   2.9094 ns |    82.87 ns |   1.00 |     0.00 | 0.0292 |      - |     184 B |
 PopulateArrayList |  Clr |     Clr |     True |        15 |   115.60 ns |   2.2030 ns |   2.1637 ns |   116.51 ns |   1.42 |     0.06 | 0.0050 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList | Core |    Core |     True |        15 |    44.70 ns |   0.8868 ns |   1.8314 ns |    44.71 ns |   1.00 |     0.00 | 0.0292 |      - |     184 B |
 PopulateArrayList | Core |    Core |     True |        15 |    92.77 ns |   1.8146 ns |   3.3634 ns |    93.75 ns |   2.08 |     0.11 | 0.0050 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList |  Clr |     Clr |     True |        20 |   101.95 ns |   2.0082 ns |   3.2429 ns |   102.16 ns |   1.00 |     0.00 | 0.0355 |      - |     224 B |
 PopulateArrayList |  Clr |     Clr |     True |        20 |   120.46 ns |   2.3821 ns |   3.2606 ns |   121.03 ns |   1.18 |     0.05 | 0.0050 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList | Core |    Core |     True |        20 |    55.95 ns |   1.1067 ns |   2.1057 ns |    55.90 ns |   1.00 |     0.00 | 0.0356 |      - |     224 B |
 PopulateArrayList | Core |    Core |     True |        20 |   103.42 ns |   2.0113 ns |   3.3604 ns |   103.01 ns |   1.85 |     0.09 | 0.0050 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList |  Clr |     Clr |     True |        50 |   226.26 ns |   4.4596 ns |   8.1546 ns |   227.85 ns |   1.00 |     0.00 | 0.0735 |      - |     464 B |
 PopulateArrayList |  Clr |     Clr |     True |        50 |   188.48 ns |   3.7679 ns |   6.5992 ns |   187.16 ns |   0.83 |     0.04 | 0.0049 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList | Core |    Core |     True |        50 |   134.46 ns |   2.6851 ns |   6.8829 ns |   133.43 ns |   1.00 |     0.00 | 0.0735 |      - |     464 B |
 PopulateArrayList | Core |    Core |     True |        50 |   164.99 ns |   3.2741 ns |   5.4703 ns |   167.31 ns |   1.23 |     0.07 | 0.0049 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList |  Clr |     Clr |     True |       100 |   431.16 ns |   8.6142 ns |  13.1547 ns |   434.25 ns |   1.00 |     0.00 | 0.1367 |      - |     864 B |
 PopulateArrayList |  Clr |     Clr |     True |       100 |   294.53 ns |   5.7878 ns |  10.1369 ns |   297.26 ns |   0.68 |     0.03 | 0.0049 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList | Core |    Core |     True |       100 |   257.91 ns |   5.1444 ns |  12.0249 ns |   260.03 ns |   1.00 |     0.00 | 0.1367 |      - |     864 B |
 PopulateArrayList | Core |    Core |     True |       100 |   273.17 ns |   5.4402 ns |   9.0894 ns |   274.71 ns |   1.06 |     0.06 | 0.0049 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList |  Clr |     Clr |     True |       500 | 2,140.37 ns |  42.6634 ns |  97.1661 ns | 2,141.59 ns |   1.00 |     0.00 | 0.6445 | 0.0078 |    4064 B |
 PopulateArrayList |  Clr |     Clr |     True |       500 | 1,131.09 ns |  21.2628 ns |  41.9707 ns | 1,120.32 ns |   0.53 |     0.03 | 0.0039 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList | Core |    Core |     True |       500 | 1,259.14 ns |  25.0907 ns |  51.2537 ns | 1,253.71 ns |   1.00 |     0.00 | 0.6445 | 0.0078 |    4064 B |
 PopulateArrayList | Core |    Core |     True |       500 | 1,067.29 ns |  20.9673 ns |  32.0193 ns | 1,060.97 ns |   0.85 |     0.04 | 0.0039 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList |  Clr |     Clr |     True |      1000 | 4,209.81 ns |  81.9353 ns | 141.3344 ns | 4,171.91 ns |   1.00 |     0.00 | 1.2734 | 0.0313 |    8066 B |
 PopulateArrayList |  Clr |     Clr |     True |      1000 | 2,373.09 ns |  47.1051 ns |  92.9810 ns | 2,371.61 ns |   0.56 |     0.03 | 0.0039 |      - |      32 B |
                   |      |         |          |           |             |             |             |             |        |          |        |        |           |
      PopulateList | Core |    Core |     True |      1000 | 2,440.95 ns |  48.1094 ns |  77.6878 ns | 2,414.70 ns |   1.00 |     0.00 | 1.2773 | 0.0313 |    8064 B |
 PopulateArrayList | Core |    Core |     True |      1000 | 2,318.97 ns |  46.3388 ns |  79.9322 ns | 2,315.37 ns |   0.95 |     0.04 | 0.0039 |      - |      32 B |
gfoidl commented 6 years ago

BTW:

Controversial ultra radical version: make List<T> do this,

In this proposal (binary) serialization has to be taken into account. So this would be a radical breaking change.

mgravell commented 6 years ago

@gfoidl there would be no field change in that event (it would still be T[] _items) - I don't think there would actually be any serialization impact there, even for BinaryFormatter.

OlegKarasik commented 6 years ago

@mgravell it is strange that you have such significant difference for Presized benchmarks.

The difference between PopulateList and PopulateArrayList in Presized scenarios shouldn't be significant and List<T> should be if not faster but at least not slower than ArrayPoolList<T>.

I've run them (only for Core using your gist) on my localhost and results confirm my initial assumption.

Can you check your results once again?

Benchmark

BenchmarkDotNet=v0.10.14, OS=Windows 10.0.16299.547 (1709/FallCreatorsUpdate/Redstone3) Intel Core i7-7700 CPU 3.60GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores Frequency=3515624 Hz, Resolution=284.4445 ns, Timer=TSC .NET Core SDK=2.1.302 [Host] : .NET Core 2.1.2 (CoreCLR 4.6.26628.05, CoreFX 4.6.26629.01), 64bit RyuJIT Core : .NET Core 2.1.2 (CoreCLR 4.6.26628.05, CoreFX 4.6.26629.01), 64bit RyuJIT

Job=Core Runtime=Core

Method Presized FillCount Mean Error StdDev Scaled ScaledSD Gen 0 Allocated
PopulateList True 100 180.2 ns 1.3876 ns 1.2300 ns 1.00 0.00 205.8105 843.75 KB
PopulateArrayList True 100 244.4 ns 0.3632 ns 0.3398 ns 1.36 0.01 7.5684 31.25 KB
PopulateList True 500 880.1 ns 16.2105 ns 14.3702 ns 1.00 0.00 967.7734 3968.75 KB
PopulateArrayList True 500 983.4 ns 3.4452 ns 3.0541 ns 1.12 0.02 5.8594 31.25 KB
PopulateList True 1000 1,710.9 ns 3.4644 ns 3.2406 ns 1.00 0.00 1917.9688 7875 KB
PopulateArrayList True 1000 1,893.5 ns 5.6394 ns 5.2751 ns 1.11 0.00 5.8594 31.25 KB
mgravell commented 6 years ago

@OlegKarasik

The difference between PopulateList and PopulateArrayList in Presized scenarios shouldn't be significant and List<T> should be if not faster but at least not slower than ArrayPoolList<T>.

For large arrays, it could be significant if it is causing it to GC, but: I can re-run later

mgravell commented 6 years ago

@OlegKarasik here's another run, just 100/500/1000 on CoreCLR this time

(note: edited due to runtime snafu)

BenchmarkDotNet=v0.11.0, OS=Windows 10.0.17134.165 (1803/April2018Update/Redstone4) Intel Core i7-5930K CPU 3.50GHz (Broadwell), 1 CPU, 12 logical and 6 physical cores Frequency=3416974 Hz, Resolution=292.6566 ns, Timer=TSC .NET Core SDK=2.1.300 [Host] : .NET Core 2.1.2 (CoreCLR 4.6.26628.05, CoreFX 4.6.26629.01), 64bit RyuJIT Core : .NET Core 2.1.2 (CoreCLR 4.6.26628.05, CoreFX 4.6.26629.01), 64bit RyuJIT

Job=Core Runtime=Core

Method Presized FillCount Mean Error StdDev Scaled ScaledSD Gen 0 Gen 1 Allocated
PopulateList False 100 512.5 ns 5.173 ns 4.586 ns 1.00 0.00 0.3486 0.0010 2200 B
PopulateArrayList False 100 639.2 ns 12.434 ns 13.821 ns 1.25 0.03 0.0039 - 32 B
PopulateList False 500 1,916.4 ns 38.179 ns 37.497 ns 1.00 0.00 1.3320 0.0215 8392 B
PopulateArrayList False 500 1,775.8 ns 34.282 ns 32.067 ns 0.93 0.02 0.0039 - 32 B
PopulateList False 1000 3,831.2 ns 76.621 ns 136.193 ns 1.00 0.00 2.6367 0.0781 16608 B
PopulateArrayList False 1000 2,955.5 ns 28.408 ns 22.179 ns 0.77 0.03 0.0039 - 32 B
PopulateList True 100 249.9 ns 4.984 ns 5.540 ns 1.00 0.00 0.1367 - 864 B
PopulateArrayList True 100 275.8 ns 5.756 ns 7.279 ns 1.10 0.04 0.0049 - 32 B
PopulateList True 500 1,182.4 ns 9.496 ns 8.883 ns 1.00 0.00 0.6445 0.0078 4064 B
PopulateArrayList True 500 1,021.4 ns 20.329 ns 19.965 ns 0.86 0.02 0.0039 - 32 B
PopulateList True 1000 2,457.3 ns 48.330 ns 70.842 ns 1.00 0.00 1.2773 0.0313 8064 B
PopulateArrayList True 1000 1,954.1 ns 10.112 ns 8.964 ns 0.80 0.02 0.0039 - 32 B
OlegKarasik commented 6 years ago

@mgravell

This is very strange. Initially when I saw you second run I thought that the difference can be caused by different versions of BenchmarkDotNet and different measurements mechanics (in the first run I used 0.10.14) but after upgrade I am still getting really strange results - List<T> is faster in all scenarios (even when Presized=false).

There are also a few interesting points:

Benchmarks (100/500/1000 on CoreCLR with Presized: true, false)


BenchmarkDotNet=v0.11.0, OS=Windows 10.0.16299.547 (1709/FallCreatorsUpdate/Redstone3)
Intel Core i7-7700 CPU 3.60GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores
Frequency=3515624 Hz, Resolution=284.4445 ns, Timer=TSC
.NET Core SDK=2.1.302
  [Host] : .NET Core 2.1.2 (CoreCLR 4.6.26628.05, CoreFX 4.6.26629.01), 64bit RyuJIT
  Core   : .NET Core 2.1.2 (CoreCLR 4.6.26628.05, CoreFX 4.6.26629.01), 64bit RyuJIT

Job=Core  Runtime=Core  
Method Presized FillCount Mean Error StdDev Scaled ScaledSD Gen 0 Allocated
PopulateList False 100 379.7 ns 1.9369 ns 1.8118 ns 1.00 0.00 0.5244 2200 B
PopulateArrayList False 100 503.6 ns 1.7554 ns 1.6420 ns 1.33 0.01 0.0068 32 B
PopulateList False 500 1,355.6 ns 39.2060 ns 36.6734 ns 1.00 0.00 1.9980 8392 B
PopulateArrayList False 500 1,549.4 ns 5.0870 ns 4.7584 ns 1.14 0.03 0.0059 32 B
PopulateList False 1000 2,556.7 ns 17.5119 ns 14.6233 ns 1.00 0.00 3.9492 16608 B
PopulateArrayList False 1000 2,813.7 ns 52.7974 ns 49.3867 ns 1.10 0.02 0.0039 32 B
PopulateList True 100 181.2 ns 0.5938 ns 0.5555 ns 1.00 0.00 0.2058 864 B
PopulateArrayList True 100 263.5 ns 0.6528 ns 0.5787 ns 1.45 0.01 0.0073 32 B
PopulateList True 500 875.2 ns 3.0362 ns 2.6915 ns 1.00 0.00 0.9678 4064 B
PopulateArrayList True 500 1,104.5 ns 3.2272 ns 2.8608 ns 1.26 0.00 0.0059 32 B
PopulateList True 1000 1,712.4 ns 10.2909 ns 9.6261 ns 1.00 0.00 1.9180 8064 B
PopulateArrayList True 1000 1,984.3 ns 3.5040 ns 2.9260 ns 1.16 0.01 0.0039 32 B
mgravell commented 6 years ago

I agree that is unexpected. All I know is: what I get is what I get, and what you get is what you get.

benaadams commented 6 years ago

Probably from different amounts of RAM in the two machines; raised issue to add it to output https://github.com/dotnet/BenchmarkDotNet/issues/846

The additional costs from the ArrayPoolList are from renting and returning the arrays; and in the List from GC; otherwise the results should be mostly comparable.

mgravell commented 6 years ago

quite possibly; I've got 64GiB here (I really should see about that 128GiB upgrade...), but SQL Server and about a dozen redis servers in WSL will have been chewing up a lot of that. Plus... you know, two instances of devenv which are probably 10GiB each - and Chrome, so there goes another 20GiB... :)

OlegKarasik commented 6 years ago

@mgravell @benaadams I have 16 gigs of RAM and only ~5 gigs available.

@benaadams

The additional costs from the ArrayPoolList are from renting and returning the arrays

This can be the reason. I have checked the source of Rent and Return and looks like they do lot of work with Thread Local Storage and maintains storage on per core basis.

Do you know whether BenchmarkDotNet maintains Thread Affinity during the benchmarks? Can Thread migration be the reason?

benaadams commented 6 years ago

@OlegKarasik renting from the arraypool isn't faster than allocating; it reduces the pressure on the GC and has more predicable sustained performance than GCs and reduces the GC effects on the rest of the application that is allocating. Also adding a disposal check per Add is going to be slower in that function than List which doesn't have one.

@mgravell Feedback, AsSpan should probably be in MemoryMarshal rather than directly on the class as it is unsafe?

public class MemoryMarshal
{
    public Span<T> AsSpan<T>(this ArrayPoolList<T> list) {...}
}
OlegKarasik commented 6 years ago

@benaadams I just never thought that the speed difference is significant. Thanks for pointing this out.

karelz commented 6 years ago

@danmosemsft would it fit into your "PowerCollections" plan?

baal2000 commented 4 years ago

@karelz Do you think there is a possibility for this feature request to get a higher priority?

I do think something's got to happen sooner or later RE: collections pooling if not as part of the FW then 3-rd party. There is (at least) one community - based project already: https://github.com/jtmueller/Collections.Pooled that looks good but haven't seen any recent updates to its 2.0 "preview" release. Going to try pinging them there.

danmoseley commented 4 years ago

@GrabYourPitchforks any thoughts about this idea?

GrabYourPitchforks commented 4 years ago

I see this as akin to the ValueStringBuilder feature request. We're missing semantics in the language that would allow these sorts of constructs to be safe by design. There's some discussion in that linked issue about what those semantics might look like. Once that's addressed I imagine a natural design for an ArrayPoolList<T> feature would fall out naturally.

cmeyertons commented 4 years ago

It doesn't look there's been any mention of this in this thread, but there is a library out there Collections.Pooled that looks to support a lot of scenarios related to this and posts some very impressive benchmarks. Might be worth looking into?

We've started leveraging this library on our large calculation engine that ends up throwing a good amount on the LOH (arrays w/ multi-million counts of reference types)

I'm 0% associated with the library in question, but thought I would throw it out there!

baal2000 commented 4 years ago

@cmeyertons Collections.Pooled has been mentioned in this comment: https://github.com/dotnet/runtime/issues/27023#issuecomment-567935382

The project seemed quiet when I last looked at it. Issues are opened and then closed without being answered.

KalleOlaviNiemitalo commented 4 years ago

List<T> clears array elements that become unused, to prevent references in them from keeping objects alive longer than needed: https://github.com/dotnet/runtime/blob/1f4d0db2339c37d75723d063827fd2a4c6e2ecef/src/libraries/System.Private.CoreLib/src/System/Collections/Generic/List.cs#L880-L883

Should ArrayPoolList<T>.Dispose() likewise clear the array that it returns to the pool, if the elements contain references?

Perhaps it is best to use clearArray: RuntimeHelpers.IsReferenceOrContainsReferences<T>() initially and then file a separate proposal for ArrayPool<T>.Return(T[] array, int numberOfElementsToClear).