jtmueller / Collections.Pooled

Fast, low-allocation ports of List, Dictionary, HashSet, Stack, and Queue using ArrayPool and Span.
MIT License
547 stars 48 forks source link

PooledList: the indexer-get operation is 9-10 times slower than that of List, but Span is fine #8

Closed dzmitry-lahoda closed 5 years ago

dzmitry-lahoda commented 5 years ago

PooledList: the indexer-get operation is 9-10 times slower than that of List, despite being the exact same code, as far as I can tell! I haven't found any way of closing that gap other than removing the bounds-check, which isn't a good idea. I suspect some sort of List-specific optimization in the framework that PooledList can't benefit from. However, indexing into PooledList.Span is 84% faster than indexing into List, so there's that.

https://github.com/jtmueller/Collections.Pooled/blob/master/docs/benchmarks/netcoreapp2.2/List_Indexer_Types-report-github.md

I will look into this.

dzmitry-lahoda commented 5 years ago

Seems I found https://github.com/dotnet/corefx/blob/21e5d0f2f42110ae4500577588948d48346ff322/src/Common/src/CoreLib/System/ThrowHelper.cs . Would you be able to fix that? I.e. use ThrowHelper in indexer rather than exception? I guess there are other places to fix, not only one. So may be work on issues as whole?

dzmitry-lahoda commented 5 years ago

Making _items makes runs as fast as span. I am running with index inlining and will run with throw than.

dzmitry-lahoda commented 5 years ago

Did not help on indexer

       [MethodImpl(MethodImplOptions.AggressiveInlining)]
            get
            {

But Throw helped!!!!

                  Method |       Mean |    Error |    StdDev | Ratio | RatioSD |
------------------------ |-----------:|---------:|----------:|------:|--------:|
         ListIndexer_Int | 1,875.4 ns | 37.47 ns |  94.00 ns |  1.00 |    0.00 |
       PooledIndexer_Int | 1,869.6 ns | 37.35 ns |  71.95 ns |  0.99 |    0.07 |
 PooledIndexer_items_Int |   528.0 ns | 12.86 ns |  37.91 ns |  0.28 |    0.03 |
      ListIndexer_String | 3,756.4 ns | 73.98 ns | 160.83 ns |  2.00 |    0.12 |
    PooledIndexer_String | 3,803.3 ns | 91.27 ns | 269.11 ns |  2.02 |    0.17 |
     public T this[int index]
        {
            get
            {
                // Following trick can reduce the range check by one
                if ((uint)index >= (uint)_size)
                {
                    ThrowHelper.ThrowArgumentOutOfRange_IndexException();
                }
                return _items[index];
            }

        internal static void ThrowArgumentOutOfRange_IndexException()
        {
            throw new ArgumentOutOfRangeException("index");
        }
jtmueller commented 5 years ago

Wow, nice catch! I took all the ThrowHelper calls out because it was one less thing to port over and I didn't realize there were performance implications. (To be frank, I assumed the ThrowHelper stuff had more to do with localization of error messages than performance, and I didn't want to port an internal localization system over on top of everything else.) I would have thought that an exception I wasn't even throwing because the index was valid could not affect performance like that. I've learned something new!

I'll definitely be pulling a copy of ThrowHelper into Collections.Pooled and restoring the calls to it. Then I'll re-run all the benchmarks.

jtmueller commented 5 years ago

I've checked in updates that add ThrowHelper back to both PooledList and PooledDictionary, and in every benchmark I've run so far this has fixed the performance regressions. I've yet to see a case where the pooled collection fails to have as-good-or-better performance. Thanks again for figuring out what I was overlooking!

I haven't published an updated NuGet package yet. I want to re-run all the benchmarks on both platforms and check for any nasty surprises first, and I probably won't be done with that until sometime tomorrow in my time zone.