jtmueller / Collections.Pooled

Fast, low-allocation ports of List, Dictionary, HashSet, Stack, and Queue using ArrayPool and Span.
MIT License
547 stars 48 forks source link

Performance of dictionary #5

Closed dzmitry-lahoda closed 5 years ago

dzmitry-lahoda commented 5 years ago

I looked into dozen of dict benchmarks. I do not see when pooled is faster. In some cases it is even slower for some reason. Could you please document?

jtmueller commented 5 years ago

@dzmitry-lahoda Pooled dictionary is marginally slower for some operations. The main performance benefit is when adding many elements to a dictionary that did not have its capacity explicitly set. In that scenario, adding can be as much as 40-50% faster.

The other benefit of PooledDictionary is memory allocations. This doesn't relate directly to the performance of any specific operation, but if you're creating and destroying many instances of the same type of dictionary, or adding a very large number of items to dictionaries, PooledDictionary gives you far less memory allocated, resulting in fewer GC pauses and memory spikes. This is in my opinion the primary motivation one would have to use PooledDictionary over Dictionary.

If you have any specific suggestions on how to speed up the PooledDictionary operations that are slower than the equivalent Dictionary operations, I would welcome a pull request. I've gotten the performance as close as I can, for the moment, figure out how to.

There's similar issues in PooledList: the indexer-get operation is 9-10 times slower than that of List, despite being the exact same code, as far as I can tell! I haven't found any way of closing that gap other than removing the bounds-check, which isn't a good idea. I suspect some sort of List-specific optimization in the framework that PooledList can't benefit from. However, indexing into PooledList.Span is 84% faster than indexing into List, so there's that.

Right now I'm focused mainly on finishing a few more of the collections I want to convert (Stack, Hashset) because I need to use them, but I would welcome feedback on what sort of documentation you're hoping to see for this sort of thing.

dzmitry-lahoda commented 5 years ago

When I do approach libraries what I want to see it that why should I use it. I used first hand Pooled list because it does saves some GC and has Span view (good for networked games). So, the text you have written in reply is kind of 50% of documentation which could-should probably in readme. It will allow any developer to evaluate use vs not without digging into details. So it is possibly close issues or convert, what may be considered stable and important to part of readme.

dzmitry-lahoda commented 5 years ago

Seems I have found root cause of #8 . Could you point me to some cases when PooledDictionary is slower?

jtmueller commented 5 years ago

The GetItem benchmark could stand to be better. It's a very common use-case, and it's about 20-30% slower in PooledDictionary. I'm hopeful that your ThrowHelper find will also improve this case.

Results: https://github.com/jtmueller/Collections.Pooled/blob/master/docs/benchmarks/netcoreapp2.2/Dict_GetItem-report-github.md

Benchmark code: https://github.com/jtmueller/Collections.Pooled/blob/master/Collections.Pooled.Benchmarks/PooledDictionary/Dict.GetItem.cs