JesperAxelsson / rust-intmap

Specialized hashmap for u64 keys
MIT License
30 stars 10 forks source link

Add Entry API #27

Closed jakoschiko closed 2 years ago

jakoschiko commented 2 years ago

tl;dr I implemented an Entry API. The implementation uses unsafe. If that's not okay, please panic.

Solves #4

API changes

I implemented an Entry API very similar to std::collections::hash_map::Entry. I tried to focus on the most important functions, you can add more later. Especially I ignored all key related functions because in our case the key is a simple Copy type and does not need to be exposed by Entry.

Changes:

Entry implementation

It's actually very hard to implement Entry. I got a lot of lifetime issues (I look forward to Polonius...). And additional lookups to the cache were necessary. Unfortunately the performance got really bad and I was forced to use unsafe. I tried to ensure the safety as good as possible, but if you don't like unsafe in your library then we have a problem.

Other implementation changes

Performance

Because I implemented IntMap::insert with Entry, I was able to reuse your benchmarks to measure the performance. At the beginning the performance was really bad. Then I split up some functions, experimented with #[inline] and #[cold] and finally used unsafe, and now the performance looks quite good.

Before (commit de4c32a554ef869f34e6d69a1ed6bde0c0f24390):

test tests::u64_get_built_in    ... bench:      13,160 ns/iter (+/- 76)
test tests::u64_get_intmap      ... bench:       1,709 ns/iter (+/- 6)
test tests::u64_get_ordermap    ... bench:      17,592 ns/iter (+/- 95)
test tests::u64_insert_built_in ... bench:      16,487 ns/iter (+/- 79)
test tests::u64_insert_intmap   ... bench:       3,760 ns/iter (+/- 8)
test tests::u64_insert_ordermap ... bench:      18,670 ns/iter (+/- 87)

After (commit bfb1bdb3bdf3e248fb4a9ae4876c3fb571371f17):

test tests::u64_get_built_in    ... bench:      13,175 ns/iter (+/- 144)
test tests::u64_get_intmap      ... bench:       1,624 ns/iter (+/- 5)
test tests::u64_get_ordermap    ... bench:      17,716 ns/iter (+/- 184)
test tests::u64_insert_built_in ... bench:      19,674 ns/iter (+/- 174)
test tests::u64_insert_intmap   ... bench:       3,769 ns/iter (+/- 31)
test tests::u64_insert_ordermap ... bench:      18,692 ns/iter (+/- 142)
JesperAxelsson commented 2 years ago

First of all, thanks for taking your time to implement this!

I'm still pretty inexperienced with Rust and especially unsafe code. So yeah, the unsafe stuff does make me a bit uncomfortable ^^

My plan is to take a closer look this weekend or next week as time allows and compare to the official HashMap implementation. And to better understand the Entry api as well.

JesperAxelsson commented 2 years ago

Sorry for the delay in responding.

So some thoughts:

In the Entry implementation I think it might be better to just use the whole IntMap as argument to new, you already send most of the data anyway. This would also mean you can use the functions on the IntMap rather then having them as free functions.

In Insert, get_mut and remove I would still prefer the old implementation. While the speed is still the same you are putting more object on the stack then necessary.

Now on to the bugbear: unsafe! From my benchmarks I could not tell a difference between the unsafe and safe version. My suggestion would be to create a safe version that is enabled by default and hide the unsafe version behind a feature flag. This would also alleviate those who have string feelings against unsafe code.

If you have lost interest in the project (which I would understand) I might implement these changes myself when I get the time.

jakoschiko commented 2 years ago

I have currently no time for this, I would be glad if you could take over the task.