Closed jakoschiko closed 4 weeks ago
Regarding the inserts, the plan was to reduce the number of branches as much as possible. Being able to use it in const
context is more useful and couldn't detect any performance reduction from the benchmarks.
Would be great if you could rebase on master for the CI fixes :)
This seems to be a failed test this time. For some reason the ordering of debug print has changed.
Yes, I'll investigate this later. I'm not yet done with the implementation, thus this PR is on draft. First I'll need to think about some edge cases.
I'd would like to add more integration tests using proptest
. Should I add them to ./tests
or create another folder and crate in ./integration_tests
?
Create a new crate is the cleaner option I would say.
At the moment it look like we perform two checks for load rate. One at the start of insert and one after. Might be better to only ensure load rate the beginning with count + 1. The downside of this approach is that sometimes we reallocate even if we don't need. In theory at least. Size increase with the power of 2 so we rarely actually need to increase the size. Still might cause issues if you set a capacity and load rate but we still reallocate when we are on the capacity. Unless of course the initial capacity takes the extra element into account.
Just some thoughts, feels like I'm mind dumping a bit ^^
Create a new crate is the cleaner option I would say.
IMO this is a little bit unusual. But no problem, will do it. I'll create a separate PR for that.
At the moment it look like we perform two checks for load rate. One at the start of insert and one after. Might be better to only ensure load rate the beginning with count + 1. The downside of this approach is that sometimes we reallocate even if we don't need. In theory at least. Size increase with the power of 2 so we rarely actually need to increase the size. Still might cause issues if you set a capacity and load rate but we still reallocate when we are on the capacity. Unless of course the initial capacity takes the extra element into account.
I had similar thoughts. Though the second check is not really a check, more like a postponement of the actual check. And it can be implemented branchless because it's only about setting the bool
. However, I'm not feeling well making changes to these functions before adding more proptest
tests.
Just some thoughts, feels like I'm mind dumping a bit ^^
Always happy about that. Thanks!
I'm happy with the current state.
Benchmark results:
master (3668ca25d94a)
test tests::u64_get_built_in ... bench: 80,048.49 ns/iter (+/- 1,120.25)
test tests::u64_get_indexmap ... bench: 104,913.54 ns/iter (+/- 707.30)
test tests::u64_get_intmap ... bench: 12,750.83 ns/iter (+/- 102.93)
test tests::u64_insert_built_in ... bench: 108,822.23 ns/iter (+/- 434.90)
test tests::u64_insert_built_in_without_capacity ... bench: 282,702.10 ns/iter (+/- 2,019.04)
test tests::u64_insert_indexmap ... bench: 136,701.40 ns/iter (+/- 882.86)
test tests::u64_insert_intmap ... bench: 44,715.89 ns/iter (+/- 707.91)
test tests::u64_insert_intmap_checked ... bench: 42,648.63 ns/iter (+/- 1,921.58)
test tests::u64_insert_intmap_entry ... bench: 41,845.32 ns/iter (+/- 967.49)
test tests::u64_insert_intmap_without_capacity ... bench: 509,396.55 ns/iter (+/- 3,110.16)
test tests::u64_resize_intmap ... bench: 31,002.43 ns/iter (+/- 79.86)
PR branch (82eaac8da302)
running 11 tests
test tests::u64_get_built_in ... bench: 75,040.76 ns/iter (+/- 614.64)
test tests::u64_get_indexmap ... bench: 99,090.88 ns/iter (+/- 703.46)
test tests::u64_get_intmap ... bench: 11,704.59 ns/iter (+/- 149.24)
test tests::u64_insert_built_in ... bench: 108,497.07 ns/iter (+/- 206.74)
test tests::u64_insert_built_in_without_capacity ... bench: 262,929.10 ns/iter (+/- 679.97)
test tests::u64_insert_indexmap ... bench: 118,672.51 ns/iter (+/- 5,870.27)
test tests::u64_insert_intmap ... bench: 34,308.19 ns/iter (+/- 413.66)
test tests::u64_insert_intmap_checked ... bench: 37,221.59 ns/iter (+/- 551.35)
test tests::u64_insert_intmap_entry ... bench: 59,495.75 ns/iter (+/- 522.95)
test tests::u64_insert_intmap_without_capacity ... bench: 486,095.35 ns/iter (+/- 2,913.17)
test tests::u64_resize_intmap ... bench: 29,261.58 ns/iter (+/- 152.12)
Changes:
IntMap::new
is now const, thus it doesn't allocate memory.IntMap::with_capacity
doesn't allocate memory if capacity is zero. Previously it allocated space for at least one element.IntMap::insert
andIntMap::insert_checked
didn't need to increase the cache at the beginning. This changed. I added another fieldIntMap::needs_increase
to keep track if an increase is necessary. Now we handle this flag at the beginning of these functions and set it at the end.IntMap
properly. I added shortcuts at the beginning of these functions.Closed #49