rust-lang / hashbrown

Rust port of Google's SwissTable hash map
https://rust-lang.github.io/hashbrown
Apache License 2.0
2.38k stars 275 forks source link

Insertion performance with arena allocators #515

Closed yanchith closed 5 months ago

yanchith commented 5 months ago

I've been building my own hashmap, and to my own surprise, it beat hashbrown on the following insertion benchmark:

hashbrown

    #[bench]
    fn bench_insert_hashmap_arena_512(b: &mut Bencher) {
        let data = make_bench_data(512);

        let mut m = allocate_memory();

        b.iter(|| {
            m.0.reset();

            let hasher = BuildHasherDefault::<ahash::AHasher>::default();
            let mut h: HashMap<u32, u32, _, _> = HashMap::with_hasher_in(hasher, &m.0);
            for &elem in &data {
                h.insert(elem, elem);
            }

            let len = h.len();
            mem::forget(h);

            len
        });
    }

hashtrie (my hashmap)

    #[bench]
    fn bench_insert_hashtrie_arena_512(b: &mut Bencher) {
        let data = make_bench_data(512);

        let mut m = allocate_memory();

        b.iter(|| {
            m.0.reset();

            let hasher = BuildHasherDefault::<ahash::AHasher>::default();
            let mut h: HashTrie<u32, u32, _, _> = HashTrie::with_hasher_in(hasher, &m.0);
            for &elem in &data {
                h.insert(elem, elem);
            }

            let len = h.len();
            mem::forget(h);

            len
        });
    }

Actually, hashbrown does not exhibit any improvement when using an arena allocator. It's insertion performance is the same as using Global.

I am wondering what could be done to improve this. AFAICT hashbrown doesn't call Allocator::grow, which could help here unless other allocations overlap. I also guess it needs to do some work after resizing, and it probably isn't possible to avoid this?

Note that doing with_capacity helps hashbrown a lot, but that isn't always an option.


For what it's worth, my hashmap is a Hash Trie based on this design: https://nullprogram.com/blog/2023/09/30/. It is arena friendly (doesn't grow, but rather allocates new buckets), and it doesn't support removal.

yanchith commented 5 months ago

Here's the benchmark results on the M1

test tests::insert_hashmap_0128        ... bench:       3,464 ns/iter (+/- 33)
test tests::insert_hashmap_0256        ... bench:       6,976 ns/iter (+/- 169)
test tests::insert_hashmap_0512        ... bench:      14,142 ns/iter (+/- 363)
test tests::insert_hashmap_1024        ... bench:      28,418 ns/iter (+/- 776)
test tests::insert_hashmap_2048        ... bench:      56,802 ns/iter (+/- 1,536)
test tests::insert_hashmap_4096        ... bench:     115,039 ns/iter (+/- 1,993)
test tests::insert_hashmap_8192        ... bench:     230,840 ns/iter (+/- 2,915)
test tests::insert_hashmap_arena_0128  ... bench:       3,334 ns/iter (+/- 51)
test tests::insert_hashmap_arena_0256  ... bench:       6,745 ns/iter (+/- 88)
test tests::insert_hashmap_arena_0512  ... bench:      13,930 ns/iter (+/- 352)
test tests::insert_hashmap_arena_1024  ... bench:      28,146 ns/iter (+/- 748)
test tests::insert_hashmap_arena_2048  ... bench:      56,707 ns/iter (+/- 1,419)
test tests::insert_hashmap_arena_4096  ... bench:     114,617 ns/iter (+/- 1,487)
test tests::insert_hashmap_arena_8192  ... bench:     230,836 ns/iter (+/- 2,066)
test tests::insert_hashtrie_0128       ... bench:       3,960 ns/iter (+/- 72)
test tests::insert_hashtrie_0256       ... bench:       8,371 ns/iter (+/- 74)
test tests::insert_hashtrie_0512       ... bench:      19,459 ns/iter (+/- 983)
test tests::insert_hashtrie_1024       ... bench:      38,765 ns/iter (+/- 1,812)
test tests::insert_hashtrie_2048       ... bench:      80,342 ns/iter (+/- 3,970)
test tests::insert_hashtrie_4096       ... bench:     170,788 ns/iter (+/- 13,437)
test tests::insert_hashtrie_8192       ... bench:     359,124 ns/iter (+/- 27,457)
test tests::insert_hashtrie_arena_0128 ... bench:         906 ns/iter (+/- 120)
test tests::insert_hashtrie_arena_0256 ... bench:       2,208 ns/iter (+/- 69)
test tests::insert_hashtrie_arena_0512 ... bench:       4,829 ns/iter (+/- 58)
test tests::insert_hashtrie_arena_1024 ... bench:      11,538 ns/iter (+/- 186)
test tests::insert_hashtrie_arena_2048 ... bench:      27,082 ns/iter (+/- 581)
test tests::insert_hashtrie_arena_4096 ... bench:      59,013 ns/iter (+/- 1,308)
test tests::insert_hashtrie_arena_8192 ... bench:     132,295 ns/iter (+/- 11,462)
Amanieu commented 5 months ago

We can't use Allocator::grow because growing the table requires re-hashing all elements and moving them into new slots. We need the destination table to be empty when we do this.

I don't think there is anything actionable that we can do here. Fundamentally, if you have a short-lived hash table then you should pre-allocate it using with_capacity to avoid unnecessary allocations while the table is being filled in.

yanchith commented 5 months ago

Makes sense. Thanks for clarifying!

Feel free to close this, if you feel like there's nothing that can be done.