skarupke / flat_hash_map

A very fast hashtable
1.69k stars 186 forks source link

a bug in emplace_new_key() at line 824 flat_hash_map.hpp #23

Open shanbaoyin opened 5 years ago

shanbaoyin commented 5 years ago

I have learned your code, and is there a bug in emplace_new_key() at line 824 flat_hash_map.hpp? I'm not sure, pls help to check. i think the logic of emplace_new_key(int8_t distance_from_desired, EntryPointer current_entry, Key && key, Args &&... args) is to put args to current_entry and find a new bucket for current_entry->value. At line 861, you swap to_insert with result.current->value, code as below: if (distance_from_desired == max_lookups) { swap(to_insert, result.current->value); grow(); return emplace(std::move(to_insert)); } may be logic is wrong in following case. Given there are 7 buckets, the content of each bucket as below:

bucket id content
bucket1 DIB is 0, value is A
bucket2 DIB is 1, value is B
bucket3 DIB is 0, value is C
bucket4 DIB is 1, value is D
bucket5 DIB is 2, value is E
bucket6 DIB is 3, value is F
bucket7 DIB is 4, value is G

Given param 'current_entry' of emplace_new_key() is point to bucket1, to_insert value is X, and max_lookups is 4. so the code logic is

  1. swap(to_insert, current_entry->value); so bucket1's value is X, and to_insert value is A, iterator result point to bucket1.
  2. current_entry shift to bucket2, but DIB(A) is 1, so move to next;
  3. current_entry shift to bucket3, swap(to_insert, current_entry->value), now bucket3's value is A, and to_insert value is C.
  4. find a new bucket to store C, and program will reach condition 'distance_from_desired == max_lookups' and run to line 861, now swap(to_insert, result.current->value) will store C to bucket1, i think it's a bug to store C to bucket1. For value C, it can't store in bucket before bucket3.
skarupke commented 5 years ago

It's a bit weird, but it's not a bug because the container re-allocates right after the swap. So yes, in your step 4 the invariants of the container are violated, but that's fine because step 5 is "grow the container and throw all the old memory away" and that restores all the invariants.

Now you might have a point though if the allocator throws when the container tries to grow. In that case that code might leave things in the wrong order. It's an interesting case. Not at all easy to handle since at this point a lot of modifications could have already happened.

The only thing I can think of doing would be to put a try/catch around the grow call like this: try { grow(); } catch(...) { rebuild_entire_container_using_current_memory(); throw; }

where the function rebuild_entire_container_using_current_memory() would have to be written. It would be pretty slow, but it would only trigger if you run out of memory...

I'll have to think about it for a while. The container is not great about handling exceptions, but I do try to handle the case of the allocator throwing and of the object throwing when it's first being created. So I should handle this one...

shanbaoyin commented 5 years ago

It's a bit weird, but it's not a bug because the container re-allocates right after the swap. So yes, in your step 4 the invariants of the container are violated, but that's fine because step 5 is "grow the container and throw all the old memory away" and that restores all the invariants.

Now you might have a point though if the allocator throws when the container tries to grow. In that case that code might leave things in the wrong order. It's an interesting case. Not at all easy to handle since at this point a lot of modifications could have already happened.

The only thing I can think of doing would be to put a try/catch around the grow call like this: try { grow(); } catch(...) { rebuild_entire_container_using_current_memory(); throw; }

where the function rebuild_entire_container_using_current_memory() would have to be written. It would be pretty slow, but it would only trigger if you run out of memory...

I'll have to think about it for a while. The container is not great about handling exceptions, but I do try to handle the case of the allocator throwing and of the object throwing when it's first being created. So I should handle this one...

@skarupke , Thanks for your replay. Yes, this problem only happens if grow() throws exception. If we record element's original position before doing swap(), so we could restore container's state before execute grow(), and then there is no need to catch exception of grows() because the container keep it's original state. And i think the record does not affect performance because max_lookups is small.