martinus / robin-hood-hashing

Fast & memory efficient hashtable based on robin hood hashing for C++11/14/17/20
https://gitter.im/martinus/robin-hood-hashing
MIT License
1.5k stars 142 forks source link

making unordered_map automatically choose between unordered_node_map and unordered_flat_map is error-prone #145

Open cf-natali opened 2 years ago

cf-natali commented 2 years ago

Hi!

First, thanks for the amazing work!

I understand the motivation, but I think that having unordered_map try to automatically pick between the node and flat implementations is error prone:

https://github.com/martinus/robin-hood-hashing/blob/9145f963d80d6a02f0f96a47758050a89184a3ed/src/include/robin_hood.h#L2519

template <typename Key, typename T, typename Hash = hash<Key>,
          typename KeyEqual = std::equal_to<Key>, size_t MaxLoadFactor100 = 80>
using unordered_map =
    detail::Table<sizeof(robin_hood::pair<Key, T>) <= sizeof(size_t) * 6 &&
                      std::is_nothrow_move_constructible<robin_hood::pair<Key, T>>::value &&
                      std::is_nothrow_move_assignable<robin_hood::pair<Key, T>>::value,
                  MaxLoadFactor100, Key, T, Hash, KeyEqual>;

The problem is that both implementations have different performance characteristics, and more importantly different iterator/reference invalidation semantics.

For example AFAICT code like this:

m["a"] = m["b"];

is safe when inserting "a" into m causes a resizing with the node-based implementation, but not with the flat one, resulting in UB.

And more generally, code which works fine with the node-based one might break subtly with the flat one.

It can be quite confusing since subtle changes to the key or value types - which can occur for example when compiling with a different compiler/libraries - can cause a change in the underlying implementation used.