martinus / robin-hood-hashing

Fast & memory efficient hashtable based on robin hood hashing for C++11/14/17/20
https://gitter.im/martinus/robin-hood-hashing
MIT License
1.5k stars 143 forks source link

Complexity of the swap method #60

Closed asbai closed 4 years ago

asbai commented 4 years ago

According to the standard, the swap method of unordered_set and unordered_map should be constant O(1). See: https://en.cppreference.com/w/cpp/container/unordered_set/swap and https://en.cppreference.com/w/cpp/container/unordered_map/swap

But it looks like linear O(N) complexity in robin_hook?

    // Swaps everything between the two maps.
    void swap(Table& o) {
        ROBIN_HOOD_TRACE(this);
        using std::swap;
        swap(o, *this);
    }

Can it be optimized to O(1) complexity? Thanks :-)

martinus commented 4 years ago

Hi @asbai, the swap already is O(1). I took your issue as an opportunity to demonstrate this with nanobench. I've added a benchmark that calculates complexity of swap: https://github.com/martinus/robin-hood-hashing/blob/f577ca93efbbbe89793d03159e9ccfe9321c474f/src/test/unit/bench_swap.cpp#L18-L27

This benchmarks swap() with differently sized maps. It can be clearly seen that swap performance is always constant at 11.27 ns/op:

complexityN ns/op op/s err% ins/op cyc/op IPC total benchmark
10 11.27 88,736,885.89 0.0% 63.01 35.98 1.751 0.00 swap <robin_hood::unordered_flat_map<uint64_t, uint64_t>>
30 11.27 88,728,648.87 0.0% 63.01 36.01 1.750 0.00 swap <robin_hood::unordered_flat_map<uint64_t, uint64_t>>
70 11.28 88,658,752.21 0.1% 63.01 36.02 1.749 0.00 swap <robin_hood::unordered_flat_map<uint64_t, uint64_t>>
150 11.29 88,580,392.83 0.1% 63.01 36.06 1.747 0.00 swap <robin_hood::unordered_flat_map<uint64_t, uint64_t>>
310 11.29 88,578,516.95 0.1% 63.01 36.05 1.748 0.00 swap <robin_hood::unordered_flat_map<uint64_t, uint64_t>>
630 11.27 88,723,132.79 0.0% 63.01 36.00 1.750 0.00 swap <robin_hood::unordered_flat_map<uint64_t, uint64_t>>
1,270 11.27 88,731,841.38 0.0% 63.01 36.00 1.750 0.00 swap <robin_hood::unordered_flat_map<uint64_t, uint64_t>>
2,550 11.27 88,734,786.30 0.0% 63.01 36.01 1.750 0.00 swap <robin_hood::unordered_flat_map<uint64_t, uint64_t>>
5,110 11.27 88,736,885.89 0.0% 63.01 35.99 1.751 0.00 swap <robin_hood::unordered_flat_map<uint64_t, uint64_t>>
10,230 11.27 88,735,566.10 0.0% 63.01 35.99 1.751 0.00 swap <robin_hood::unordered_flat_map<uint64_t, uint64_t>>

Additionally, this data can be used to fit different complexity curves to the data. Unsurprisingly, the best fitting curve with an error of only 0.1% is O(1):

coefficient err% complexity
1.12747e-08 0.1% O(1)
1.15522e-09 33.8% O(log n)
1.64601e-12 83.8% O(n)
1.20108e-13 85.6% O(n log n)
1.34512e-16 91.3% O(n^2)
1.18374e-20 93.4% O(n^3)

So no need to worry, swap already is O(1).

asbai commented 4 years ago

Ok, it's my fault. I'm not noticed that you have a Table& operator=(Table&& o) method, sorry~

I've only seen the Table& operator=(Table const& o) one what is immediately above the swap method :-)