skarupke / flat_hash_map

A very fast hashtable

Unsafe with attacker controlled keys #17

Open strigeus opened 5 years ago

strigeus commented 5 years ago

By carefully picking keys, it is possible to trigger explosive memory usage: the hash table grows on every insertion, eventually resulting in an out-of-memory condition.

Pseudo code:

static constexpr size_t jump_distances[]
{
  0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,

  21, 28, 36, 45, 55, 66, 78, 91, 105, 120, 136, 153, 171, 190, 210, 231,
  253, 276, 300, 325, 351, 378, 406, 435, 465, 496, 528, 561, 595, 630,
  666, 703, 741, 780, 820, 861, 903, 946, 990, 1035, 1081, 1128, 1176,
  1225, 1275, 1326, 1378, 1431, 1485, 1540, 1596, 1653, 1711, 1770, 1830,
  1891, 1953, 2016, 2080, 2145, 2211, 2278, 2346, 2415, 2485, 2556,

  3741, 8385, 18915, 42486, 95703, 215496, 485605, 1091503, 2456436,
  5529475, 12437578, 27986421, 62972253, 141700195, 318819126, 717314626,
  1614000520, 3631437253, 8170829695, 18384318876, 41364501751,
  93070021080, 209407709220, 471167588430, 1060127437995, 2385287281530,
  5366895564381, 12075513791265, 27169907873235, 61132301007778,
  137547673121001, 309482258302503, 696335090510256, 1566753939653640,
  3525196427195653, 7931691866727775, 17846306747368716,
  40154190394120111, 90346928493040500, 203280588949935750,
  457381324898247375, 1029107980662394500, 2315492957028380766,
  5209859150892887590,
};

// assume STL has a crappy hash function
struct hasher { size_t operator()(const size_t& _Keyval) const { return _Keyval; } };

template<typename T>
size_t invert_fibonacci_hash(T &a, size_t y) {
  // Return a value x such that fibonacci_hash_policy::index_for_hash(x) returns y.
  // The implementation is not provided here, but if you change
  // fibonacci_hash_policy::index_for_hash to "return hash & num_slots_minus_one;"
  // you can just return y.
}

ska::bytell_hash_map<size_t, int, hasher> hashmap;

// Consumes around 2GB after 19 iterations and crashes with out of memory
for (size_t i = 8; i < 32; i++) {
  for (size_t j = 0; j < 126; j++)
    hashmap[invert_fibonacci_hash(hashmap, jump_distances[j])] = 1;
  hashmap[invert_fibonacci_hash(hashmap, (size_t)1 << i)] = 1;
}
Bulat-Ziganshin commented 5 years ago

It may be considered good behavior: under attack the process simply "explodes", and the OS will probably terminate it for using too much RAM.

An alternative attack would carefully select keys so that they all hash into the same bin.

If you really need protection against DoS attacks of this kind, use a hash function specifically designed for that purpose, such as SipHash. You could also use SHA-2, but it will be slower.
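
A minimal sketch of that suggestion, assuming a per-process random seed mixed into each key; the mixing step below is a splitmix64-style stand-in for illustration, not SipHash itself, and the seeded_hasher name is hypothetical:

#include <cstdint>
#include <random>

// Illustrative keyed hasher: mixes a per-process random seed into the key so
// an attacker cannot precompute bucket placement. For real hash-flooding
// mitigation, substitute a keyed hash designed for it, such as SipHash.
struct seeded_hasher
{
    std::size_t operator()(std::size_t key) const
    {
        static const std::size_t seed = std::random_device{}();
        std::size_t x = key ^ seed;
        x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ull;  // splitmix64 finalizer
        x ^= x >> 27; x *= 0x94d049bb133111ebull;
        x ^= x >> 31;
        return x;
    }
};

// Usage: ska::bytell_hash_map<std::size_t, int, seeded_hasher> map;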

portaloffreedom commented 1 year ago

// assume STL has a crappy hash function
struct hasher { size_t operator()(const size_t& _Keyval) const { return _Keyval; } };

https://github.com/gcc-mirror/gcc/blob/713ec97e593bd4d9915a13bc4047f064fec0e24a/libstdc%2B%2B-v3/include/bits/functional_hash.h#L114-L122

I have bad news: this is exactly the case on Linux (libstdc++), and it triggered an out-of-memory crash just by using [-10, -9, -8, ..., 8, 9, 10] as keys.
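
A minimal sketch of that reproduction, assuming libstdc++'s identity std::hash for integers (linked above) and the bytell_hash_map.hpp header from this repository; as reported, on the affected configuration the loop can exhaust memory before it finishes:

#include <cstdio>
#include "bytell_hash_map.hpp"

int main()
{
    // With libstdc++, std::hash<int>{}(x) is just static_cast<std::size_t>(x),
    // so these keys reach the table's Fibonacci hash policy completely unmixed.
    ska::bytell_hash_map<int, int> map;
    for (int key = -10; key <= 10; ++key)
        map[key] = 1;
    std::printf("size = %zu\n", map.size());
}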

portaloffreedom commented 1 year ago

As far as I can tell, boost has exactly the same behaviour: https://github.com/boostorg/container_hash/blob/53c12550fa11221975f58a6c23581b4563153e04/include/boost/container_hash/hash.hpp#L363-L367

template <> struct basic_numbers<int> :
    boost::hash_detail::enable_hash_value {};

template <typename T>
typename boost::hash_detail::basic_numbers<T>::type hash_value(T v)
{
    return static_cast<std::size_t>(v);
}
sh1boot commented 5 months ago

This seems easy to fix: permute the hashes against a per-table constant. Every time you rehash, you change the constant; if the load factor doesn't demand it, just rehash without growing the table, right?
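
A minimal sketch of that idea, not code from this repository; the names and the mixing step are illustrative, and the real change would live inside the table's hash policy:

#include <cstdint>
#include <random>

// Per-table hash permutation as proposed above: every hash is mixed with a
// table-local salt, and the salt changes on every rehash, so a key set tuned
// to one layout loses its structure after the table reseeds.
struct salted_policy
{
    std::size_t salt = std::random_device{}();

    std::size_t permute(std::size_t hash) const
    {
        std::size_t x = hash ^ salt;
        x ^= x >> 33;
        x *= 0xff51afd7ed558ccdull;  // a MurmurHash3 finalizer constant
        x ^= x >> 33;
        return x;
    }

    // Called when rehashing. If the load factor does not require more slots,
    // the table can be rebuilt at the same size, just with a new salt.
    void reseed() { salt = std::random_device{}(); }
};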