martinus / unordered_dense

A fast & densely stored hashmap and hashset based on robin-hood backward shift deletion
MIT License
898 stars 72 forks source link

Slow performance with Custom Type #118

Closed DanWillans closed 4 days ago

DanWillans commented 5 months ago

Hi,

I'm testing ankerl::unordered_dense::map against std::unordered_map for my custom type. When I profile std::unordered_map against ankerl::unordered_dense::map it's faster, am I using it incorrectly? As you can see in the code below for the std::hash specialisation I'm just returning the uint64_t because I can guarantee they're unique but in the custom_hash_unique_object_representation I'm instantiating wyhash.

If I just use a uint64_t ankerl is faster as the documentation and benchmarking shows.

Example code struct and hash for both std and ankerl.

  struct dan{
    uint64_t id;
    auto operator==(dan const& other) const -> bool {
        return id == other.id;
    }
  };

  namespace std {
  template<> struct hash<dan>
  {
    uint64_t operator()(const dan& id) const noexcept { return id.id; }
  };
  }// namespace std

  struct custom_hash_unique_object_representation {
      using is_avalanching = void;

      [[nodiscard]] auto operator()(dan const& f) const noexcept -> uint64_t {
          static_assert(std::has_unique_object_representations_v<dan>);
          return ankerl::unordered_dense::detail::wyhash::hash(&f, sizeof(f));
      }
  };

// Alternatively, doing the same as std::hash means very slow insertion into the map.
  struct custom_hash_unique_object_representation {
      [[nodiscard]] auto operator()(dan const& f) const noexcept -> uint64_t {
          return f.id;
      }
  };

Example code of usage

  constexpr int entity_count = 1000000;
  ankerl::unordered_dense::map<dan, uint64_t, custom_hash_unique_object_representation> ankerl_map;
  std::unordered_map<dan, uint64_t> std_map;

  for(uint64_t i = 0; i < entity_count; ++i){
    ankerl_map[dan{i}] = i;
    std_map[dan{i}] = i;
  }

  Timer timer_5("ankerl_map_find");
  timer_5.ResetStart();
  for(uint64_t i = 0; i < entity_count; ++i){
    auto& hey = ankerl_map[dan{i}];
    hey = 1;
  }
  timer_5.CaptureTimePoint();
  timer_5.PrintAverageTime();

  Timer timer_6("std_map_find");
  timer_6.ResetStart();
  for(uint64_t i = 0; i < entity_count; ++i){
    auto& hey = std_map[dan{i}];
    hey = 1;
  }
  timer_6.CaptureTimePoint();
  timer_6.PrintAverageTime();
martinus commented 4 days ago

Your benchmark is extremely simplistic, you could basically just use an std::vector instead. unordered maps are good at random stuff. Also, use this hash:

struct hash_a {
    using is_avalanching = void;

    [[nodiscard]] auto operator()(dan const& f) const noexcept -> uint64_t {
        return ankerl::unordered_dense::hash<uint64_t>{}(f.id);
    }
};

And use the same hash for both maps. Try a more real-live example than simply iterating a map that has inserted !