pmem / libpmemobj-cpp

C++ bindings & containers for libpmemobj
https://pmem.io
Other
107 stars 76 forks source link

Add std::hash<pmem::obj::string> specialization #1007

Open karczex opened 3 years ago

karczex commented 3 years ago

FEAT: Add std::hash\<pmem::obj::string>

Rationale

It's needed to easily use pmem::obj::concurrent_hash_map with pmem::obj::string as a key.

API Changes

Add Default implementation of hasher for pmem::obj::string. Currently user have to provide custom one.

Implementation details

Probably, it would be enough to move implementation of fibonacci hashing from tests

https://github.com/pmem/libpmemobj-cpp/blob/2df10e8a29f130808fa79750b9b4fe8f6fd5fba2/tests/concurrent_hash_map/concurrent_hash_map_string_test.hpp#L38

jan-konczak-cs-put commented 3 years ago

Probably, it would be enough to move implementation of fibonacci hashing from tests

That's a terrible idea from my (==user) point of view.

If you do, then please add a hash that enables lookups and insertions by std::string or std::string_view rather than implementing an incompatible one. So far this works as a cheap replacement:

struct MyComp {
    template <typename T>
    size_t operator()(T a, const pm::string & b) const {
        return a == std::string_view(b.data(), b.length());
    }
};

struct MyHash{
    auto operator()(const pmem::obj::string & str) const {
        return operator()(std::string_view(str.data(), str.length()));
    }
    template<typename T>
    size_t operator()(T str) const {
        return std::hash<T>{}(str);
    }
    typedef MyComp transparent_key_equal;
};

and this allows for:

pm::concurrent_hash_map<pm::string, pm::string, MyHash, MyComp> kvmap;

std::string key, value;

decltype(kvmap)::const_accessor acc;
bool found = kvmap.find(acc, key);

kvmap.insert_or_assign(key, value);
igchor commented 3 years ago

@jan-konczak-cs-put right, pmem::obj::string and std::string with the same content should hash to the same value. For C++17 we can just implement std::hash<pmem::obj::string> as you did. The problem is only for older standards (there is no easy way to use existing std hash on a sequence of characters). But maybe, we don't have to provide it for older standards...