mavam / libbf

:dart: Bloom filters for C++11
http://mavam.github.io/libbf
BSD 3-Clause "New" or "Revised" License
354 stars 89 forks source link

Have default hash function returns a `digest` #13

Open kamikat opened 8 years ago

kamikat commented 8 years ago

A more consistent signature of a default hash function.

mavam commented 8 years ago

The change to digest makes sense to me.

But you're introducing another subtle change silently: changing the digest type from size_t to uint64_t. So far, I'm going with the design of std::hash, which also returns size_t as a digest type, which is the most efficient unsigned type for a given architecture.

Frankly, the whole hashing design needs a major overhaul according to Howard's proposal.

kamikat commented 8 years ago

Oh... yes, the commit includes a change of digest's typedef to uint64_t to help clarify a storage property of digest. I'd better commit that separately.

size_t is semantically interpreted as a return type of sizeof, which should not be an appropriate type to describe a hash result. A digest type should be a better solution to that problem.

It comes to the definition of a digest. A type signature of uint64_t describes the storage property, which can help developer understanding digest type.

mavam commented 8 years ago

size_t is semantically interpreted as a return type of sizeof, which should not be an appropriate type to describe a hash result. A digest type should be a better solution to that problem.

I'm not saying that size_t is the best choice, it's just what's used by std::hash to date, and that was the rationale to go with it intially. If you look at the design of the standard library, you will see that size_t is used for all sorts of "size" related computations, e.g., std::vector<T>::size.

I fully agree that size_t should not be the digest type, it just lends itself as natural fit in the very awkward design of the C++ hashing.

I will merge the PR if you remove the single uint64_t typedef. And thanks for contributing! :)

kamikat commented 8 years ago

Thank you for the explanation. It seems that should be a design issue about C++ hashing.