Overloading the hash function

GoogleCodeExporter commented 8 years ago

When using SparseHash as a container to store user-defined objects, it is 
necessary to specialize the hash() function for that class of objects.  
Unfortunately, it is difficult to do this in a generic way (different compilers 
use different syntax).

It would be great if we could use the #defines that were set in 
src/google/sparsetable/sparseconfig.h during configuration.

HASH_NAMESPACE and SPARSEHASH_HASH are already defined.  However, we're still 
missing something in the case where HASH_NAMESPACE is std::tr1.  In that case, 
the syntax to specialize the hash() function is:

namespace std {
  namespace tr1 {
    template<>
    struct hash<...> {
      size_t operator()(....) const {
          return ...;
      }
    };
  }
}

It would be great if we could write the following

HASH_FUN_OPEN_NAMESPACE
  template<>
  struct HASH_FUN<...> {
    size_t operator()(....) const {
      return ...;
    }
  };
HASH_FUN_CLOSE_NAMESPACE

where of course

-HASH_FUN would be the hash function (hash for GCC, hash_compare for VS).
-HASH_FUN_OPEN_NAMESPACE would be "namespace std { namespace tr1 {" for newer 
versions of GCC, "namespace __gnu_cxx {" for older versions of GCC, or 
"namespace stdext {" for VS.
-HASH_FUN_CLOSE_NAMESPACE would be either "}" or "} }".

Note that there is currently a SPARSEHASH_HASH definition that includes both 
the namespace and the hash function.  However, you cannot use this information 
to specialize the hash function.

It would be really great if you could provide us with such definitions.  It 
would guarantee users that they are using exactly the same hash function as 
SparseHash, and it would save them from the burden of figuring out the 
namespace themselves.

I know that the sparseconfig.h file warns against using these definitions in 
user code.  However, I think it would be great if SparseHash could provide the 
user with definitions to specialize the hash() function in a portable way.

Original issue reported on code.google.com by jan.fost...@gmail.com on 7 Jun 2011 at 2:47

GoogleCodeExporter commented 8 years ago

Thanks for the suggestion!  I don't know that we want to get into the business 
of providing compatibility shims.  It's nice that the STL separates out the 
concept of a hash functor from the hashtable implementation, so we can provide 
a new hashtable implementation without needing to worry about the functor.  But 
as you noticed, it's not super-nice, since the hashtable implementation 
actually depends on the API of the hash functor, which differs amongst STLs.

sparse_hash_* and dense_hash_* require a particular API, which we document.  So 
I'm not sure how much the flexibility you describe would help the program.

Actually, thinking about it a bit more: I'm not a fan of overloading hash<> at 
all -- it causes all sorts of problems, like you point out.  It's better just 
to declare your own functor, however you'd like, and pass it as a template 
argument to sparse_hash_set<> or dense_hash_set<>.  So I'd rather not add any 
functionality that makes it easier to overload hash<>.
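
A minimal sketch of that approach, for anyone landing on this thread later.  
The Employee type, both functors, and the helper are hypothetical names made up 
for illustration, and std::unordered_set (C++11) stands in for the container so 
that the snippet compiles without sparsehash installed.  The assumption is that 
the same functors can be passed as the hash and equality template arguments of 
sparse_hash_set<> or dense_hash_set<> (the latter also needs an empty key set 
before use).

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical user-defined key type, for illustration only.
struct Employee {
  std::string name;
  int id;
};

// Equality functor: two employees are equal if both fields match.
struct EmployeeEq {
  bool operator()(const Employee& a, const Employee& b) const {
    return a.id == b.id && a.name == b.name;
  }
};

// Hash functor: lives in user code, so no std/tr1/__gnu_cxx/stdext
// namespace has to be reopened and no #ifdefs are needed.
struct EmployeeHash {
  std::size_t operator()(const Employee& e) const {
    // Simple multiplicative hash over the name, seeded with the id.
    std::size_t h = static_cast<std::size_t>(e.id);
    for (unsigned char c : e.name) {
      h = h * 31 + c;
    }
    return h;
  }
};

// Stand-in container.  With sparsehash, the assumption is that this becomes
// something like sparse_hash_set<Employee, EmployeeHash, EmployeeEq>.
typedef std::unordered_set<Employee, EmployeeHash, EmployeeEq> EmployeeSet;

// Hypothetical helper: returns true if the employee was newly inserted.
inline bool insert_employee(EmployeeSet& s, const std::string& name, int id) {
  Employee e = {name, id};
  return s.insert(e).second;
}
```

Because the functor is an ordinary struct in user code, the container never 
looks up hash<Employee>, so it does not matter which namespace the STL's hash<> 
lives in.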

Original comment by csilv...@gmail.com on 8 Jun 2011 at 1:01

Changed state: WontFix
Added labels: Priority-Low, Type-Enhancement

GoogleCodeExporter commented 8 years ago

You're absolutely 100% right.  Providing a user-defined hash function as a 
template parameter to the containers works on my compilers, without having to 
use any #ifdefs whatsoever.

Original comment by jan.fost...@gmail.com on 8 Jun 2011 at 7:54

epitzer / sparsehash

Overloading the hash function #69