Ch_05: Graph Coloring- Hashing and Uniqueness query

mdalvi commented 6 years ago

Hello Clinton,

This is regarding,

Chapter 05: Graph Coloring Subsection 5.3 Rule under Hashing and Uniqueness

You have mentioned an implementation where you multiply the hash of state code with prime and XOR it with the hash of adjacent state code. @ return hash(self.Node) * 397 ^ hash(self.Adjacent)

What is the significance of using such hashing technique? Can you relate this to any reference? Why not something as simple as hash(self.Node + '-' + self.Adjacent) ?

handcraftsman commented 6 years ago

This is a very good question. The pattern you suggest, combine-then-hash, has two potential problems:

You will have to come up with a different pattern for combining each pair or set of objects you want to hash depending on their types.
Should you later change the type of Node or Adjacent, from string to integer for example, you will have to remember to change the hash function too.

The pattern I introduced, hash-then-combine, works with any combination of types that can be hashed. Additionally it eliminates both problems mentioned above. Since each value is hashed independently we do not have to be concerned if the type changes, therefore we also do not have to remember to update the hash function when the type changes.

As for the specific implementation of multiplying the left item (or aggregate item if there are more than 2) by a small prime then XORing the two parts, I learned that pattern more than 10 years ago and it has served me well. It preserves the order of the inputs while also also providing a reasonably good distribution of the hash values. That being said you might find the answers and comments for this StackOverflow question to be interesting.

mdalvi commented 6 years ago

Thank you for the explanation and the reference link :+1:

handcraftsman / GeneticAlgorithmsWithPython

Ch_05: Graph Coloring- Hashing and Uniqueness query #3