Open tillrohrmann opened 1 year ago
Using a good hash function such as SHA-256 truncated to 64 bits is common practice and has the following benefits:
Specifically about distribution:
- storage_query: a partition key is used as a low watermark in a symmetric hash join.
- awakables: we use the partition key for prefix scans to find an awakable with a specific invocation_id.

From a quick survey, Murmur is commonly used (Flink, Cassandra), and judging by SMHasher, Murmur and xxhash (which we currently use) are comparable.
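
To make the prefix-scan use of the partition key concrete, here is a minimal sketch over an in-memory sorted key list. The key layout (8-byte big-endian partition key, then a fixed-length 16-byte invocation id, then a 4-byte awakable index) and the names `awakable_key` and `prefix_scan` are assumptions for illustration only, not the actual storage schema.

```python
import bisect
import struct


def awakable_key(partition_key: int, invocation_id: bytes, awakable_index: int) -> bytes:
    """Build a hypothetical storage key for an awakable.

    Assumed layout: 8-byte big-endian partition key, then a 16-byte
    invocation id, then a 4-byte big-endian awakable index.
    """
    if len(invocation_id) != 16:
        raise ValueError("this sketch assumes 16-byte invocation ids")
    return struct.pack(">Q", partition_key) + invocation_id + struct.pack(">I", awakable_index)


def prefix_scan(sorted_keys: list, prefix: bytes) -> list:
    """Return all keys that start with `prefix` from a lexicographically sorted list."""
    # Seek to the first key that is >= prefix, then read until the prefix stops matching.
    start = bisect.bisect_left(sorted_keys, prefix)
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break
        matches.append(key)
    return matches
```

Because the partition key leads the storage key, all awakables of one invocation sit next to each other in sort order, so a scan with the prefix `struct.pack(">Q", partition_key) + invocation_id` touches only that invocation's entries.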
We should consider replacing our current hash partitioning algorithm (xxhash) with a cryptographic one, so that we do not leak internal information and are not subject to targeted attacks via key/ID forging.
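
As a rough illustration of the proposal, the following sketch derives a 64-bit partition key from SHA-256 by keeping the first 8 bytes of the digest. The function names `partition_key` and `partition_for`, the big-endian interpretation, and the mapping onto contiguous key ranges are assumptions made for this example, not the project's implementation.

```python
import hashlib


def partition_key(key: bytes) -> int:
    """Return the first 8 bytes of SHA-256(key) as an unsigned 64-bit integer."""
    digest = hashlib.sha256(key).digest()
    # Truncate the 256-bit digest to 64 bits (big-endian is an arbitrary choice here).
    return int.from_bytes(digest[:8], "big")


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map the 64-bit partition key onto `num_partitions` contiguous key ranges.

    Hypothetical helper: multiplying by the partition count and shifting by 64
    splits the 64-bit key space into equally sized ranges.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return (partition_key(key) * num_partitions) >> 64
```

For example, `partition_for(b"my-service/my-key", 24)` always returns the same value in the range 0 to 23 for the same input.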