cocreature / thrill

Thrill - An EXPERIMENTAL Algorithmic Distributed Big Data Batch Processing Framework in C++
http://project-thrill.org

64bit hashes #3

Closed cocreature closed 7 years ago

cocreature commented 7 years ago

This is the first improvement suggested in the Google paper. It should be easy to implement, since we already get a 64-bit hash from our hash function. Once this is implemented, we can also drop the large-range correction.
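A minimal sketch of what the change could look like, assuming this refers to a HyperLogLog-style estimator with precision `p`: the top `p` bits of the 64-bit hash select a register, and the rank is taken from the remaining `64 - p` bits. The names `IndexAndRank`, `SplitHash64`, and `UpdateRegister` are hypothetical and not taken from the existing code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type: register index plus rank of the remaining bits.
struct IndexAndRank {
    std::size_t index;
    std::uint8_t rank;
};

// Split a 64-bit hash for precision p (assumed 1 <= p <= 32 here).
// index = top p bits; rank = leading zeros in the other 64 - p bits, plus 1.
inline IndexAndRank SplitHash64(std::uint64_t hash, unsigned p) {
    const std::size_t index = static_cast<std::size_t>(hash >> (64 - p));
    std::uint64_t rest = hash << p;  // remaining bits, left-aligned
    const unsigned remaining = 64 - p;
    std::uint8_t rank = 1;
    // Walk over leading zeros; an all-zero remainder yields rank 64 - p + 1.
    while (rank <= remaining && (rest & (std::uint64_t{1} << 63)) == 0) {
        ++rank;
        rest <<= 1;
    }
    return {index, rank};
}

// Keep the maximum rank seen per register (registers.size() must be 2^p).
inline void UpdateRegister(std::vector<std::uint8_t>& registers,
                           std::uint64_t hash, unsigned p) {
    const IndexAndRank ir = SplitHash64(hash, p);
    if (ir.rank > registers[ir.index]) {
        registers[ir.index] = ir.rank;
    }
}
```

With 64-bit hashes the rank can reach `64 - p + 1`, so each register needs 6 bits rather than 5; a `std::uint8_t` per register is used here only for simplicity.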

@TiFu Do you want to take this? It might be a good way to familiarize yourself with the code I’ve written so far.

TiFu commented 7 years ago

It seems this improves the error rate for large values of p but increases it for small values of p.