cocreature / thrill

Thrill - An EXPERIMENTAL Algorithmic Distributed Big Data Batch Processing Framework in C++
http://project-thrill.org

Bias correction #4

Closed cocreature closed 7 years ago

cocreature commented 7 years ago

This is the second improvement suggested in the Google paper. The basic idea is to lower the threshold up to which LinearCounting is used and to introduce a new range in which a bias is subtracted from the raw estimate. The thresholds and the bias data used by Google can be found in the paper's appendix. It probably makes sense to simply reuse these values for now and, if we have the time, calculate our own later.
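
A rough sketch of how the corrected estimate could be computed, to make the idea concrete. None of this is existing Thrill code: `BiasTable`, `EstimateBias`, `LinearCounting` and `CorrectedEstimate` are hypothetical names, and the threshold and bias data are passed in by the caller rather than hard-coded, since the real values would come from the paper's appendix. The `5 * m` cutoff for applying the bias follows my reading of the paper's algorithm. The bias lookup here uses plain linear interpolation between neighbouring data points as a simplification; the paper uses a nearest-neighbour interpolation instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container for the empirical bias data of one precision.
// raw_estimates must be sorted ascending; biases[i] is the bias observed
// at raw_estimates[i]. Real values would be taken from the paper's appendix.
struct BiasTable {
    std::vector<double> raw_estimates;
    std::vector<double> biases;
};

// Look up the bias for a raw estimate by linear interpolation between the
// two neighbouring data points (a simplification, see the note above).
// Values outside the table are clamped to the first or last bias.
double EstimateBias(const BiasTable& table, double raw) {
    assert(!table.raw_estimates.empty());
    assert(table.raw_estimates.size() == table.biases.size());
    if (raw <= table.raw_estimates.front()) return table.biases.front();
    if (raw >= table.raw_estimates.back()) return table.biases.back();
    // First data point strictly greater than raw; its predecessor is <= raw.
    auto upper = std::upper_bound(table.raw_estimates.begin(),
                                  table.raw_estimates.end(), raw);
    std::size_t i = static_cast<std::size_t>(upper - table.raw_estimates.begin());
    double x0 = table.raw_estimates[i - 1];
    double x1 = table.raw_estimates[i];
    double weight = (raw - x0) / (x1 - x0);
    return table.biases[i - 1] + weight * (table.biases[i] - table.biases[i - 1]);
}

// Linear counting estimate for m registers of which `zeros` are still empty.
// Requires zeros > 0.
double LinearCounting(std::size_t m, std::size_t zeros) {
    assert(zeros > 0);
    return static_cast<double>(m) *
           std::log(static_cast<double>(m) / static_cast<double>(zeros));
}

// Combine linear counting and bias correction:
//  - subtract the bias from the raw estimate while it is at most 5 * m,
//  - use linear counting if there are empty registers and its result
//    stays at or below the threshold,
//  - otherwise return the (possibly bias-corrected) raw estimate.
double CorrectedEstimate(double raw, std::size_t m, std::size_t zeros,
                         double threshold, const BiasTable& table) {
    double corrected = raw <= 5.0 * static_cast<double>(m)
                           ? raw - EstimateBias(table, raw)
                           : raw;
    double candidate = zeros != 0 ? LinearCounting(m, zeros) : corrected;
    return candidate <= threshold ? candidate : corrected;
}
```

With a table per precision, the existing estimation code would only need to call `CorrectedEstimate` with the raw HyperLogLog estimate, the register count, and the number of empty registers. Swapping in our own bias data later would then be a matter of replacing the table contents.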