Open asfimport opened 3 years ago
Dhruv Vats / @dhruv9vats: Is there still interest in this? If so, I'd be happy to give this a go.
Also, this will go into hash_aggregate, right? And could be named something like hash_count_distinct_estimate or hash_count_distinct_hll?
ZMZ91 / @ZMZ91: Sure. We'd like to have a hash_count_distinct_hll for approximate results in many real-world cases.
take
A version of the count distinct aggregation kernel that uses HyperLogLog may be useful.
Note: The implementation should support the merge operator.
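To make the merge requirement concrete, here is a rough Python sketch of a HyperLogLog counter with a merge step. It is only an illustration, not a proposal for the Arrow kernel: the `HyperLogLog` class, the `p` precision parameter, and hashing `repr(value)` with BLAKE2b are all assumptions made for this example. The point is that merging two sketches is just an element-wise max over their registers, which is what lets partial aggregates from separate batches or groups be combined.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog sketch (hypothetical, not Arrow's API)."""

    def __init__(self, p=12):
        # p index bits -> m = 2**p registers; p >= 7 keeps the alpha formula valid.
        if not 7 <= p <= 16:
            raise ValueError("p must be between 7 and 16")
        self.p = p
        self.m = 1 << p
        self.registers = bytearray(self.m)

    @staticmethod
    def _hash64(value):
        # Assumption: hash repr(value) to 64 bits; a real kernel would hash raw bytes.
        digest = hashlib.blake2b(repr(value).encode("utf-8"), digest_size=8).digest()
        return int.from_bytes(digest, "big")

    def add(self, value):
        h = self._hash64(value)
        width = 64 - self.p
        idx = h >> width                      # top p bits select the register
        rest = h & ((1 << width) - 1)         # remaining bits feed the rank
        rank = width - rest.bit_length() + 1  # 1-based position of the first 1 bit
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def merge(self, other):
        # Merging is an element-wise max, so it is associative and commutative.
        if self.p != other.p:
            raise ValueError("cannot merge sketches with different precision")
        self.registers = bytearray(
            max(a, b) for a, b in zip(self.registers, other.registers)
        )

    def estimate(self):
        m = self.m
        alpha = 0.7213 / (1.0 + 1.079 / m)    # bias constant for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros > 0:
            # Small-range correction: fall back to linear counting.
            return m * math.log(m / zeros)
        return raw


if __name__ == "__main__":
    left, right = HyperLogLog(), HyperLogLog()
    for i in range(10_000):
        left.add(i)
    for i in range(5_000, 15_000):
        right.add(i)
    left.merge(right)
    print(round(left.estimate()))  # approximate count of the 15,000 distinct values
```

With `p=12` the expected relative standard error is about 1.04 / sqrt(4096), or roughly 1.6%. Merging two sketches gives the same registers as building one sketch over the union of the inputs.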
cc @ianmcook @lidavidm
Some resources/links: http://algo.inria.fr/flajolet/Publications/FlFuGaMe07.pdf https://engineering.fb.com/2018/12/13/data-infrastructure/hyperloglog/ https://github.com/facebookincubator/velox/tree/main/velox/aggregates/hyperloglog
Reporter: Percy Camilo Triveño Aucahuasi / @aucahuasi
Note: This issue was originally created as ARROW-14158. Please see the migration documentation for further details.