Open ufoscw opened 4 years ago
@ufoscw Originally, the "countd" function was converted to a cardinality aggregation. However, due to performance and accuracy issues, I changed it to use the thetaSketch type (#967). So I am wondering: why do you need two types?
In Druid, thetaSketch has an optional parameter (size) that controls accuracy. The "countd" function does not have an optional parameter now, so we need to change the type of countd. We also found out from @navis that the cardinality aggregation is faster than thetaSketch.
Is your feature request related to a problem? Please describe.
Support two types of countd aggregator (including ifcountd). The existing aggregator type should be changed to support two types, "thetaSketch" and "cardinality", selected according to the parameters.
Describe the solution you'd like
cardinality expression: counts([fieldName])

    "aggregations": [
      {
        "type": "cardinality",
        "name": "aggregationfunc_000",
        "fieldNames": [ "hashed_user_id" ],
        "byRow": true
      }
    ],
    "postAggregations": [
      {
        "type": "math",
        "name": "MEASURE_2",
        "expression": "ROUND(aggregationfunc_000)",
        "finalize": true
      }
    ],
thetaSketch expression: countd([fieldName], size)

    "aggregations": [
      {
        "type": "thetaSketch",
        "name": "MEASURE_2",
        "fieldName": "hashed_user_id",
        "size": 20000,
        "shouldFinalize": true
      }
    ],
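
The parameter-based selection proposed above could be sketched roughly as follows. This is only an illustration in Python, not the project's actual implementation: the helper name `build_countd_query_parts` and its arguments are hypothetical, the intermediate name `aggregationfunc_000` is simply copied from the example, and the JSON shapes mirror the two fragments in this issue. It assumes that omitting `size` selects the cardinality aggregation and supplying `size` selects thetaSketch.

```python
from typing import Optional


def build_countd_query_parts(field_name: str,
                             measure_name: str,
                             size: Optional[int] = None) -> dict:
    """Hypothetical helper: choose the aggregator type from the parameters.

    No size   -> "cardinality" aggregation plus a ROUND post-aggregation.
    With size -> "thetaSketch" aggregation carrying that size.
    """
    if size is None:
        # Intermediate aggregation name, copied from the example in this issue.
        agg_name = "aggregationfunc_000"
        return {
            "aggregations": [
                {
                    "type": "cardinality",
                    "name": agg_name,
                    "fieldNames": [field_name],
                    "byRow": True,
                }
            ],
            "postAggregations": [
                {
                    "type": "math",
                    "name": measure_name,
                    "expression": f"ROUND({agg_name})",
                    "finalize": True,
                }
            ],
        }

    # A size was given, so emit the thetaSketch form and pass the size through.
    return {
        "aggregations": [
            {
                "type": "thetaSketch",
                "name": measure_name,
                "fieldName": field_name,
                "size": size,
                "shouldFinalize": True,
            }
        ],
        "postAggregations": [],
    }
```

For example, `build_countd_query_parts("hashed_user_id", "MEASURE_2")` would yield the cardinality form, while `build_countd_query_parts("hashed_user_id", "MEASURE_2", size=20000)` would yield the thetaSketch form.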