Open snleee opened 1 year ago
+1
I am hoping this would help us generate HLL / sketches during ingestion (from raw data) and roll up at the same time.
+1
Would this apply to both offline and real-time ingestion? With regard to theta sketches, they take an additional parameter that controls the number of retained entries, which ultimately affects both size and accuracy. This might be worth taking into consideration as an argument to the transform function.
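To make the size/accuracy trade-off concrete, here is a minimal sketch of the idea in Python. It is a simplified K-Minimum-Values estimator (the idea theta sketches build on), not the Apache DataSketches implementation and not any existing Pinot API; the class `KmvSketch` and its parameter `k` are hypothetical stand-ins for the "number of retained entries" setting.

```python
import hashlib
import heapq


def hash01(value):
    """Map a string to a pseudo-uniform float in [0, 1)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


class KmvSketch:
    """Toy K-Minimum-Values sketch: keeps only the k smallest hashes seen."""

    def __init__(self, k):
        self.k = k            # hypothetical "retained entries" setting
        self._heap = []       # max-heap of retained hashes (stored negated)
        self._members = set() # retained hashes, for duplicate detection

    def update(self, value):
        h = hash01(value)
        if h in self._members:
            return  # duplicates never change the sketch
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, -h)
            self._members.add(h)
        elif h < -self._heap[0]:
            # New hash is smaller than the current k-th smallest: swap it in.
            evicted = -heapq.heappushpop(self._heap, -h)
            self._members.discard(evicted)
            self._members.add(h)

    def retained(self):
        return len(self._heap)

    def estimate(self):
        if len(self._heap) < self.k:
            return float(len(self._heap))  # still exact below k entries
        kth_smallest = -self._heap[0]
        return (self.k - 1) / kth_smallest


if __name__ == "__main__":
    true_count = 50_000
    for k in (64, 4096):
        sketch = KmvSketch(k)
        for i in range(true_count):
            sketch.update(f"user-{i}")
        print(k, sketch.retained(), round(sketch.estimate()))
```

The sketch never stores more than `k` hashes, so `k` bounds its size, and the relative error of the estimate shrinks roughly as `1 / sqrt(k)`. That is why exposing the setting as an argument to the transform function seems worthwhile.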
It would be nice to have a data type abstraction associated with metrics. That way, users could create additional data types and know exactly which functions are required to support them through the stack.
@mayankshriv have you got any links to PRs that have introduced similar features?
What we would like to do is create a sketch metric from string dimensions on ingestion and reduce the number of rows stored by orders of magnitude.
It would be handy to have something like the following:
toHLL, toThetaSketch
These will need some inputs for sketch configuration.
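As a rough illustration of what such a transform plus roll-up could look like, here is a small Python sketch. Everything in it is an assumption made for illustration: `to_theta_sketch`, `merge`, `estimate`, `roll_up`, and the `nominal_entries` argument are hypothetical names, the sketch is a toy K-Minimum-Values structure rather than a real theta sketch, and none of this is an existing Pinot function.

```python
import hashlib

MAX_HASH = 2**64


def _hash64(value):
    """Hash a string to an integer in [0, 2**64)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def to_theta_sketch(value, nominal_entries=4096):
    """Hypothetical transform: wrap one raw value in a single-entry sketch."""
    return {"k": nominal_entries, "hashes": [_hash64(value)]}


def merge(a, b):
    """Union two sketches, keeping only the k smallest distinct hashes."""
    k = min(a["k"], b["k"])
    hashes = sorted(set(a["hashes"]) | set(b["hashes"]))[:k]
    return {"k": k, "hashes": hashes}


def estimate(sketch):
    """Distinct-count estimate; exact while fewer than k hashes are retained."""
    hashes = sketch["hashes"]
    if len(hashes) < sketch["k"]:
        return float(len(hashes))
    return (sketch["k"] - 1) * MAX_HASH / hashes[-1]


def roll_up(rows, nominal_entries=4096):
    """Collapse (country, day, user_id) rows into one sketch per (country, day)."""
    rolled = {}
    for country, day, user_id in rows:
        key = (country, day)
        sketch = to_theta_sketch(user_id, nominal_entries)
        rolled[key] = merge(rolled[key], sketch) if key in rolled else sketch
    return rolled


rows = [
    ("US", "2021-01-01", "alice"),
    ("US", "2021-01-01", "bob"),
    ("US", "2021-01-01", "alice"),  # repeat visitor, must not be double counted
    ("DE", "2021-01-01", "carol"),
]
rolled = roll_up(rows)
for key, sketch in sorted(rolled.items()):
    print(key, estimate(sketch))
```

Four raw rows become two stored rows, one per dimension combination, and the string `user_id` dimension is replaced by a mergeable sketch metric. Because merging is a plain union, the same roll-up could in principle run at ingestion time and again later across segments.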