apache / datasketches-cpp

Core C++ Sketch Library
https://datasketches.apache.org
Apache License 2.0

Examples of Tuple Sketches #296

Closed by MayankKr8 2 years ago

MayankKr8 commented 2 years ago

Are there any online code examples for theta sketches in Python? Also, can I use a theta sketch to store precomputed aggregates for machine learning models?

jmalkin commented 2 years ago

We don't currently offer tuple sketch support in Python. Unless the request is just for the Array of Doubles sketch (which I think we already provide in C++, and which is probably not too bad), we haven't yet worked out how to define a Summary, and the rules for combining Summaries, via Python. Or how to deserialize these things.
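To illustrate what "a Summary and rules for how to combine them" means, here is a minimal pure-Python sketch of the idea. It does not use the datasketches package or its API: `SumPolicy`, `ToyTupleSketch`, and every method name below are hypothetical, and the KMV-style retention logic is a simplified stand-in rather than the library's implementation.

```python
import hashlib


class SumPolicy:
    """Hypothetical summary policy: the summary is a running float total."""

    def create(self):
        return 0.0

    def update(self, summary, value):
        return summary + value

    def merge(self, left, right):
        return left + right


class ToyTupleSketch:
    """Toy tuple sketch: keeps the k smallest key hashes, each with a summary."""

    def __init__(self, policy, k=1024):
        self.policy = policy
        self.k = k
        self.theta = 1.0   # only hashes below theta are retained
        self.entries = {}  # hash in [0, 1) -> summary

    @staticmethod
    def _hash(key):
        digest = hashlib.sha256(str(key).encode("utf-8")).digest()
        # Use 53 bits so the result is an exact float in [0, 1).
        return (int.from_bytes(digest[:8], "big") >> 11) / 2**53

    def _trim(self):
        # If over capacity, theta becomes the (k+1)-th smallest hash
        # and only the k entries below it are kept.
        if len(self.entries) > self.k:
            ordered = sorted(self.entries)
            self.theta = ordered[self.k]
            for h in ordered[self.k:]:
                del self.entries[h]

    def update(self, key, value):
        h = self._hash(key)
        if h >= self.theta:
            return
        current = self.entries.get(h)
        if current is None:
            current = self.policy.create()
        self.entries[h] = self.policy.update(current, value)
        self._trim()

    def merge(self, other):
        # Union: matching keys have their summaries combined by the policy.
        result = ToyTupleSketch(self.policy, self.k)
        result.theta = min(self.theta, other.theta)
        for source in (self, other):
            for h, summary in source.entries.items():
                if h >= result.theta:
                    continue
                if h in result.entries:
                    result.entries[h] = self.policy.merge(result.entries[h], summary)
                else:
                    result.entries[h] = summary
        result._trim()
        return result

    def estimate(self):
        # Estimated number of distinct keys seen.
        return len(self.entries) / self.theta
```

The policy object is the part that would have to be definable from Python: it says how a new summary starts, how an incoming value changes it, and how two summaries for the same key are merged during a union.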

I'm playing with the VarOpt sketch, since its only constraint currently is a lack of (de)serialization support in Python. I apparently need to build GDB with Python support so I can investigate a segfault. When I (eventually) get that working, it should provide a blueprint for following a similar process for other sketches. But until then, we have no ETA.

MayankKr8 commented 2 years ago

Thanks.

jmalkin commented 1 year ago

@MayankKr8 We just merged tuple sketches for Python into the main branch. There is no official release yet, so you can't install from PyPI, but you can experiment if you build the package yourself.

jmalkin commented 1 year ago

The official release is (finally) live! v4.1.0 provides this support.