Closed (cjrh closed 2 years ago)
Yes, the theta_sketch_add_item function is an internal function. It is a state transition function that is needed to define an aggregate function. Usually sketches are built from raw data for particular periods of time and particular combinations of dimensions. Those sketches become the base table (hypercube). The base table is then queried for a particular reporting period with a subset of dimensions, and the answer is the union of the matching sketches.
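To make the base-table pattern concrete, here is a minimal Python sketch of the idea. It is an illustration only: an exact Python set stands in for a theta sketch (a real sketch is approximate and bounded in size), and the event data and the names build_base_table and distinct_users are hypothetical, not part of the extension.

```python
from collections import defaultdict

# Hypothetical raw events: (day, country, user_id).
raw_events = [
    ("2024-01-01", "US", "u1"),
    ("2024-01-01", "US", "u2"),
    ("2024-01-01", "DE", "u3"),
    ("2024-01-02", "US", "u1"),
    ("2024-01-02", "DE", "u3"),
    ("2024-01-02", "DE", "u4"),
]


def build_base_table(events):
    """Build one 'sketch' per (period, dimension) cell from raw data.

    Assumption: a plain set is used in place of a theta sketch.
    """
    base = defaultdict(set)
    for day, country, user in events:
        base[(day, country)].add(user)
    return dict(base)


def distinct_users(base, days, countries=None):
    """Answer a report query by taking the union of the matching cells."""
    merged = set()
    for (day, country), sketch in base.items():
        if day in days and (countries is None or country in countries):
            merged |= sketch  # set union stands in for a sketch union
    return len(merged)


base_table = build_base_table(raw_events)
```

A report over both days is then distinct_users(base_table, {"2024-01-01", "2024-01-02"}), and restricting to a subset of dimensions only changes which cells are merged. No items are ever added to a stored sketch one at a time.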
Thank you. I'll rethink the way I'm using it. My interface is an HTTP API that receives events, and I'm adding those events directly to existing sketches with no intermediate storage. Perhaps I can introduce a buffer layer that first accumulates events into sketches and then merges those intermediate sketches into the existing ones.
Sketches are meant to help with processing big data. Building a sketch for one record, deserializing an existing sketch, performing a union, and serializing the result adds a lot of overhead, which seems counterproductive. I would suggest collecting raw data for some period of time (say, an hour). When the hour closes, produce an aggregated segment.
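A rough Python sketch of that hourly buffering could look like the following. It rests on stated assumptions: timestamps are integer epoch seconds, a Python set again stands in for a theta sketch, and the HourlyBuffer class and its method names are hypothetical.

```python
from collections import defaultdict

SECONDS_PER_HOUR = 3600


class HourlyBuffer:
    """Collect raw items per hour, then aggregate once when the hour closes."""

    def __init__(self):
        self.pending = defaultdict(list)  # hour start -> raw items
        self.segments = {}                # hour start -> set (sketch stand-in)

    def add_event(self, timestamp, item):
        # Cheap append only: no sketch is built or deserialized per event.
        hour = timestamp - timestamp % SECONDS_PER_HOUR
        self.pending[hour].append(item)

    def close_hour(self, hour):
        # Build one aggregated segment from everything buffered for the hour.
        items = self.pending.pop(hour, [])
        segment = set(items)
        # Merge with any segment already stored for this hour (late data).
        self.segments[hour] = self.segments.get(hour, set()) | segment
        return len(segment)

    def estimate(self, start, end):
        # Union the stored segments whose hour falls in [start, end).
        merged = set()
        for hour, segment in self.segments.items():
            if start <= hour < end:
                merged |= segment
        return len(merged)
```

The per-event path stays a simple append, and the expensive build, union, and serialize work happens once per hour rather than once per record.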
After creating the extension in Postgres, the routine theta_sketch_add_item exists, but it isn't usable from client SQL (the signature says internal). What is the correct way to add an item to an existing theta sketch? Currently I'm doing this: