citusdata / postgresql-hll

PostgreSQL extension adding HyperLogLog data structures as a native data type
http://blog.aggregateknowledge.com/2013/02/04/open-source-release-postgresql-hll/
Apache License 2.0
1.13k stars 116 forks source link

Support for adding pre-computed HLL entries into an HLL table #74

Open david-streamlio opened 5 years ago

david-streamlio commented 5 years ago

I have developed a framework that periodically produces several HLL entries using the java-hll library, and am looking for a way to insert them directly into a Postgres table so as to keep a running total, etc.

The java-hll library README has the following entry, but I am unsure how to take the generated hex output and insert it into an existing table:

"Writing an HLL to its hex representation of storage specification, v1.0.0 (for example, to be inserted into a PostgreSQL database):

final byte[] bytes = hll.toBytes();
final String output = "\\x" + NumberUtil.toHex(bytes, 0, bytes.length)

"

Any advise on how to leverage both of these technologies would be greatly appreciated, as it seems that the additive nature of the data structure makes this scenario seem plausible.

david-streamlio commented 4 years ago

Any update on this?

Sieboldianus commented 4 years ago

Not sure about the java implementation, but this should apply independent. Normally, you can simply upsert hll sets in postgres supplying hll as strings and PG will know they're byte hlls:

INSERT INTO temp."counts" (
    id, hll_set)
VALUES ('1', '\x128b7f228fc76c29e48bb75004ce75b7d0a3a75927ce0250136be3'),
       ('2', '\x128b7f03a50af9a15d595156ea58575fd8a25d5af6fcf4119f7d2b')
ON CONFLICT (id)
DO UPDATE SET
    hll_set = COALESCE(hll_union(EXCLUDED.hll_set,
        temp."counts".hll_set), hll_empty());