cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

sql: HyperLogLog native data type support #20430

Open bra-fsn opened 6 years ago

bra-fsn commented 6 years ago

I would like to request HyperLogLog (HLL) support as a native data type, or support for one of its useful variants such as HLL++ or LogLog-Beta. For a possible implementation (for Postgres), see: https://github.com/aggregateknowledge/postgresql-hll

Jira issue: CRDB-5930

jseldess commented 6 years ago

Thanks for the suggestion, @bra-fsn. I'll loop in @awoods187 from our Product team. In the meantime, can you provide more details about potential use cases for this data type with CockroachDB?

bra-fsn commented 6 years ago

Sure. I have some entities for which I want to track cardinality (often several different cardinalities per entity), and I need fast access to those numbers without running a distinct count over a field or keeping the individual values around. For more specific examples:

Thanks for considering.
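
For readers unfamiliar with the technique, here is a minimal sketch of the kind of structure being requested: a fixed-size set of registers that estimates a distinct count without storing the individual values, and that can be merged with other sketches. It uses the basic HyperLogLog estimator with the small-range (linear counting) correction, not HLL++ or LogLog-Beta. The class name `HyperLogLog`, the precision parameter `p`, and the use of SHA-1 as the hash are illustrative assumptions; none of this is CockroachDB's or postgresql-hll's API.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog sketch (basic estimator, hypothetical API)."""

    def __init__(self, p: int = 12) -> None:
        # p index bits give m = 2**p one-byte registers (4 KiB for p = 12).
        if not 7 <= p <= 16:
            raise ValueError("p must be between 7 and 16")
        self.p = p
        self.m = 1 << p
        self.registers = bytearray(self.m)

    @staticmethod
    def _hash64(value: str) -> int:
        # Assumption: any well-mixed 64-bit hash works; SHA-1 is used for simplicity.
        digest = hashlib.sha1(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, value: str) -> None:
        h = self._hash64(value)
        width = 64 - self.p
        idx = h >> width                    # top p bits select a register
        rest = h & ((1 << width) - 1)       # remaining bits feed the rank
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = width - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def merge(self, other: "HyperLogLog") -> "HyperLogLog":
        # The union of two sketches is the register-wise maximum.
        if other.p != self.p:
            raise ValueError("cannot merge sketches with different precision")
        merged = HyperLogLog(self.p)
        merged.registers = bytearray(
            max(a, b) for a, b in zip(self.registers, other.registers)
        )
        return merged

    def count(self) -> float:
        m = self.m
        alpha = 0.7213 / (1.0 + 1.079 / m)  # bias constant, valid for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            # Small-range correction: fall back to linear counting.
            return m * math.log(m / zeros)
        return raw


if __name__ == "__main__":
    visitors = HyperLogLog()
    for i in range(10_000):
        visitors.add(f"user-{i}")
    print(round(visitors.count()))  # an estimate near 10000, not an exact count
```

With `p = 12` the sketch occupies 4096 bytes regardless of how many values are added, and the expected relative error is roughly 1.04 / sqrt(4096), or about 1.6%. The `merge` operation is what makes a native type attractive: per-entity or per-day sketches could be stored in rows and combined at query time.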

github-actions[bot] commented 3 years ago

We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to CockroachDB!

awoods187 commented 3 years ago

This is still an open issue.

danthegoodman1 commented 2 years ago

I would love to see this!

josephschorr commented 2 years ago

In SpiceDB (https://github.com/authzed/spicedb) we'd like to be able to get the estimated count of relationships matching certain criteria; as all our relationships are stored in a single sharded and shared table, we cannot make use of SHOW STATISTICS [USING JSON] FOR TABLE <table_name>.

Having a HyperLogLog data type OR the ability to get an estimated row count of an index would be super helpful.

RaduBerinde commented 2 years ago

CC @vy-ton

kant777 commented 2 years ago

+1 to a HyperLogLog function

gmcquillan commented 1 year ago

Support for this feature would make it easier for folks who are using Postgres for segmentation-type problems, but who don't yet need a full-blown Elasticsearch cluster. Right now, with no support, there's no bridge to CockroachDB, and that's a shame.

michae2 commented 7 months ago

This also might help us get closer to https://github.com/cockroachdb/cockroach/issues/41203