OpenTSDB / opentsdb

A scalable, distributed Time Series Database.
http://opentsdb.net
GNU Lesser General Public License v2.1

Is there an efficient way to retrieve a list of unique tag values for a given tagk? #1960

Open benelgiac opened 4 years ago

benelgiac commented 4 years ago

Hi

I'm using OpenTSDB with Grafana a lot but I'm struggling with what seems to be a common problem that doesn't have an efficient solution, at least as per my understanding.

When using a dashboarding tool, it is very useful to be able to populate some UI controls (e.g. drop-down menus) with the list of tag values for a particular tagk (ideally scoped to a metric).

Grafana seems to be using the Lookup API to populate that list. This API retrieves a (limited) list of all possible combinations of tagk/tagv for a specific metric (or for a metric plus tagk=tagv filter, if provided). Unfortunately, for real use cases where you have at least four or five tags, this quickly becomes unusable because the list of returned results becomes huge even for relatively low tagk cardinalities.

Let me clarify with an example, and please correct me if I'm wrong. If I have a metric with three tagk, each with cardinality 100, the lookup API will retrieve all the possible combinations of tagv. A list of size 100 * 100 * 100 = 1e6 will be retrieved for information that would only take three lists of 100 (1e2) entries each.
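The arithmetic in this example can be checked with a short sketch. The function name `lookup_vs_lists` is hypothetical and only illustrates the gap between the worst-case number of tagv combinations (the product of the cardinalities) and the number of values a drop-down actually needs (their sum):

```python
from math import prod


def lookup_vs_lists(cardinalities):
    """Return (worst-case tagv combinations, total unique values across all tagk).

    `cardinalities` holds the number of distinct tagv per tagk.
    This is a hypothetical helper, not part of OpenTSDB or Grafana.
    """
    combinations = prod(cardinalities)   # every tagv combined with every other
    unique_values = sum(cardinalities)   # one flat list per tagk
    return combinations, unique_values


combos, uniques = lookup_vs_lists([100, 100, 100])
print(combos, uniques)  # 1000000 300
```

With three tags of cardinality 100, the worst case is one million combinations against 300 values in total. Adding a fourth tag of the same cardinality multiplies the first number by 100 but only adds 100 to the second.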

Even though this is usually performed only once, with real-world data it quickly becomes a problem in terms of query time, amount of data transferred from the backend, and memory usage, and it just seems unnecessary.

Moreover, you never know if the list you get is complete, unless you know exactly the cardinality of each of your tags and adjust your lookup limit accordingly.
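As a minimal sketch of the client-side workaround described above, the following deduplicates the tag values for one tagk from an already-parsed Lookup response and flags when the list may be incomplete. It assumes the response carries a `results` list whose entries have a `tags` map, plus a `totalResults` count; those field names should be verified against the OpenTSDB version in use. The helper `unique_tag_values` and the sample data are hypothetical.

```python
def unique_tag_values(lookup_response, tagk):
    """Extract the sorted unique values of `tagk` from a parsed lookup response.

    Returns (values, complete). `complete` is False when the response reports
    more matching series than it returned, i.e. the limit truncated the list.
    Field names ("results", "tags", "totalResults") are assumptions.
    """
    results = lookup_response.get("results", [])
    values = sorted(
        {r["tags"][tagk] for r in results if tagk in r.get("tags", {})}
    )
    total = lookup_response.get("totalResults")
    complete = total is not None and total <= len(results)
    return values, complete


# Hypothetical response: 5 series match, but the limit returned only 3.
sample = {
    "totalResults": 5,
    "results": [
        {"metric": "sys.cpu.user", "tags": {"host": "web01", "dc": "lga"}},
        {"metric": "sys.cpu.user", "tags": {"host": "web02", "dc": "lga"}},
        {"metric": "sys.cpu.user", "tags": {"host": "web01", "dc": "sjc"}},
    ],
}

values, complete = unique_tag_values(sample, "host")
print(values, complete)  # ['web01', 'web02'] False
```

This does not avoid the cost of fetching every combination. It only makes the truncation visible, so a dashboard can warn that the drop-down may be missing values instead of silently showing a partial list.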

My question is: is there a better way to do that? Is there something that Grafana could do to leverage some other OpenTSDB API or is this just "the way it is" for now?

Thanks for your time.

manolama commented 3 years ago

Unfortunately it's just "the way it is" for now. The reason being that we don't have any good schema built yet to tie keys to their values. The old Graphite tree code is the closest but that's buggy and inefficient. There is some partial ElasticSearch code that needs completion to handle this use case (needs the meta query API, write path to ES and tweaks to Grafana to support the queries).

3.x has a new meta store we're working on and a meta API that can power these queries. We'll add that support to Grafana when it's solid.

In the meantime, sorry for the pain. I know it sucks :(