closes https://github.com/grafana/pyroscope/issues/3226

Overview

This PR is a POC change that allows requesting estimated label value cardinality alongside label names. This will allow the UI to make substantially fewer requests in order to populate the the label view (see below).

In order for the UI to render the labels bar, it needs to:

Make a /LabelNames request
For every name, make a /LabelValues request
Sort by cardinality (higher goes first)
Render the bar

This is quite expensive in terms of network cost to select the top N labels by cardinality.

Approach

This PR modifies the /LabelNames API to allow a caller to optionally request estimated cardinality:

message LabelNamesRequest {
  repeated string matchers = 1;
  int64 start = 2;
  int64 end = 3;
  optional bool include_cardinality = 4;
}

This would provide the following response:

message LabelNamesResponse {
  repeated string names = 1;
  repeated int64 estimated_cardinality = 2;
}

where names and estimated_cardinality are a 1:1 correspondence (if cardinality is requested). If cardinality is not requested, estimated_cardinality will be empty.

Note that the cardinality returned is an estimate. This PR does not deduplicate merged responses. This is fine, as the UI needs a way to select the top N labels by cardinality to render the label bar. It can make subsequent requests to /LabelValues to find exact counts. This should reduce network calls from 20 or more to ~7.

Performance

I ran three 10 minute load tests, each requesting 24 hours of data for service_name="fire-dev-001/querier.

/LabelNames with no cardinality
/LabelNames with cardinality
/LabelNames with no cardinality followed by N /LabelValues

TODO(bryan): Attach benchmark results

Test	# Reqs	p99 latency
`/LabelNames` with no cardinality	?	?
`/LabelNames` with cardinality	?	?
`/LabelNames` / `/LabelValues`	?	?

Explore Profiles view

Results

As expected, this PR puts more pressure on store gateways fulfilling a /LabelNames requests with cardinality as they need to also look up the label value when scanning postings. However, I think this tradeoff is acceptable as it reduces subsequent network calls the store gateways need to handle. So a /LabelNames request that is 2x more expensive alleviates the need for 15 more cheaper requests.

Alternatives

This PR provides one possible approach, which is to reuse the /LabelNames endpoint. An alternative is to create a brand new endpoint /LabelCardinality. This has the advantage of making performance characteristics much more predictable. E.g. a /LabelNames call with cardinality isn't suddenly much more expensive. We could also tune the endpoint to do the "top N" calculation server-side and provide the exact cardinality instead of an estimate.

A drawback, of course, is yet another API endpoint specifically to power a single component in the UI.

I'm wondering how the workflow changes. If I understand correctly, we will still be querying label values for the highest cardinality labels (the top 7, I believe). That makes me wonder if this approach is helpful in a common case:

Low cardinality label queries, which are supposed to be lightweight and fast, will mostly be eliminated.
High cardinality label queries, which contribute the most to latency, will need to be performed twice.

I'm not sure if the number of requests matters much (unless we're talking about thousands of them, of course). In our case, I think there are two key factors:

The overall latency caused by this component.
The overall CPU consumption from these queries.

I'm curious if we want to display a cardinality estimate of the labels. The current approach of summing values might give us somewhat trustworthy results if the values are normalized for the top/bottom K (though I'm not entirely sure). However, it's definitely not acceptable for display purposes.

I'd suggest using HyperLogLog++. There's a great Go implementation that should fit our use case perfectly: https://github.com/clarkduvall/hyperloglog. If we can tolerate a small skew of a few percent, we could potentially skip subsequent LabelValues calls entirely. Additionally, I'd consider displaying rough indicators instead of exact distinct counts – something like: 10+, 25+, 50+, 100+, 1k+, etc (with exact values for low cardinality labels, see below).

I'd consider adding a new API method.

Long time ago, we've been discussing a method to get summary on dimensions of a dataset, and we even drafted API – here's a slightly modified version:

message DimensionsSummary {
  uint64 cardinality = 1; // number of unique dimension combinations (series).
  uint64 samples = 2; // total number of samples in the time range.
  repeated DimensionSummary dimensions = 3; // distribution of samples across dimensions.
}

message DimensionSummary {
  string name = 1;
  uint64 cardinality = 2; // number of unique dimension combinations that include the dimension (series).
  uint64 samples = 3; // total number of samples in the time range that include the dimension (any value).
  uint64 dimension_cardinality = 4; // total number of distinct dimension values.
  repeated DimensionValueSummary values = 5; // some sample values (e.g., top/bottom K).
}

message DimensionValueSummary {
  string name = 1;
  uint64 cardinality = 2; // number of unique dimension combinations that include the value (series).
  uint64 samples = 3; // total number of samples in the time range that include the value.
}

We can simplify it further:

message LabelsSummary {
  repeated LabelSummary labels = 1;
}

message LabelSummary {
  // number of unique label values.
  uint64 label_cardinality_estimate = 1;
  // some sample label values (e.g., top/bottom K, or all values,
  // if a label only has very few values – say, less than 50).
  repeated string values = 2;
}

Please note that the API is public, so for internal communications, we should use something that handles HLL++. For example:

message LabelsSummary {
  repeated LabelSummary labels = 1;
}

message LabelSummary {
  // Opaque HyperLogLog representation that counts label values.
  bytes label_cardinality_estimate = 1;
  // If present, the summary includes all values: len(values) is the exact cardinality of the label. 
  repeated string values = 2;
}

Just to clarify: shouldn't "top K" actually be "bottom K (distinct)"? I assume we don't want to group by labels with the highest cardinality. This is a tricky point because we need to strike a balance – grouping by low cardinality labels doesn’t make much sense either. It seems like we need to identify the "core" of the multi-dimensional dataset (similar to what PCA or kernel methods do). This could be a very interesting research area but is definitely out of scope for the PR.

I'm not sure if the number of requests matters much

I'd argue the number of requests does matter. It may be easy for our backend to serve these requests, but each additional requests the UI needs to make is extra strain on the client's network and introduces a failure case the UI needs to handle. There definitely is incentive for the UI to minimize the number of requests it makes. We should respect the fact we need to serve clients who may have a slow or unreliable network and minimizing failure cases on the frontend is always a good thing.

I'm curious if we want to display a cardinality estimate of the labels

I need to reword the PR body, but these estimates are for ordering the visualizations and treating extremely high cardinality labels differently in the UI. There's two problems at play here:

We need to display the cardinality of each label in the labels bar.
We need to order the visualizations to match the ordering in the label bar. Relatedly, the UI needs to know general cardinality for labels to handle them in the query builder (i.e. if we have a request_id label with 50k values, we don't want to go ask for all 50k values and display them in the query builder dropdown).

For problem 1, we do need a precise (or pretty precise) number, which is likely out of scope for this PR (but might incidentally be solved, depending on our approach for problem 2). For problem 2, we just need numbers that allow us to order the label cardinalities w.r.t. to each other and to trim labels with cardinalities that are far too high.

I'd consider adding a new API method.

This suggestion is very compelling. I would like to explore this more!

Just to clarify: shouldn't "top K" actually be "bottom K (distinct)"?

Yes, your are correct in this observation; Bottom K (distinct) is a more appropriate description of this operation.

grafana / pyroscope

feat: Optional include label value cardinality in `/LabelNames` #3563

Overview

Approach

Performance

Results

Alternatives