thanos-io / thanos

Highly available Prometheus setup with long term storage capabilities. A CNCF Incubating project.
https://thanos.io
Apache License 2.0

Implement projection pushdown #5832

Open fpetkovski opened 2 years ago

fpetkovski commented 2 years ago

Is your proposal related to a problem?

When selecting series from stores, the querier receives all labels from all series for the given matchers. Most PromQL expressions include one or more aggregations which discard most labels. Because of this, both stores and queriers waste a lot of resources marshaling, transferring and unmarshaling data that is not used in queries.

Describe the solution you'd like

We can implement projection pushdown so that a querier can request only certain labels from series. After selecting series from storage, stores would prune labels that are not necessary for a query before returning the result.
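To make the idea concrete, here is a minimal sketch of the pruning step a store could run after selecting series. The `Label` type and the `ProjectLabels` function are hypothetical names for illustration only and are not part of the Thanos or Prometheus API.

```go
package main

import "sort"

// Label is a hypothetical stand-in for a Prometheus label pair.
type Label struct {
	Name, Value string
}

// ProjectLabels returns only the labels whose names appear in keep,
// sorted by label name. The input slice is not modified.
func ProjectLabels(series []Label, keep []string) []Label {
	wanted := make(map[string]struct{}, len(keep))
	for _, name := range keep {
		wanted[name] = struct{}{}
	}

	out := make([]Label, 0, len(keep))
	for _, l := range series {
		if _, ok := wanted[l.Name]; ok {
			out = append(out, l)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}
```

For a query that groups by `pod` and `job`, a store would call `ProjectLabels(lset, []string{"pod", "job"})` on each selected series and send back only those two labels.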

Describe alternatives you've considered

Query pushdown is an alternative, but the two optimizations seem complementary.

Additional context

Deduplication can be problematic since series will not be unique after unused labels are pruned. To solve this, we can attach a special label with the hash of the series labels (excluding replica labels), similar to how query pushdown markers are used. Queriers would then always use this special label for deduplication so that it does not leak to the end user.
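A rough sketch of the store-side half of this idea follows. It hashes the full label set without replica labels and attaches the result as an extra label before pruning. The label name `__series_hash__`, the FNV-1a hash, and all function names are assumptions made for illustration; the proposal does not specify any of them.

```go
package main

import (
	"hash/fnv"
	"sort"
	"strconv"
)

// Label is a hypothetical stand-in for a Prometheus label pair.
type Label struct {
	Name, Value string
}

// HashLabel is a hypothetical name for the special deduplication label.
const HashLabel = "__series_hash__"

// SeriesHash hashes all labels except replica labels, in sorted order,
// so that replicas of the same series produce the same value.
func SeriesHash(lset []Label, replicaLabels map[string]struct{}) uint64 {
	sorted := make([]Label, 0, len(lset))
	for _, l := range lset {
		if _, isReplica := replicaLabels[l.Name]; !isReplica {
			sorted = append(sorted, l)
		}
	}
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Name < sorted[j].Name })

	h := fnv.New64a()
	for _, l := range sorted {
		// 0xff cannot appear in valid UTF-8, so it is a safe separator.
		h.Write([]byte(l.Name))
		h.Write([]byte{0xff})
		h.Write([]byte(l.Value))
		h.Write([]byte{0xff})
	}
	return h.Sum64()
}

// PruneWithHash keeps only the labels named in keep and appends the
// hash of the original (unpruned, replica-free) label set.
func PruneWithHash(lset []Label, keep, replicaLabels map[string]struct{}) []Label {
	hash := SeriesHash(lset, replicaLabels)

	out := make([]Label, 0, len(keep)+1)
	for _, l := range lset {
		if _, ok := keep[l.Name]; ok {
			out = append(out, l)
		}
	}
	return append(out, Label{Name: HashLabel, Value: strconv.FormatUint(hash, 16)})
}
```

Two replicas of one series end up with identical pruned labels and an identical hash label, while two distinct series that collapse to the same pruned labels still carry different hashes.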

We first might need to resolve https://github.com/thanos-io/thanos/pull/5796 so that stores get access to replica labels.

yeya24 commented 2 years ago

This sounds like a very interesting idea! Can we use Grouping from the select hints to let the store layer know which labels we need? https://github.com/thanos-io/thanos/blob/main/pkg/query/querier.go#L262 However, hints are currently only enabled together with query pushdown, so we would need to always enable them if we want to do this optimization.

fpetkovski commented 2 years ago

We have to see how the engine sets those hints. I think that for something like sum by (pod) (rate..) we will not propagate the grouping labels, because the engine only looks at the closest function or aggregation, which in this case is rate. But maybe the hints API is flexible enough that we can use it as it suits us, especially from the new engine.
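As an illustration of what a more flexible use of hints might look like, the sketch below walks a toy expression tree and carries the grouping labels of the closest enclosing aggregation down through label-preserving functions such as rate. The AST types and `ProjectionFor` are entirely hypothetical; they mirror neither the Prometheus parser nor either query engine, and only single-selector expressions are handled.

```go
package main

// Node is a hypothetical, heavily simplified PromQL expression node.
type Node interface{}

// Selector is a vector or matrix selector for one metric.
type Selector struct{ Metric string }

// Call is a function call with a single argument, e.g. rate(...).
type Call struct {
	Func string
	Arg  Node
}

// Aggregate is an aggregation such as sum by (pod) (...).
type Aggregate struct {
	Op      string
	By      []string
	Without bool // true for "without (...)" instead of "by (...)"
	Arg     Node
}

// labelPreserving lists functions assumed to keep all labels other
// than the metric name, so grouping labels can pass through them.
var labelPreserving = map[string]bool{"rate": true, "irate": true, "increase": true}

// ProjectionFor returns the labels the selector must return, and
// false when no safe projection can be derived.
func ProjectionFor(n Node) ([]string, bool) {
	return walk(n, nil, false)
}

func walk(n Node, keep []string, have bool) ([]string, bool) {
	switch e := n.(type) {
	case Aggregate:
		if e.Without {
			// "without" needs every label except the listed ones.
			return walk(e.Arg, nil, false)
		}
		// The aggregation closest to the selector decides.
		return walk(e.Arg, e.By, true)
	case Call:
		if labelPreserving[e.Func] {
			return walk(e.Arg, keep, have)
		}
		// Unknown functions may read arbitrary labels.
		return walk(e.Arg, nil, false)
	case Selector:
		return keep, have
	default:
		return nil, false
	}
}
```

With this sketch, sum by (pod) over a rate of a selector yields the projection `[pod]`, whereas a bare rate with no enclosing aggregation yields no projection at all.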

yeya24 commented 2 years ago

Makes sense to me. We could try the hints here first https://github.com/thanos-io/thanos/blob/main/pkg/store/storepb/rpc.proto#L108

yeya24 commented 2 years ago

> Queriers would then always use this special label for deduplication so that it does not leak to the end user.

By the way, is this possible? If we are doing aggregations, it seems that the special label will always be dropped by the query engine.

fpetkovski commented 2 years ago

Yes, but it needs to be dropped before it reaches the engine so that the data is deduplicated. Otherwise the engine would see 2x or 3x more series and produce higher numbers. I think we have a similar label for query pushdown that @GiedriusS introduced.
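The ordering described here could be sketched on the querier side roughly as follows: series are grouped by the special label, duplicates are dropped, and the label is stripped before anything reaches the engine. The label name `__series_hash__`, the `Series` type, and `DedupAndStrip` are hypothetical, and keeping the first replica is a simplification; actual replica deduplication in Thanos merges samples rather than discarding whole series.

```go
package main

// Label is a hypothetical stand-in for a Prometheus label pair.
type Label struct {
	Name, Value string
}

// Series is a hypothetical series with already-pruned labels.
type Series struct {
	Labels  []Label
	Samples []float64
}

// HashLabel is a hypothetical name for the special deduplication label.
const HashLabel = "__series_hash__"

// splitHash separates the hash label value from the remaining labels.
func splitHash(lset []Label) (hash string, rest []Label) {
	rest = make([]Label, 0, len(lset))
	for _, l := range lset {
		if l.Name == HashLabel {
			hash = l.Value
			continue
		}
		rest = append(rest, l)
	}
	return hash, rest
}

// DedupAndStrip keeps the first series seen for each hash and removes
// the hash label, so the engine never sees replicas or the marker.
func DedupAndStrip(in []Series) []Series {
	seen := make(map[string]struct{}, len(in))
	out := make([]Series, 0, len(in))
	for _, s := range in {
		hash, rest := splitHash(s.Labels)
		if hash != "" {
			if _, dup := seen[hash]; dup {
				continue // replica of a series we already kept
			}
			seen[hash] = struct{}{}
		}
		out = append(out, Series{Labels: rest, Samples: s.Samples})
	}
	return out
}
```

Note that two distinct series with the same pruned labels both survive this step, which is intended: an aggregation such as sum must still count each of them once.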