thanos-io / thanos

Highly available Prometheus setup with long term storage capabilities. A CNCF Incubating project.
https://thanos.io
Apache License 2.0
12.73k stars 2.04k forks source link

Limit data bytes fetched per query and per store #5750

Open yeya24 opened 1 year ago

yeya24 commented 1 year ago

Is your proposal related to a problem?

In a multi-tenant environment, Thanos Querier is easy to get OOM killed without any limits/protection.

We have some limits in place now like series limits, but since the series labels have different sizes, probably we might need other ways for limiting.

Limit the data bytes fetched for the whole query or even at each store level could be a way to try.

Describe the solution you'd like

Cortex has a PR https://github.com/cortexproject/cortex/pull/4854 which implements this feature.

douglascamata commented 1 year ago

I think after the work done by @fpetkovski on #6074, this can be closed. WDYT, @yeya24?

yeya24 commented 1 year ago

I see thanks @douglascamata. I think that covers the per store limits. But we still have the per query limits, right? Do we have plan to support that? Also data bytes fetched is another limit for number of limits fetched, different from number of series and chunks limit.

fpetkovski commented 1 year ago

But we still have the per query limits, right?

You are right, the querier will apply the limits only when it acts as a store through gRPC, but not when it executes queries. For that we might need to add limiters to the store proxy.

Could extending the existing wrapper work for adding a bytes limit too?

douglascamata commented 1 year ago

For that we might need to add limiters to the store proxy.

What if we wrap the store proxy with the instrumented store server wrapper? 🤔

Could extending the existing wrapper work for adding a bytes limit too?

I think we could add this one to the instrumented store server and get it for free everywhere. 🙌