grafana / mimir

Grafana Mimir provides horizontally scalable, highly available, multi-tenant, long-term storage for Prometheus.
https://grafana.com/oss/mimir/
GNU Affero General Public License v3.0

Merge identical queries in the scheduling queue #3791

Open bboreham opened 1 year ago

bboreham commented 1 year ago

Is your feature request related to a problem? Please describe.

Currently, if we receive two or more identical queries, we do all the same work for each of them. This might sound rare, but it gets more likely as more people in a company look at the same dashboard.

Describe the solution you'd like

If we detect two identical queries going into the scheduling queue, we could merge them and just do the work once.

It's possible that we can fetch most of the result from cache, but many requests are not cached, and we don't cache data newer than 10 minutes, so queries up to "now" will always involve work.

(Also applies to series requests, labels, label values, etc.)
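To make the merging idea concrete, here is a minimal sketch of coalescing concurrent identical requests so that only one of them does the work. It is an illustration only, not Mimir's scheduler code: `queryKey`, `coalescer`, `Do`, `waiting` and `runDemo` are all hypothetical names, and the key fields (tenant, expression, start, end, step) are an assumption about what "identical" would have to mean. The tenant is part of the key so that requests from different tenants are never merged.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// queryKey is a hypothetical identity for a request; two requests are merged
// only if every field matches. These are not Mimir types.
type queryKey struct {
	Tenant           string
	Expr             string
	Start, End, Step int64 // milliseconds
}

// call is one in-flight execution that several callers may share.
type call struct {
	done    chan struct{}
	waiters int
	result  string
	err     error
}

type coalescer struct {
	mu       sync.Mutex
	inflight map[queryKey]*call
}

func newCoalescer() *coalescer { return &coalescer{inflight: make(map[queryKey]*call)} }

// Do runs exec once per key among concurrent callers. shared reports whether
// this caller joined an execution started by someone else. A production
// version would also need to handle panics and cancellation in exec.
func (c *coalescer) Do(key queryKey, exec func() (string, error)) (result string, shared bool, err error) {
	c.mu.Lock()
	if existing, ok := c.inflight[key]; ok {
		existing.waiters++
		c.mu.Unlock()
		<-existing.done // wait for the first caller to finish
		return existing.result, true, existing.err
	}
	cl := &call{done: make(chan struct{})}
	c.inflight[key] = cl
	c.mu.Unlock()

	cl.result, cl.err = exec()

	c.mu.Lock()
	delete(c.inflight, key) // later requests start a fresh execution
	c.mu.Unlock()
	close(cl.done)
	return cl.result, false, cl.err
}

// waiting returns how many callers have joined the in-flight call for key.
func (c *coalescer) waiting(key queryKey) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	if cl, ok := c.inflight[key]; ok {
		return cl.waiters
	}
	return 0
}

// runDemo starts one request, then `followers` identical ones while the first
// is still running, and reports how often the work ran and how many shared it.
func runDemo(followers int) (executions, shared int) {
	c := newCoalescer()
	key := queryKey{Tenant: "team-a", Expr: "up", Start: 0, End: 3_600_000, Step: 15_000}
	started, release := make(chan struct{}), make(chan struct{})
	var mu sync.Mutex
	var wg sync.WaitGroup
	exec := func() (string, error) {
		mu.Lock()
		executions++
		first := executions == 1
		mu.Unlock()
		if first {
			close(started)
		}
		<-release // simulate slow query work
		return "result", nil
	}
	run := func() {
		defer wg.Done()
		if _, wasShared, _ := c.Do(key, exec); wasShared {
			mu.Lock()
			shared++
			mu.Unlock()
		}
	}
	wg.Add(1 + followers)
	go run()
	<-started
	for i := 0; i < followers; i++ {
		go run()
	}
	for c.waiting(key) < followers { // wait until every follower has joined
		time.Sleep(time.Millisecond)
	}
	close(release)
	wg.Wait()
	return executions, shared
}

func main() {
	e, s := runDemo(5)
	fmt.Printf("executions=%d shared=%d\n", e, s) // executions=1 shared=5
}
```

In this sketch merging only happens while a matching request is still in flight; once it completes, the entry is dropped and the next identical request does the work again (or hits the results cache, where one applies).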

Describe alternatives you've considered

Leave it as-is.

Additional context

We have something like this in store-gateway with the expandedPostingsPromise.

Credit @pracucci who mentioned this idea to me yesterday.

pstibrany commented 1 year ago

I like the idea, just adding a few notes:

pracucci commented 1 year ago

"if we receive two or more identical queries" -- do you mean identical start/end times too? I guess that would lower the chances of finding identical queries.

Range queries are aligned by Grafana (to make query results cacheable too). I think this idea could still be effective for the case where many users keep auto-refreshing the same dashboard.
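To illustrate why alignment helps here, a small sketch: if start and end are rounded down to a multiple of the step, two refreshes of the same panel that arrive a few seconds apart produce the same time range, and so could be treated as identical. The rounding rule and the names `alignDown` and `alignedRange` are assumptions for illustration; the exact alignment Grafana applies may differ.

```go
package main

import "fmt"

// timeRange is a hypothetical start/end pair in unix milliseconds.
type timeRange struct{ Start, End int64 }

// alignDown rounds ts down to a multiple of step (both in milliseconds).
func alignDown(tsMs, stepMs int64) int64 {
	return tsMs - tsMs%stepMs
}

// alignedRange builds the range for a "last <lookback>" query issued at nowMs,
// with both ends aligned to the step.
func alignedRange(nowMs, lookbackMs, stepMs int64) timeRange {
	return timeRange{
		Start: alignDown(nowMs-lookbackMs, stepMs),
		End:   alignDown(nowMs, stepMs),
	}
}

func main() {
	const step = 15_000         // 15s
	const lookback = 3_600_000  // 1h

	a := alignedRange(1_700_000_012_000, lookback, step) // first user
	b := alignedRange(1_700_000_021_000, lookback, step) // second user, 9s later
	c := alignedRange(1_700_000_026_000, lookback, step) // next step boundary crossed

	fmt.Println(a == b) // true: same aligned range, so the queries are identical
	fmt.Println(a == c) // false: a new step has started
}
```

With a 15s step, all refreshes that land in the same 15s window map to the same range, which is the window in which merging could apply.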

dimitarvdimitrov commented 1 month ago

This duplicate issue has some details on caching and consistent routing of queries to schedulers: https://github.com/grafana/mimir/issues/6642