Open bboreham opened 1 year ago
I like the idea, just adding a few notes:
"if we receive two or more identical queries" -- do you mean identical start/end times too? I guess that would lower chances of finding identical queries.
Request sent to query-scheduler (FrontendToScheduler) has a frontendAddress and queryID. These are used by querier to send the result back to frontend. If we merge multiple requests, querier will need to send results to multiple frontends (with a different queryID for each frontend).
Results cache is consulted before the request is passed to query-scheduler. Queriers don't use the results cache today (but of course that can be changed).
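To make the second note concrete, here is a minimal sketch of what a merged request might carry so the querier can reply to every waiting frontend. The types `ReplyTarget`, `MergedRequest`, `SendFunc` and the function `FanOut` are hypothetical names made up for illustration, not Mimir's actual API; only the frontendAddress and queryID fields come from the discussion above.

```go
package main

import "fmt"

// ReplyTarget identifies one waiting frontend request.
// Hypothetical type: the field names mirror frontendAddress and queryID.
type ReplyTarget struct {
	FrontendAddress string
	QueryID         uint64
}

// MergedRequest is one unit of work shared by several identical requests.
type MergedRequest struct {
	Query   string
	Targets []ReplyTarget
}

// SendFunc delivers a result to one frontend. In a real system this would
// be an RPC; here it is an assumption injected by the caller.
type SendFunc func(t ReplyTarget, result string) error

// FanOut sends the same result to every target and returns the targets
// for which delivery failed, so one bad frontend does not block the rest.
func FanOut(req MergedRequest, result string, send SendFunc) map[ReplyTarget]error {
	failed := make(map[ReplyTarget]error)
	for _, t := range req.Targets {
		if err := send(t, result); err != nil {
			failed[t] = err
		}
	}
	return failed
}

func main() {
	req := MergedRequest{
		Query: `sum(rate(http_requests_total[5m]))`,
		Targets: []ReplyTarget{
			{FrontendAddress: "frontend-a:9095", QueryID: 17},
			{FrontendAddress: "frontend-b:9095", QueryID: 4},
		},
	}
	failed := FanOut(req, "result-bytes", func(t ReplyTarget, result string) error {
		fmt.Printf("send %q to %s (queryID=%d)\n", result, t.FrontendAddress, t.QueryID)
		return nil
	})
	fmt.Println("failed deliveries:", len(failed))
}
```

Collecting failures per target rather than returning on the first error is a design choice in this sketch: the work is already done, so the remaining frontends should still get their result.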
"if we receive two or more identical queries" -- do you mean identical start/end times too? I guess that would lower chances of finding identical queries.
Range queries are aligned by Grafana (to make query results cacheable too). I think this idea could still be effective to cover the case where many users keep auto-refreshing the same dashboard.
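As a rough illustration of why alignment helps here: if start and end are rounded down to a multiple of the step, two refreshes a few seconds apart produce the same range and therefore the same dedup key. The function `AlignToStep` below is a hypothetical helper, and the round-down rule is an assumption about how alignment works, not a copy of Grafana's or Mimir's code. Timestamps are assumed to be non-negative milliseconds.

```go
package main

import "fmt"

// AlignToStep rounds start and end (milliseconds since epoch) down to a
// multiple of step. Assumes non-negative timestamps; a non-positive step
// leaves the range unchanged.
func AlignToStep(start, end, step int64) (int64, int64) {
	if step <= 0 {
		return start, end
	}
	return start - start%step, end - end%step
}

func main() {
	const step = 15_000 // 15s step, in milliseconds

	// Two users refresh the same 5-minute panel 7ms apart.
	s1, e1 := AlignToStep(300_002, 600_002, step)
	s2, e2 := AlignToStep(300_009, 600_009, step)

	fmt.Println("user 1:", s1, e1)
	fmt.Println("user 2:", s2, e2)
	fmt.Println("identical after alignment:", s1 == s2 && e1 == e2)
}
```

Refreshes that fall on opposite sides of a step boundary would still differ, so alignment raises the hit rate rather than guaranteeing a match.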
This duplicate issue has some details on caching and consistent routing of queries to schedulers: https://github.com/grafana/mimir/issues/6642
Is your feature request related to a problem? Please describe.
Currently, if we receive two or more identical queries, we do all the same work for each of them. This might sound rare, but gets more likely as more people in a company are looking at the same dashboard.
Describe the solution you'd like
If we detect two identical queries going in to the scheduling queue we could merge them and just do the work once.
It's possible that we can fetch most of the result from cache, but many requests are not cached and we don't cache data newer than 10 minutes so queries up to "now" will involve work.
(Also applies to series requests, labels, label values, etc.)
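A minimal sketch of the merging step, assuming the queue can key pending requests on tenant, query text, and time range. `QueryKey`, `Pending`, and `Queue` are hypothetical names for illustration and do not reflect the real query-scheduler data structures; waiters are represented as plain strings to keep the example short.

```go
package main

import "fmt"

// QueryKey is the identity used to detect duplicates. It is comparable,
// so it can be used directly as a map key.
type QueryKey struct {
	Tenant, Query    string
	Start, End, Step int64
}

// Pending is one queued unit of work plus everyone waiting for its result.
type Pending struct {
	Key     QueryKey
	Waiters []string
}

// Queue is a FIFO that merges identical requests while they are queued.
// It is not safe for concurrent use; a real queue would need locking.
type Queue struct {
	order   []QueryKey
	pending map[QueryKey]*Pending
}

func NewQueue() *Queue { return &Queue{pending: make(map[QueryKey]*Pending)} }

// Enqueue adds a request and reports whether it was merged into an
// identical request that was already waiting in the queue.
func (q *Queue) Enqueue(key QueryKey, waiter string) bool {
	if p, ok := q.pending[key]; ok {
		p.Waiters = append(p.Waiters, waiter)
		return true
	}
	q.pending[key] = &Pending{Key: key, Waiters: []string{waiter}}
	q.order = append(q.order, key)
	return false
}

// Dequeue removes and returns the oldest pending unit of work.
func (q *Queue) Dequeue() (*Pending, bool) {
	if len(q.order) == 0 {
		return nil, false
	}
	key := q.order[0]
	q.order = q.order[1:]
	p := q.pending[key]
	delete(q.pending, key)
	return p, true
}

func main() {
	q := NewQueue()
	k := QueryKey{Tenant: "team-a", Query: "up", Start: 300_000, End: 600_000, Step: 15_000}
	fmt.Println("merged:", q.Enqueue(k, "frontend-a/17"))
	fmt.Println("merged:", q.Enqueue(k, "frontend-b/4"))
	p, _ := q.Dequeue()
	fmt.Println("waiters for one execution:", p.Waiters)
}
```

In this sketch a request is only merged while its twin is still queued. Once the work has been dequeued, a later identical request is queued afresh; merging into in-flight work would need the querier side to accept late waiters.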
Describe alternatives you've considered
Leave it as-is.
Additional context
We have something like this in store-gateway with the expandedPostingsPromise.
Credit @pracucci who mentioned this idea to me yesterday.