Open m-yosefpor opened 10 months ago
What default value would you like for the query-range.timeout flag?
Since query.timeout defaults to 2m, we probably need a larger default for query-range.timeout. Maybe 5m would be a good default, and people can then tune the flag for their use cases. (We need a much lower timeout for our use case, e.g. 30s; we have already set the query.timeout flag in the querier to 10s.)
You can do this by putting a gateway such as Envoy or Ambassador in front of Query Frontend. You can enforce the timeout there, and when the client times out, Query Frontend cancels the context of the in-flight queries (context canceled).
Can I give this a try?
Hello @kartikaysaxena, I am currently working on this and have raised a PR too. I will pass it on to you if it doesn't work.
Is your proposal related to a problem?
The current behavior of the Thanos query frontend poses challenges when dealing with horizontal sharding using the query-range.split-interval parameter. Specifically, the query.timeout flag in the query subcommand is applied to each split interval, rather than enforcing a global timeout for the entire range query. As a result, the query frontend can keep processing individual chunks for a long time, leading to extended query processing times, resource exhaustion, and a degraded user experience. Thanos currently offers slow query logging (query-frontend.log-queries-longer-than), which reports queries that run longer than a predefined threshold but does not abort them. While this gives visibility into long-running queries, it does not prevent them from consuming resources and affecting system performance.
Describe the solution you'd like
To address this issue, I propose introducing a new flag, query-range.timeout, specifically for the query frontend. This flag would allow users to set a global timeout for range queries, ensuring that the frontend aborts requests that exceed this duration. By setting a query-range.timeout, users can prevent range queries from continuing indefinitely, even if individual split intervals complete within the specified query.timeout. (The duration would be measured the same way query-frontend.log-queries-longer-than calculates and logs it.)