opensearch-project / OpenSearch

🔎 Open source distributed and RESTful search engine.
https://opensearch.org/docs/latest/opensearch/index/
Apache License 2.0

Search Query Runtime Cost Calculation #5174

Open PritLadani opened 2 years ago

PritLadani commented 2 years ago

Is your feature request related to a problem? Please describe.

#1179 aims to build a resource tracking framework for search queries. As part of #3982, we have enabled resource tracking for shard-level tasks. However, to support search back-pressure and to build a model for query cost estimation as discussed in #1042, we need coordinator-level (query-level) resource consumption stats.

Describe the solution you'd like

We will piggyback the shard tasks' resource consumption on the ClusterSearchShardsResponse sent from the child nodes (data nodes) to the parent node (coordinator node). We need to change the response structure to accommodate the resource stats.
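To make the idea concrete, here is a minimal sketch of piggybacking: each shard response carries its own resource stats, and the coordinator task adds them to its own usage to get a query-level total. All names here (`ResourceStats`, `ShardResponse`, `CoordinatorTask`, and their fields) are hypothetical and only for illustration; this is not the OpenSearch API or the actual response structure.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class ResourceStats:
    """Hypothetical per-task resource usage (illustrative fields only)."""
    cpu_time_nanos: int = 0
    memory_bytes: int = 0

    def __add__(self, other: "ResourceStats") -> "ResourceStats":
        return ResourceStats(
            self.cpu_time_nanos + other.cpu_time_nanos,
            self.memory_bytes + other.memory_bytes,
        )


@dataclass
class ShardResponse:
    """Hypothetical shard-level response with stats piggybacked on it."""
    shard_id: int
    hits: List[str]
    resource_stats: ResourceStats


@dataclass
class CoordinatorTask:
    """Hypothetical coordinator-side view of one search request."""
    own_stats: ResourceStats = field(default_factory=ResourceStats)
    shard_stats: Dict[int, ResourceStats] = field(default_factory=dict)

    def on_shard_response(self, response: ShardResponse) -> None:
        # Stats arrive once, with the completed shard response; no periodic polling.
        self.shard_stats[response.shard_id] = response.resource_stats

    def total_cost(self) -> ResourceStats:
        # Query-level runtime cost = coordinator usage + all shard usage.
        total = self.own_stats
        for stats in self.shard_stats.values():
            total = total + stats
        return total
```

For example, a coordinator task that has used some resources itself and receives two shard responses would report the sum of all three as the runtime cost of the query.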

Describe alternatives you've considered

An alternative we considered is to share the resource stats periodically from the data nodes to the coordinator node, rather than piggybacking them at task completion. However, query cost calculation does not need periodic stats from the data nodes. Moreover, sharing the stats periodically would introduce the overhead of a new background service that collects the data and sends it to the parent node.

Additional context

Just by looking at or aggregating the resource stats of the child tasks, we cannot estimate the resource consumption of the coordinator task. Hence we cannot estimate whether a search task will cause the node to go into duress, and hence we do not need periodic resource stats from the data nodes.

reta commented 2 years ago

I am not really sure what is being estimated as the "search query cost" here. Based on the description, it is deduced from the resource consumption stats, which are only available after the query has executed. The exact same query could have drastically different consumption stats over time (for example, because new data is being ingested all the time).

What would be useful though is to estimate search query cost before the execution, based on:

Does it make sense or am I missing something here?

dblock commented 2 years ago

I think the proposal is a little unclear on cost estimation vs. query planning, and so @reta is rightfully confused.

I think the purpose of the proposal is to predict, as well as possible, the "runtime cost" (consumption of time and space) of an incoming query and use it in backpressure. Runtime cost is impacted by the query being made, but it's a lot more impacted by things like the size of the data.

So, I propose to explain the goals by calling the ask here "search query runtime cost" (vs. just search query cost), and calling the non-runtime aspects of a query "query (plan) cost (or complexity)". Does that help?

PritLadani commented 2 years ago

@reta We are not really estimating the query cost here; rather, we are just calculating the actual query cost (or call it the runtime cost). We are building a coordinator-level view of resource consumption for each search request. As discussed in #1179 and #1181, we want to build an aggregated view of resource consumption stats for any given query. To do that, we want to piggyback the consumption stats to the parent node. However, as part of this issue, we will not make cancellation decisions yet.

Also, exactly the same query can have different resource consumption stats in different scenarios, but as @dblock mentioned, we are trying to calculate the "runtime cost" of a search query.

@dblock Appreciate your suggestion to change it to "Search Query Runtime Cost". Will update the title.

anasalkouz commented 1 year ago

Hi @PritLadani, are you actively working on this?

PritLadani commented 1 year ago

Hey @anasalkouz, I might not be able to take this up as of now. @kaushalmahi12, are you taking care of this as part of the next milestone of Search Backpressure? @ajaymovva, I remember you were also building some kind of cost calculator for running tasks. Can this task be considered for that?