opensearch-project / OpenSearch

🔎 Open source distributed and RESTful search engine.
https://opensearch.org/docs/latest/opensearch/index/
Apache License 2.0
9.16k stars 1.69k forks source link

[RFC] Query Planning and Rewriting #12390

Open jainankitk opened 5 months ago

jainankitk commented 5 months ago

Problem statement

In traditional database systems, query planner and optimizer chooses the right indices and join order to ensure efficient execution of query. Opensearch does not have any such component causing degraded performance for queries not written well. The profiler can provide insights into the latency breakdown of each phase, but does not automatically optimize the queries for customers. The impact of manually rewriting the query has been confirmed both by github issues and during managed service customer engagements:

High Level Approach

There are primarily two approaches for solving this problem. One approach relies on cost estimation for planning the query similar to traditional database systems and other is more dynamic in nature by collecting feedback from the performance of rewritten queries and accordingly, enabling/disabling and tuning the parameters for query rewriting.

Query Cost Estimation Based

The key aspect of this approach is the query cost estimation component. Given any query, it is able to guess the query cost well allowing the query rewriter to compare multiple query plans. Accordingly, it can choose the most efficient form for query execution.

Query_Estimation drawio

Pros:

Cons:

Rewritten Query Execution Based

Query_Rewriting drawio

This approach starts with the query rewriting in shadow mode. For every query, it checks if the query is rewrite eligible and samples such queries, executing them asynchronously (shadow mode) and comparing the execution time for the original vs rewritten query. Along with the execution time, every rewrite logic emits the value of tunable rewrite parameters for the query rewrites. Taking date histogram and doc values rewrite as example, we can expect following data after few executions:

Date Histogram Parameters
Rewrite efficiency:
  • directly proportional to document hit count
  • inversely proportional to bucket count

Original Time
Rewrite Time Bucket Count Hit Count
500 50 20 1M
300 30 50 500k
40 60 500 500
20 40 1000 100
Doc Values Rewrite Parameters
Rewrite efficiency:
  • directly proportional to total fields indexed
  • inversely proportional to field requested

Original Time
Rewrite Time Fields Requested Total Fields Indexed
500 50 3 1000
300 30 1 500
350 500 70 100
250 400 150 200


Using the above data, query insights plugin will be able to help detect the right parameter values for each query rewrite type. And once it has sufficient confidence, it can operate in reverse shadow mode where the original query is run occasionally to detect any changes in workload for that particular type of rewrite.

Pros:


Cons:

Mitigations:

Related component

Search:Performance

Describe alternatives you've considered

No response

Additional context

No response

anirudha commented 5 months ago

review step 4 : https://github.com/opensearch-project/sql/blob/main/docs/dev/intro-architecture.md

peternied commented 5 months ago

[Triage - attendees 1 2 3 4 5] @jainankitk Thanks for filing this rfc

kkmr commented 5 months ago

Most existing query engines don't run optimized queries in shadow mode - They optimize the plan and execute it. I would consider requesting customers to opt-in when we launch the feature initially. Later, once we have tuned the system, we can make query planning the default and let customers turn it off.

kkmr commented 5 months ago

How Good are Query Optimizers, Really? (https://www.vldb.org/pvldb/vol9/p204-leis.pdf) might be relevant reading.

jainankitk commented 5 months ago

How Good are Query Optimizers, Really? (https://www.vldb.org/pvldb/vol9/p204-leis.pdf) might be relevant reading.

Thanks @kkmr for sharing this. I went through this paper and this is mostly around cardinality estimation and join order both of which are not directly relevant for Opensearch. Few observations from this publication:

Most existing query engines don't run optimized queries in shadow mode - They optimize the plan and execute it. I would consider requesting customers to opt-in when we launch the feature initially.

I am also leaning towards introducing query rewriting as feature which can be tuned or turned off using cluster setting. This will allow the front-loading of core value proposition, and the framework for tuning those settings/parameters can be worked upon in later milestones.

Later, once we have tuned the system, we can make query planning the default and let customers turn it off.

While I hope, this would be eventually possible, I feel that tuning for query planning will be continuous exercise for different types of customer workloads