[RFC] Query Planning and Rewriting

jainankitk commented 5 months ago

Problem statement

In traditional database systems, query planner and optimizer chooses the right indices and join order to ensure efficient execution of query. Opensearch does not have any such component causing degraded performance for queries not written well. The profiler can provide insights into the latency breakdown of each phase, but does not automatically optimize the queries for customers. The impact of manually rewriting the query has been confirmed both by github issues and during managed service customer engagements:

Github Issues:
- Date histogram performance can significantly benefit by rewriting the aggregation into range query filter wherever possible (Github issue - https://github.com/opensearch-project/OpenSearch/issues/9310)
- Dynamic pruning for cardinality aggregation - https://github.com/opensearch-project/OpenSearch/issues/11959
Other rewrites:
- Stored fields/_source to docvalue_fields - This optimization has resulted in significant performance improvement for some of the managed service customers
- Map/Global Ordinals execution hint - One of the managed service customer was able to achieve more consistent performance by providing map hint instead of default global ordinals

High Level Approach

There are primarily two approaches for solving this problem. One approach relies on cost estimation for planning the query similar to traditional database systems and other is more dynamic in nature by collecting feedback from the performance of rewritten queries and accordingly, enabling/disabling and tuning the parameters for query rewriting.

Query Cost Estimation Based

The key aspect of this approach is the query cost estimation component. Given any query, it is able to guess the query cost well allowing the query rewriter to compare multiple query plans. Accordingly, it can choose the most efficient form for query execution.

Query_Estimation drawio

Pros:

Low cost, insignificant overhead. Need not worry about workload changes
Handles >1 rewrite for single query easily, by generating multiple plans with those rewrites

Cons:

Estimation process is complex due to sheer number of cases, might rely on clause execution time feedback
Hard to come up with the accurate estimates

Rewritten Query Execution Based

Query_Rewriting drawio

This approach starts with the query rewriting in shadow mode. For every query, it checks if the query is rewrite eligible and samples such queries, executing them asynchronously (shadow mode) and comparing the execution time for the original vs rewritten query. Along with the execution time, every rewrite logic emits the value of tunable rewrite parameters for the query rewrites. Taking date histogram and doc values rewrite as example, we can expect following data after few executions:

Date Histogram Parameters Rewrite efficiency: directly proportional to document hit count inversely proportional to bucket count Original Time	Rewrite Time	Bucket Count	Hit Count
500	50	20	1M
300	30	50	500k
40	60	500	500
20	40	1000	100

Doc Values Rewrite Parameters Rewrite efficiency: directly proportional to total fields indexed inversely proportional to field requested Original Time	Rewrite Time	Fields Requested	Total Fields Indexed
500	50	3	1000
300	30	1	500
350	500	70	100
250	400	150	200

Using the above data, query insights plugin will be able to help detect the right parameter values for each query rewrite type. And once it has sufficient confidence, it can operate in reverse shadow mode where the original query is run occasionally to detect any changes in workload for that particular type of rewrite.

Pros:

Makes query rewriting pluggable as and when new rewrites are identified. Each rewrite needs to plugin itself by providing the list of parameters on which it can be tuned
Actually running the query

Cons:

Cannot be done for every query to limit the overhead, relies on sampling
Need to handle caching for getting right execution times
Difficult to handle >1 rewrite for single query easily, as every rewrite has cost
Need to run even original query to detect any workload changes
Additional resource consumption for async execution of rewritten / original query

Mitigations:

Reduce the subset by limiting to expensive queries > 100/200ms
Try the query rewrite on few shards for limiting performance impact
Threshold the query execution time based on actual/rewritten query, maybe 2x (Query Sandboxing / Hard Cancellation)

Related component

Search:Performance

Describe alternatives you've considered

No response

Additional context

No response

anirudha commented 5 months ago

review step 4 : https://github.com/opensearch-project/sql/blob/main/docs/dev/intro-architecture.md

peternied commented 5 months ago

[Triage - attendees 1 2 3 4 5] @jainankitk Thanks for filing this rfc

kkmr commented 5 months ago

Most existing query engines don't run optimized queries in shadow mode - They optimize the plan and execute it. I would consider requesting customers to opt-in when we launch the feature initially. Later, once we have tuned the system, we can make query planning the default and let customers turn it off.

kkmr commented 5 months ago

How Good are Query Optimizers, Really? (https://www.vldb.org/pvldb/vol9/p204-leis.pdf) might be relevant reading.

jainankitk commented 5 months ago

How Good are Query Optimizers, Really? (https://www.vldb.org/pvldb/vol9/p204-leis.pdf) might be relevant reading.

Thanks @kkmr for sharing this. I went through this paper and this is mostly around cardinality estimation and join order both of which are not directly relevant for Opensearch. Few observations from this publication:

Real vs benchmark data can have very different characteristics as the uniformity and correlations can be much more different. While the estimation is pretty accurate for TPC-H generated data, not the case for JOB (IMDB) data
Cost estimation is hard problem, but improvement can be significant as called out in Section 5.4 when the estimates were replaced by actual cardinalities for planning the join

Most existing query engines don't run optimized queries in shadow mode - They optimize the plan and execute it. I would consider requesting customers to opt-in when we launch the feature initially.

I am also leaning towards introducing query rewriting as feature which can be tuned or turned off using cluster setting. This will allow the front-loading of core value proposition, and the framework for tuning those settings/parameters can be worked upon in later milestones.

Later, once we have tuned the system, we can make query planning the default and let customers turn it off.

While I hope, this would be eventually possible, I feel that tuning for query planning will be continuous exercise for different types of customer workloads

opensearch-project / OpenSearch