opensearch-project / OpenSearch

🔎 Open source distributed and RESTful search engine.
https://opensearch.org/docs/latest/opensearch/index/
Apache License 2.0
9.87k stars 1.84k forks source link

[RFC] High Level Vision for Core Search in OpenSearch #8879

Open jainankitk opened 1 year ago

jainankitk commented 1 year ago

Introduction

This issue proposes high level vision for core search within Opensearch. While the initial common themes are identified from customer feedback for Amazon Opensearch Service, any comments about the overall direction are welcome for making it applicable to more diverse use cases

Background

Search component forms the core of applications based on Opensearch. The ability to search data efficiently is critical enough to influence the data storage format during indexing keeping search first. Compared to traditional search engines, Opensearch supports insightful aggregations in addition to filtering allowing customers to do much more and visualize their data better

Common Themes

Proposal

I am capturing below some of the projects that can be prioritized or are being worked upon in other RFCs. Feel free to link any other ideas / projects already in flight for Core Search

Resiliency - The cluster can go down even with single query today, which should not be allowed. Recovery from node failures in OpenSearch is an expensive process and can lead to cascading failure. Hence, it makes sense to cancel/kill aggressively and what not to ensure the node stays up. Search back pressure is good first step in direction of making it more robust. The key component is the lightweight resource tracking that can be leveraged to do:

Performance - We need to look at ways of inherently improving the query performance. This becomes even more critical with the decoupling of storage and compute, remote storage performance is nowhere close to the hot counterpart

Visibility - Customers have very limited to no visibility into query execution today. Root causing issues related to query latencies are very difficult and time consuming

New Features

Next Steps We will incorporate the feedback from this RFC into more detailed proposal / high level design for different themes. We will then create meta issues to go in more depth for the detailed design

Bukhtawar commented 1 year ago

Thanks @jainankitk sounds exciting. Some items like Query prioritisation and scheduling #1017 are themselves pending prioritisation 😎

jainankitk commented 1 year ago

Thanks @Bukhtawar for sharing that issue. Linking the issue in overview.

Some items like Query prioritisation and scheduling are themselves pending prioritisation 😎

Created this issue for holistic view of ideas in search space (any existing or new issue) and prioritize accordingly

dblock commented 1 year ago

This is quite interesting. I think we may be missing Query Planning, which may change how we do, for example, sandboxing, prioritization and scheduling.

andrross commented 1 year ago

Thanks @jainankitk! What about cost as a theme? A common problem users run into is that it can be quite expensive to have a large dataset that is searchable even if it is not actively being searched at a high rate. I know this bleeds more into separating compute and storage but it might be worth a mention in this holistic view as there might be multiple ways to attack the problem.

sohami commented 1 year ago

Thanks for putting this together.

I think we may be missing Query Planning, which may change how we do, for example, sandboxing, prioritization and scheduling.

+1

We have been discussing sandboxing for restricting the resource foot print in context of single query, but sandboxing makes more sense at bucket level. We can define per bucket thresholds configurable as a cluster setting to preemptively kill a query from the bucket for which threshold is hit.

To achieve bucket level sandboxing I think we will still need to have query level sandboxing in place such that each query in the bucket plays well with the resources given to it. With reactive cancellation mechanism we will still run into similar problems as today where oversubscription of the resources happen in the concurrent environment and triggering cancellation may be too late. One can think of the CircuitBreakers present today as a catch all bucket

kiranprakash154 commented 1 year ago

I created an RFC for Disk based caching - https://github.com/opensearch-project/OpenSearch/issues/9001

jainankitk commented 1 year ago

I think we may be missing Query Planning, which may change how we do, for example, sandboxing, prioritization and scheduling.

Good catch! The planner, in addition to providing much needed visibility, helps us make important decisions across resiliency and performance

jainankitk commented 1 year ago

What about cost as a theme?

As you rightfully mentioned, it makes more sense for compute/storage separation.

there might be multiple ways to attack the problem.

Do you have any specific problem in mind from search perspective? Maybe, improving the ratio of searchable data per compute/memory unit?

jainankitk commented 1 year ago

To achieve bucket level sandboxing I think we will still need to have query level sandboxing in place such that each query in the bucket plays well with the resources given to it.

IMO bucket level sandboxing is more generic version allowing us not only tackle query level sandboxing (every query is unique bucket), it also allows for managing resources across multiple tenants on single cluster.

With reactive cancellation mechanism we will still run into similar problems as today where oversubscription of the resources happen in the concurrent environment and triggering cancellation may be too late. One can think of the CircuitBreakers present today as a catch all bucket

The reactive mechanism today has tricky decision making component. For query sandboxing, it is simpler as what to cancel is quite clear. If we can harden the cancellation mechanism, reactive should work pretty well. Although, there are interesting scenarios where you want to serialize the query execution instead of parallelization to stay within the resource constraint bounds. For example - instead of picking 2 shards in parallel on node, do them one after the other to stay within resource bounds.

sohami commented 1 year ago

To achieve bucket level sandboxing I think we will still need to have query level sandboxing in place such that each query in the bucket plays well with the resources given to it.

IMO bucket level sandboxing is more generic version allowing us not only tackle query level sandboxing (every query is unique bucket), it also allows for managing resources across multiple tenants on single cluster.

I am visualizing buckets as a group where multiple queries can execute. If you are considering 1 bucket per query than we are talking about same thing to enforce the sandboxing essentially at query level. What I was calling out, we need to have mechanism to sandbox per query within a bucket where a bucket is a group (which I think how buckets will be used ?) instead of 1:1 mapping with a query . Like Bucket_1 (IT dept), Bucket_2 (Security dept). I want to split the buckets such that I allow resource split of 40/60. So each bucket need to ensure that when multiple queries are running in each bucket they remain under the resource constraint of each bucket.

The reactive mechanism today has tricky decision making component. For query sandboxing, it is simpler as what to cancel is quite clear..

My thought was we will basically use similar mechanism but instead of node level it will be done at bucket level. But may be will wait to see the details on this to understand better :)

If we can harden the cancellation mechanism, reactive should work pretty well

Yes, but should we always fail the request as part of sandboxing ? Can be a first step towards it, but we should also think around how we can let the query run with constrained resource and complete may be taking more time but not necessarily failing it. Reactive is one approach other can be based on cost based planning.

jainankitk commented 1 year ago

I am visualizing buckets as a group where multiple queries can execute. If you are considering 1 bucket per query than we are talking about same thing to enforce the sandboxing essentially at query level. What I was calling out, we need to have mechanism to sandbox per query within a bucket where a bucket is a group (which I think how buckets will be used ?) instead of 1:1 mapping with a query . Like Bucket_1 (IT dept), Bucket_2 (Security dept). I want to split the buckets such that I allow resource split of 40/60. So each bucket need to ensure that when multiple queries are running in each bucket they remain under the resource constraint of each bucket.

Your understanding of the bucket is correct, I was just saying that bucketing is more generic and can be used for limiting individual queries as well. The bucket level decision is slightly simpler due to two reasons 1/ The value is configured by the customer, so we don't need to determine whether node is under duress 2/ Individual bucket should have more uniformity (due to which they should be bucketed together in the first place)

Yes, but should we always fail the request as part of sandboxing ? Can be a first step towards it, but we should also think around how we can let the query run with constrained resource and complete may be taking more time but not necessarily failing it. Reactive is one approach other can be based on cost based planning.

That's a great point and I also kind of talked about this in previous comment - There are interesting scenarios where you want to serialize the query execution instead of parallelization to stay within the resource constraint bounds. For example - instead of picking 2 shards in parallel on node, do them one after the other to stay within resource bounds. That being said, we will iterate in following manner 1/ Reactive cancellation 2/ Proactive cancellation based on historic latency/cancellation metrics and query cost estimation using planner 3/ Serialize across shards to not exceed resource bounds 4/ Limit the resources within individual shard search execution

ansjcy commented 10 months ago

Thanks @jainankitk for the high level vision! Wanted to add a few points on the visibility front. We are working on a set of Query Insights features to improve the visibility: https://github.com/opensearch-project/OpenSearch/issues/11522

We are working on query categorization, which will provide users OTel metrics on summaries of search workload types. Also as shown in the meta issue, we will also work on the Query Insights plugin with Top N Queries (based on resource usages) and in the future implement the Query Insights Dashboard (Admin Dashboard) to provide even more visibility!

msfroh commented 2 months ago

Can we rename this RFC? It's focused on visibility and resiliency.

We have a lot more feature-oriented RFCs that are on the roadmap for search, and the title suggests that this is the singular vision for search.

smacrakis commented 2 months ago

Agree with Froh. How about renaming it to Performance and Resiliency: Overview?