opensearch-project / OpenSearch

🔎 Open source distributed and RESTful search engine.
https://opensearch.org/docs/latest/opensearch/index/
Apache License 2.0

[RFC] Query Sandboxing for search requests #11061

Open kaushalmahi12 opened 11 months ago

kaushalmahi12 commented 11 months ago

Co-author: jainankitk
Parent RFC: https://github.com/opensearch-project/OpenSearch/issues/8879

Introduction

In most information retrieval (IR) systems, it is common for a few tenants to degrade performance for the others: a small number of tenants can consume a significant share of system resources and leave the rest deprived. It is therefore critically important for an IR system to minimize the performance impact that tenants have on one another. A natural way to do this is to let the user define and configure limits and priorities for the tenants of the system. Node drops and poor performance caused by badly behaved tenant queries are a common pain point in OpenSearch clusters. Tenant-based performance isolation in OpenSearch is therefore critically important for improving the resiliency and stability of the cluster, with Cx value as a bonus.

Here I am assuming a tenant to be a User or an Index, but it is not limited to these.

Problem

OpenSearch currently has no mechanism for tenant-based performance isolation of search workloads. We want to enable admin users of an OpenSearch cluster to manage tenant-based Sandboxes that enforce resource-based limits on tenant queries. Each Sandbox will have a priority, which will determine the cancellation order in node duress situations.

Scope

Since we want to partition resources amongst the tenants on a node, it makes more sense to confine this feature to the node level so that node-level resiliency is achieved.

Use cases

Proposal

We propose a reactive mechanism that actively tracks resource usage per tenant and cancels queries when one or more tenants oversubscribe system resources. This will help us identify and cancel rogue queries reactively and maintain node stability.

As part of this proposal we will introduce a new software construct called a Sandbox, which will be attribute based and which admin users of an OpenSearch cluster can manage (CRUD ability) at the node level. The attributes we select will be generic across all users (domains/clusters). For system resources, we will track jvmAllocations (because of a JDK API limitation on reading current JVM usage at thread level) and cpuUtilisation. We are not considering other system resources, such as network IO and disk IO, for multiple reasons.

A Sandbox will track the resources for all requests associated with it and will try to enforce per-sandbox resource usage limits. In node duress scenarios we will cancel queries from low-priority sandboxes. Since tracking resource usage could become an overhead with too many sandboxes in the system, we can cap the number of sandboxes per node with a cluster-level setting.

We plan to start with a reactive mechanism, i.e., track and cancel in case of contention or threshold breaches. Going forward, we want to build a robust search query cost estimation framework to cancel the majority of rogue search queries upfront.
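A minimal sketch of the track-and-cancel behaviour described above, written in Python for brevity rather than as OpenSearch code. Every name here (`Task`, `Sandbox`, `enforce_limits`, `relieve_duress`) is hypothetical, heap bytes stand in for jvmAllocations, and the choice that a smaller `priority` number means "cancel first" is an assumption, not something this RFC specifies.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: str
    heap_bytes: int = 0       # bytes allocated so far (stand-in for jvmAllocations)
    cancelled: bool = False


@dataclass
class Sandbox:
    name: str
    priority: int             # assumption: lower number = cancelled first under duress
    heap_limit_bytes: int     # per-sandbox limit on tracked allocations
    tasks: list = field(default_factory=list)

    def usage(self) -> int:
        # Sum the allocations of all tasks that are still running.
        return sum(t.heap_bytes for t in self.tasks if not t.cancelled)

    def breached(self) -> bool:
        return self.usage() > self.heap_limit_bytes


def enforce_limits(sandboxes):
    """Cancel the heaviest tasks of every sandbox that exceeds its own limit."""
    cancelled = []
    for sb in sandboxes:
        for task in sorted(sb.tasks, key=lambda t: t.heap_bytes, reverse=True):
            if not sb.breached():
                break
            if not task.cancelled:
                task.cancelled = True
                cancelled.append(task.task_id)
    return cancelled


def relieve_duress(sandboxes, node_heap_limit_bytes):
    """Under node duress, cancel tasks starting with the lowest-priority sandbox."""
    cancelled = []
    total = sum(sb.usage() for sb in sandboxes)
    for sb in sorted(sandboxes, key=lambda s: s.priority):
        for task in sorted(sb.tasks, key=lambda t: t.heap_bytes, reverse=True):
            if total <= node_heap_limit_bytes:
                return cancelled
            if not task.cancelled:
                task.cancelled = True
                total -= task.heap_bytes
                cancelled.append(task.task_id)
    return cancelled
```

Cancelling the heaviest task first is only one possible policy; a real implementation could equally weigh task age or CPU time.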

Future Improvements

getsaurabh02 commented 11 months ago

Thanks @kaushalmahi12 for proposing this. This is going to be super useful for maintaining the resiliency of clusters, especially the large ones with multiple tenants or users.

I like the idea of constructs called Sandbox which will be attribute based. However, instead of making it generic across all users, can we introduce a concept of user-account/tenant-id which admins can use to define/configure and have multiple sandboxing configurations attached to them. That way, when queries are passed with a user-account/tenant-id, they automatically map to one of the sandboxing configurations and node limits are applied to them dynamically.

It will then allow system/cluster admins to create/maintain multiple sandboxing configurations and map them to the group of internal users (or tenants). This can then further be integrated with the Security Plugin in near future to associate these ids with user-roles and cluster permissions, making it a more concrete construct. Thoughts?

This will also provide common extension points for associating these user-account/tenant-ids with Top-N queries, which is focused on extended visibility into individual queries, while allowing admins to map these rogue/slow queries back to users.
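To make the tenant-id mapping idea concrete, here is a small illustrative sketch in Python. It is not an existing OpenSearch or Security Plugin API: `SandboxConfig`, `SandboxRegistry`, the field names, and the fallback to a default sandbox for unmapped tenants are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class SandboxConfig:
    name: str
    priority: int
    heap_limit_fraction: float   # hypothetical: share of node heap this sandbox may use
    cpu_limit_fraction: float    # hypothetical: share of node CPU this sandbox may use


class SandboxRegistry:
    """Holds admin-defined sandbox configurations and tenant-id mappings."""

    def __init__(self, default: SandboxConfig):
        self._default = default
        self._configs: Dict[str, SandboxConfig] = {default.name: default}
        self._tenant_to_sandbox: Dict[str, str] = {}

    def put_sandbox(self, config: SandboxConfig) -> None:
        # Create or update a sandbox configuration.
        self._configs[config.name] = config

    def map_tenant(self, tenant_id: str, sandbox_name: str) -> None:
        # Refuse mappings that point at a sandbox nobody has defined.
        if sandbox_name not in self._configs:
            raise KeyError(f"unknown sandbox: {sandbox_name}")
        self._tenant_to_sandbox[tenant_id] = sandbox_name

    def resolve(self, tenant_id: Optional[str]) -> SandboxConfig:
        # Unmapped or anonymous requests fall back to the default sandbox.
        name = self._tenant_to_sandbox.get(tenant_id)
        return self._configs.get(name, self._default)


registry = SandboxRegistry(SandboxConfig("default", 5, 0.10, 0.10))
registry.put_sandbox(SandboxConfig("analytics", 1, 0.30, 0.40))
registry.map_tenant("tenant-42", "analytics")
```

With this setup, `registry.resolve("tenant-42")` returns the `analytics` configuration, while a request without a mapped tenant-id resolves to `default`.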

reta commented 11 months ago

I think this idea was discussed in https://github.com/opensearch-project/OpenSearch/issues/8879

kaushalmahi12 commented 11 months ago

@getsaurabh02 Thanks for your suggestions!

However, instead of making it generic across all users, can we introduce a concept of user-account/tenant-id which admins can use to define/configure and have multiple sandboxing configurations attached to them.

What I meant to convey is that these attributes should be available across all OpenSearch clusters, since they will be part of either authN/authZ or the request attributes. Within the cluster we can always define new entities and associate them with sandboxes. Does this make sense, or am I missing something?

It will then allow system/cluster admins to create/maintain multiple sandboxing configurations and map them to the group of internal users (or tenants). This can then further be integrated with the Security Plugin in near future to associate these ids with user-roles and cluster permissions, making it a more concrete construct. Thoughts?

I agree with it. Since the feature inherently limits access to resources, the Security Plugin can help provide a more concrete and robust mechanism for resource access.

ansjcy commented 8 months ago

@kaushalmahi12 Thanks for the proposal! As @getsaurabh02 mentioned, I think that with the combination of Query Sandboxing and Top N Queries, OpenSearch admin users can potentially have both better control over and better visibility into rogue queries by tenant.

Regarding getting the resource usage data from /proc, the Performance Analyzer plugin already has that data. Have you explored whether it is possible to associate this data with specific queries?