[RFC] Query Sandboxing for search requests

kaushalmahi12 commented 11 months ago

Co-author - jainankitk
Parent RFC - https://github.com/opensearch-project/OpenSearch/issues/8879

Introduction

In most information retrieval systems, it is common for a few tenants to degrade performance for the others: some tenants consume a significant share of system resources and leave the rest deprived. It is therefore critical for an IR system to minimize the performance impact that tenants have on one another. One way to do this is to let users define and configure limits and priorities for the tenants of the system. Node drops and poor performance caused by badly behaving tenant queries are a common pain point in OpenSearch clusters. Tenant-based performance isolation in OpenSearch is therefore important for improving the resiliency and stability of the cluster, with Cx value as a bonus.

Here, a tenant is assumed to be a User or an Index, but is not limited to these.

Problem

OpenSearch currently has no mechanism for tenant-based performance isolation of search workloads. We want to enable admin users of an OpenSearch cluster to manage tenant-based Sandboxes that enforce resource-based limits on tenant queries. Each Sandbox will have a priority, which will determine the cancellation order in node duress situations.

Scope

Since we want to partition resources amongst the tenants on a node, it makes more sense to confine this feature to the node level so that node-level resiliency is achieved.

Use cases

User-based performance isolation.
Index-based resource usage enforcement for search workloads, e.g. hot and warm.
Skewed search traffic throttling, i.e. if only one of the indices is receiving most of the queries on a node and causing other search requests to be throttled or rejected, we can avoid this by confining queries for that index to a sandbox.
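To make the user-based and index-based use cases concrete, here is a minimal sketch of how a request's attributes might be resolved to a sandbox. The class names (`SandboxRule`, `SandboxResolver`), the attribute keys (`user`, `index`), and the first-match policy are all assumptions made for illustration; the RFC does not define this API.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical types: nothing here is an existing OpenSearch API.
final class SandboxRule {
    final String sandboxId;
    final Map<String, String> attributes; // e.g. {"user": "analytics"} or {"index": "logs-hot"}

    SandboxRule(String sandboxId, Map<String, String> attributes) {
        this.sandboxId = sandboxId;
        this.attributes = attributes;
    }

    // A rule matches when every attribute it names has the same value on the request.
    boolean matches(Map<String, String> requestAttributes) {
        return attributes.entrySet().stream()
            .allMatch(e -> e.getValue().equals(requestAttributes.get(e.getKey())));
    }
}

final class SandboxResolver {
    private final List<SandboxRule> rules;

    SandboxResolver(List<SandboxRule> rules) {
        this.rules = List.copyOf(rules);
    }

    // Assumption: the first matching rule wins; an empty result means "not sandboxed".
    Optional<String> resolve(Map<String, String> requestAttributes) {
        return rules.stream()
            .filter(r -> r.matches(requestAttributes))
            .map(r -> r.sandboxId)
            .findFirst();
    }
}
```

With rules for `{"index": "logs-hot"}` and `{"user": "analytics"}`, a request from user `analytics` against a warm index would land in the user's sandbox, while any request against `logs-hot` would be confined to the index sandbox.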

Proposal

We propose introducing a reactive mechanism that actively tracks resource usage per tenant and cancels queries when a tenant oversubscribes system resources. This will help us identify and cancel rogue queries and maintain node stability.

As part of this proposal we will introduce a new software construct called a Sandbox, which will be attribute based and which admin users of an OpenSearch cluster can manage (CRUD ability) at the node level. The attributes we select will be generic across all users (domains/clusters). For system resources, we will track jvmAllocations (because of a JDK API limitation on getting current JVM usage at the thread level) and cpuUtilisation. We are not considering other system resources, such as network IO and disk IO, for multiple reasons:

There is no Java API for getting these per thread.
Although we can get this data from /proc, it is loaded from kernel data structures that store it in binary form. Accessing the per-thread IO stats costs 3 system calls (open, read, close).
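The per-thread tracking of jvmAllocations and CPU mentioned above can be sketched with the JDK's management beans. This is a minimal illustration, assuming a HotSpot-based JVM where the platform bean can be cast to `com.sun.management.ThreadMXBean`; the `ThreadResourceSnapshot` class is a hypothetical name, not part of OpenSearch. Both counters are cumulative per thread, so usage for a piece of work is the difference between two snapshots.

```java
import java.lang.management.ManagementFactory;

// Sketch only: ThreadResourceSnapshot is a hypothetical helper, not an OpenSearch class.
final class ThreadResourceSnapshot {
    final long cpuTimeNanos;   // cumulative CPU time of the thread, or -1 if measurement is disabled
    final long allocatedBytes; // cumulative heap bytes allocated by the thread, or -1 if disabled

    ThreadResourceSnapshot(long cpuTimeNanos, long allocatedBytes) {
        this.cpuTimeNanos = cpuTimeNanos;
        this.allocatedBytes = allocatedBytes;
    }

    static ThreadResourceSnapshot capture(long threadId) {
        // Assumption: the JVM exposes the com.sun.management extension of ThreadMXBean,
        // which adds per-thread allocation counters to the standard bean.
        com.sun.management.ThreadMXBean bean =
            (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        return new ThreadResourceSnapshot(
            bean.getThreadCpuTime(threadId),
            bean.getThreadAllocatedBytes(threadId));
    }

    // Usage attributable to the work done between two snapshots of the same thread.
    ThreadResourceSnapshot since(ThreadResourceSnapshot start) {
        return new ThreadResourceSnapshot(
            cpuTimeNanos - start.cpuTimeNanos,
            allocatedBytes - start.allocatedBytes);
    }
}
```

A tracker could call `capture` when a search task starts on a thread and again when it yields or finishes, then add the `since` delta to the owning sandbox's totals. Note that the allocation counter measures bytes allocated, not bytes currently live.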

A Sandbox will track the resources for all the requests associated with it and will try to enforce the resource usage limits per sandbox. We will cancel queries from low-priority sandboxes in node duress scenarios. Since tracking resource usage could become an overhead when there are too many sandboxes in the system, we can limit this with a cluster-level setting that caps the number of sandboxes per node.
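The enforcement and cancellation order described above could look roughly like the following sketch. Everything in it is an assumption made for illustration: the `Sandbox` fields, the convention that a lower `priority` number is cancelled first, and the policy that only over-limit sandboxes are candidates unless the node is in duress.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical model of a sandbox; field names are illustrative only.
final class Sandbox {
    final String id;
    final int priority;        // assumption: a lower number means lower priority
    final long heapLimitBytes; // configured limit for this sandbox
    long heapUsedBytes;        // sum of tracked usage across the sandbox's requests

    Sandbox(String id, int priority, long heapLimitBytes, long heapUsedBytes) {
        this.id = id;
        this.priority = priority;
        this.heapLimitBytes = heapLimitBytes;
        this.heapUsedBytes = heapUsedBytes;
    }

    boolean isOverLimit() {
        return heapUsedBytes > heapLimitBytes;
    }
}

final class CancellationPlanner {
    // Returns sandbox ids in the order their queries should be considered for cancellation.
    // Outside duress only sandboxes over their own limit qualify; in duress every sandbox
    // qualifies, and the lowest-priority ones come first.
    static List<String> cancellationOrder(List<Sandbox> sandboxes, boolean nodeInDuress) {
        List<Sandbox> candidates = new ArrayList<>();
        for (Sandbox s : sandboxes) {
            if (nodeInDuress || s.isOverLimit()) {
                candidates.add(s);
            }
        }
        candidates.sort(Comparator.comparingInt((Sandbox s) -> s.priority));
        List<String> ids = new ArrayList<>();
        for (Sandbox s : candidates) {
            ids.add(s.id);
        }
        return ids;
    }
}
```

Keeping the planner a pure function of the tracked state makes the ordering easy to test independently of how usage is collected or how cancellation is actually delivered to the running queries.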

We are planning to start with a reactive mechanism, i.e. track and cancel in case of contention or threshold breaches. Going forward, we want to build a robust search query cost estimation framework so that the majority of such search queries can be cancelled upfront.

Future Improvements

Hard cancellation - This feature depends on hard cancellation to be highly effective, since the most it can do today is hint towards cancellation.
Search Query Cost Estimation - This component will help us estimate the resource usage of search queries, which can help in rejecting search queries upfront based on estimated cost.
Async Completion of cancellable queries - We can punt these queries to a sandbox specific to async queries, where they can complete at some later point in time.

getsaurabh02 commented 11 months ago

Thanks @kaushalmahi12 for proposing this. This is going to be super useful for maintaining the resiliency of clusters, especially the large ones with multiple tenants or users.

I like the idea of a construct called Sandbox that is attribute based. However, instead of making it generic across all users, can we introduce a concept of user-account/tenant-id which admins can use to define/configure and have multiple sandboxing configurations attached to them. That way, when queries are passed with a user-account/tenant-id, they automatically map to one of the sandboxing configurations and the node limits are applied to them dynamically.

It will then allow system/cluster admins to create/maintain multiple sandboxing configurations and map them to the group of internal users (or tenants). This can then further be integrated with the Security Plugin in near future to associate these ids with user-roles and cluster permissions, making it a more concrete construct. Thoughts?

This will also provide common extension points for associating these user-account/tenant-ids with Top-N queries, which is focused on extended visibility into individual queries, while allowing admins to map rogue/slow queries back to users.

reta commented 11 months ago

I think this idea was discussed in https://github.com/opensearch-project/OpenSearch/issues/8879

kaushalmahi12 commented 11 months ago

@getsaurabh02 Thanks for your suggestions!

However, instead of making it generic across all users, can we introduce a concept of user-account/tenant-id which admins can use to define/configure and have multiple sandboxing configurations attached to them.

What I meant to convey is that these attributes should be available across all OpenSearch clusters, since they will be part of either authN/authZ or the request attributes. Within the cluster we can always define new entities and associate them with sandboxes. Does this make sense, or am I missing something?

It will then allow system/cluster admins to create/maintain multiple sandboxing configurations and map them to the group of internal users (or tenants). This can then further be integrated with the Security Plugin in near future to associate these ids with user-roles and cluster permissions, making it a more concrete construct. Thoughts?

I agree. Since the feature inherently limits access to resources, the Security Plugin can help provide a more concrete and robust mechanism for resource access.

ansjcy commented 8 months ago

@kaushalmahi12 Thanks for the proposal! As @getsaurabh02 mentioned, I think that with the combination of Query Sandboxing and Top N Queries, OpenSearch admin users can potentially have both better control over and better visibility into rogue queries by tenant.

Regarding getting the resource usage data from /proc, the performance analyzer plugin already has that data. Have you explored whether it is possible to associate this data with specific queries?
