opensearch-project / OpenSearch

🔎 Open source distributed and RESTful search engine.
https://opensearch.org/docs/latest/opensearch/index/
Apache License 2.0
9.51k stars 1.75k forks

[Meta] Plugin sandboxing: Step towards modular architecture in OpenSearch #1422

Open saratvemulapalli opened 2 years ago

saratvemulapalli commented 2 years ago

Problem

The plugin architecture enables extending the core features of OpenSearch, and several kinds of plugins are supported. However, the architecture has significant problems for OpenSearch customers. Mainly, plugins can fatally impact the cluster: for example, critical workloads such as ingestion and search traffic can be disrupted because a non-critical plugin such as s3-repository failed with an exception. The problem compounds when we want to run arbitrary plugins, because OpenSearch core and system resources are not protected well enough.

Zooming in technically, plugins run within the same process as OpenSearch. As the OpenSearch process bootstraps, it initializes PluginService.java via Node.java. All plugins are class-loaded via loadPlugin during the bootstrap of PluginService, which looks in the plugins directory and loads the classpath where each plugin jar and its dependencies are already present. During the bootstrap, each plugin is initialized, and plugins have various interfaces through which they can subscribe to state changes within the cluster, e.g., ClusterService.java.
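The directory scan and class loading described here can be sketched roughly as follows. This is a minimal illustration and not the actual PluginService code: the class and method names (PluginClasspathSketch, jarUrls, loaderFor) are hypothetical, and the real loader does considerably more than collect jars.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names are illustrative, not OpenSearch APIs.
class PluginClasspathSketch {

    // Collect every *.jar directly inside one plugin directory.
    static List<URL> jarUrls(Path pluginDir) throws IOException {
        List<URL> urls = new ArrayList<>();
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(pluginDir, "*.jar")) {
            for (Path jar : jars) {
                urls.add(jar.toUri().toURL());
            }
        }
        return urls;
    }

    // Build a class loader over those jars. Classes loaded through it live in
    // the same JVM as the caller, which is the lack of isolation the issue
    // describes.
    static URLClassLoader loaderFor(Path pluginDir, ClassLoader parent) throws IOException {
        return new URLClassLoader(jarUrls(pluginDir).toArray(new URL[0]), parent);
    }
}
```

Because the resulting loader shares the host JVM, an uncaught error or resource exhaustion in plugin code affects the same process that serves search and indexing.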

Resources on the system for plugins in OpenSearch are managed via the Java Security Manager, which is initialized during the bootstrap of the OpenSearch process. Each plugin defines a security.policy file, e.g., the Anomaly Detection plugin.

As we can see, plugins are loaded into the OpenSearch process, which fundamentally needs to change.

Objective

This feature enables any plugin to run safely without impacting the cluster and the system.

Design

PLEASE NOTE: THIS DOCUMENT IS WORK IN PROGRESS AND DOES NOT REPRESENT THE FINAL DESIGN.

Figure: Plugin sandboxing today

Requirements

TBD (define what we would like to accomplish and what is not changing in the system).

The high-level idea behind plugin sandboxing is to isolate plugin interactions with OpenSearch. All plugin interactions go through extension points. If we can modularize these extension points, I believe we can achieve isolation for plugins.

Proposal

Figure: Plugin sandboxing, new world

Plugins run within the OpenSearch process today. We are proposing running plugins through (thanks to dblock@):

We see value in offering an option to run a plugin in different parts of the system. Some plugins would like to run within the process (like searching and indexing), some in an independent process (like a snapshot repository), and some on a remote node (like machine learning).

We will build a new Plugins Orchestrator, which will facilitate running plugins in all three ways. New interfaces will be defined to establish communication between extensions and OpenSearch.
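One way to picture an orchestrator that supports all three placements is sketched below. Everything here is an assumption made for illustration: RunMode, ExtensionLauncher, and PluginsOrchestratorSketch are hypothetical names, and the launchers only return a description of what a real implementation would do instead of starting anything.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

// The three placements named in the proposal.
enum RunMode { IN_PROCESS, SEPARATE_PROCESS, REMOTE_NODE }

// Hypothetical launcher abstraction: one implementation per placement.
interface ExtensionLauncher {
    String launch(String extensionName);
}

// Hypothetical orchestrator: maps each extension to a placement and
// delegates startup to the launcher registered for that placement.
class PluginsOrchestratorSketch {
    private final Map<RunMode, ExtensionLauncher> launchers = new EnumMap<>(RunMode.class);
    private final Map<String, RunMode> placements = new HashMap<>();

    PluginsOrchestratorSketch() {
        // Stub launchers that only describe the action they stand for.
        launchers.put(RunMode.IN_PROCESS, name -> "loaded " + name + " into the OpenSearch JVM");
        launchers.put(RunMode.SEPARATE_PROCESS, name -> "started " + name + " as a separate local process");
        launchers.put(RunMode.REMOTE_NODE, name -> "connected to " + name + " on a remote node");
    }

    void register(String extensionName, RunMode mode) {
        placements.put(extensionName, mode);
    }

    String start(String extensionName) {
        RunMode mode = placements.get(extensionName);
        if (mode == null) {
            throw new IllegalArgumentException("unknown extension: " + extensionName);
        }
        return launchers.get(mode).launch(extensionName);
    }
}
```

Keeping the placement decision in one table means an extension could in principle move from in-process to out-of-process without changing the code that calls start.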

Proof of Concept

To explore this idea more, we would like to have a plugin running in an independent process.

Tracking Issues

Learn and Share:

Milestones:

Meta: https://github.com/opensearch-project/OpenSearch/issues/1632

Back Burner:

FAQ

We are exploring the use of a lightweight form of Transport that supports bi-directional communication. Transport is the communication mechanism OpenSearch uses between nodes.
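To make the idea of a lightweight bi-directional channel concrete, here is a minimal sketch of length-prefixed message framing. It is not the OpenSearch transport wire format: the frame layout (request id, action name, payload length, payload) and the names Frame and FramedMessageSketch are assumptions made for this example. Either side can write frames using the same layout, and the request id lets a receiver match a response to its request.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical message: not an OpenSearch type.
final class Frame {
    final long requestId;
    final String action;
    final byte[] payload;

    Frame(long requestId, String action, byte[] payload) {
        this.requestId = requestId;
        this.action = action;
        this.payload = payload;
    }
}

// Hypothetical framing helper using only java.io.
class FramedMessageSketch {

    // Layout: 8-byte id, UTF action string, 4-byte payload length, payload.
    static void writeFrame(DataOutputStream out, Frame frame) throws IOException {
        out.writeLong(frame.requestId);
        out.writeUTF(frame.action);
        out.writeInt(frame.payload.length);
        out.write(frame.payload);
        out.flush();
    }

    // Reads exactly one frame written by writeFrame.
    static Frame readFrame(DataInputStream in) throws IOException {
        long requestId = in.readLong();
        String action = in.readUTF();
        byte[] payload = new byte[in.readInt()];
        in.readFully(payload);
        return new Frame(requestId, action, payload);
    }
}
```

In practice the streams would wrap a socket shared by OpenSearch and the extension. A production protocol would also need versioning, payload size limits, and error frames, which this sketch leaves out.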

AD Extension with create detector functionality latency: https://github.com/opensearch-project/opensearch-sdk-java/issues/24#issuecomment-1309547639

AD plugin latency: https://github.com/opensearch-project/opensearch-sdk-java/issues/24#issuecomment-1309588329

With an example extension point onIndicesModule(), we see a latency increase of about 8-11% depending on workload, and the throughput decrease is between 0.05% and 7%.

https://github.com/opensearch-project/OpenSearch/issues/2231

3012

  • Would the extensions framework offer both methods for extensions, i.e., same process and another process, as alternatives?

Now that we have the latency numbers, we see there is value in running plugins in process, and we will continue to support it for critical workloads in the query and indexing paths.

We are working towards OpenSearch 3.0 having the initial framework to support extensions, and towards releasing the first version of the SDK with support for the default extension points.

We are working on the anomaly detector backend plugin as a prototype, running it as an extension. https://github.com/opensearch-project/OpenSearch/issues/5224

pjfitzgibbons commented 2 years ago

Could you tell us - what is the plan for "updating" a plugin? Assuming plugin-X is not fully complete on first release, how can users of the plugin update that plugin on their own instance of Core and Dashboard?

dblock commented 2 years ago

@pjfitzgibbons Extensions will work like in VSCode or any other sane system, where they will declare a minimum (and sometimes a max) version of OpenSearch required. Then you'll be able to upgrade them to a newer release assuming it's compatible with your current version of OpenSearch at runtime, without restarting a cluster. Does this answer your question?
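A compatibility check of this kind could look roughly like the sketch below. It is only an illustration under stated assumptions: ExtensionCompatibilitySketch and its methods are hypothetical names, versions are assumed to be purely numeric dotted strings (no qualifiers such as -SNAPSHOT), and a null maximum is taken to mean "no upper bound".

```java
// Hypothetical helper: not part of OpenSearch or its SDK.
class ExtensionCompatibilitySketch {

    // Compare dotted numeric versions component by component;
    // missing components count as 0, so "2.4" equals "2.4.0".
    static int compareVersions(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    // True when min <= openSearchVersion <= max (max may be null for "no upper bound").
    static boolean isCompatible(String openSearchVersion, String minVersion, String maxVersion) {
        if (compareVersions(openSearchVersion, minVersion) < 0) {
            return false;
        }
        return maxVersion == null || compareVersions(openSearchVersion, maxVersion) <= 0;
    }
}
```

For example, an extension declaring a minimum of 2.0.0 and no maximum would be accepted on 2.4.0 and rejected on 1.3.0.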

pjfitzgibbons commented 2 years ago

@dblock Yes, understood. Is there a specific task above that you believe implicitly includes upgrading functionality (or version detection or ... ?)

owaiskazi19 commented 2 years ago


Hey @pjfitzgibbons! You can find more details on API Versioning here: https://github.com/opensearch-project/OpenSearch/issues/2447