Improve Kibana dependency injection system

pgayvallet commented 1 year ago

Problem statement

The current dependency injection system is based on the concept that the only level of granularity is plugins: plugins depend on plugins. We also have two types of dependencies: required (which throw an error if not found) and optional (which would just not inject the dependency).

If that made sense when the DI system was (re)designed, 5 years ago, it has a lot of caveats and limitations with the way developers are working with plugins today.

The biggest limitations and problems we identified are:

1. DI should be on based on services, not on whole plugins

(yes, "services" would be a new concept that doesn't really exist as a first-class citizen in Kibana yet, and we would need to more properly define it, however given it's commonly used in every DI/architecture, you probably get the overall idea)

Having the DI performed at the plugin level (as opposed to a lower level such as services) plays fairly badly with cross/shared plugin functionalities. If a plugin (A) needs to leverage a functionality from another plugin (B) that was already depending on (A), a cyclic dependency occurs, which can only be solved by refactoring the codebase to either 1. merge (A) and (B) or 2. (more common) split either (or both) of (A) and (B) into smaller components to break the cyclic dependency.

To clarify, we still strongly think that cyclic dependencies ARE design flaws. Those should be avoided, and we want a way to detect them before runtime. However, our system should not consider dependencies between distinct and isolated sub-systems as cyclic dependencies.

In other words, this should be allowed:

Screenshot 2023-09-21 at 20 32 45

And not be considered as a cyclic dependency as it is today:

Screenshot 2023-09-22 at 09 40 26

We think that this would be a better granularity level for DI, and would allow plugins to only represent functional blocks (which was what plugins were supposed to be initially), and not be an artificial smaller component driven by the need to avoid cyclic dependency between two functional areas.

Note that the intent is not to allow cyclic dependencies, so things like this would still be forbidden:

Screenshot 2023-09-22 at 10 41 08

2. Dependency injection is not consistent atm

We effectively have two distinct DI systems:

The plugin-to-plugin dependency system
The request handler context ("context providers") DI system

This feels wrong, and ideally dependency injection should be done using the same system regardless of where the injection needs to be performed.

Note that this separation was caused by different factors:

plugin DI is based on injecting setup and start contracts, concept that are not present for request handler DI: handler contexts are actually closer to what "services" would be.
request handler services are also by design scoped to the request (similar to the Request scope of DI systems such as Spring), where plugin DI is based on actual singletons (the contracts for each services are singletons and not scoped to anything - request, user or so on)

3. Dependency injection based on lifecycle was (probably) a bad idea

Atm plugins dependency injection is tightly coupled to their lifecycles.

In the current system, you inject either/both the setup and start contract of the plugins you depend on to the equivalent lifecycle phase of your plugin. This is a fairly uncommon way of doing DI and, if that may work with plugins, it's rather awkward with services (which also makes the system not being usable elsewhere, e.g. for request handler context, or for service injection if we wanted to introduce the concept).

The coreSetup.getStartServices work-around is a perfect demonstration of how the lifecycle-based injection is flawed in its design: There are sub-systems that are created/registered during setup, and yet that will be functioning during start and therefore need access to the start contracts.

Ideally, we would get closer to a more "traditional" DI system, injecting service instances (not lifecycle contract) and attaching to lifecycle via hooks (e.g. some @OnStart decorator at the service level, or some onStart(DIHandler) at the plugin level. (and ideally long term, we should think of fully removing lifecycles altogether).

4. The DI system doesn't have any concept of "scopes"

As said in points 2., atm the two systems don't have any concept of scopes

plugin injection is only allowing the global scope by design
request handler context injection is only allowing the request scope, also by design

Looking ahead of us (yes, I'm thinking of multitenancy), we know that we will likely need a proper scoped DI system, to have per-user/tenant service instances.

Adapting the current system to support scoping capabilities doesn't seem feasible.

Requirements of the new system

These are the features and capabilities that were identified as mandatory for the new DI system:

Supporting all DI needs across the platform

The replacement DI should be able to replace / supersede efficiently the two parts of the current DI system:

The plugin-to-plugin contract injection system
The request handler context (which serves as a per-request scoped system)

Re-usable

The DI system should be usable directly for teams to serve their own injection context needs (which is what the context handler DI was supposed to provide, even if in the end, the injector isn't even directly exposed to plugins anymore)

Lazy dependency creation

We can't afford, performance-wise, to instantiate everything eagerly, especially for request handler dependencies injection / low level scopes such as user or request.

The DI system should support lazy load of dependencies, one way or another.

Service scoping

To prepare for multitenancy, we need to be able to leverage the concept of scopes for our injection contexts.

global

The service is globally scoped. It's a singleton and the only instance that will always be used when the dependency is requested.

tenant

The service is scoped to a given tenant. A different instance will be used for each tenant. An example of a tenant-scoped service could be the internal ES client (given each tenant connects to a different ES cluster)

user

The service is scoped to a given user (and implicitly to a given tenant). A different instance will be used for each user. An example of a user-scoped service could be the scoped ES client (given each client connects with their own credentials)

request

The service is scoped to a given request (and implicitly to a given user and tenant). Not sure how useful it would be tbh. Maybe not necessary if we have tenant and user? Are there really scenarios where we would need per-request service creation/disposal?

note: on the client side, all services would be using the global scope given that the browser-side code is single user/tenant by nature

Garbage collection

Necessary because of the dynamic scopes (user/tenant/request). The system must have a mechanism to dispose of unused dependency graphs (e.g. dispose of user scoped services after a given amount of time since last use, something LRU-based or similar)

Service visibility

(optional probably) Ideally, we should also be able to switch the visibility of each service.

plugin

Only injectable from within the plugin that defined the service. Useful for internal services, e.g. ones only used by some of the plugin's request handlers.

pluginGroup

(Note: this would be a new concept - not sure if really necessary) Only injectable from within plugins of the same plugin group than the plugin that defined the service. Useful to only share services across same-domain plugins (but again, with a better DI system solving the unnecessary cyclic dependencies, do we need it?)

global

(Likely the default) Injectable regardless of the context

Dynamic DI context creation

For background task executions (and any other code execution not directly attached to a request handler), we need to be able to create the DI context to use for a given code execution. This would be necessary to run tasks on behalf of a given user and/or tenant for example.

A (naive) example of the need would be:

runTask(user, tenant) { 
   // create the DI context attached to the provided user and tenant (note: in practice, the tenant would probably be inferred from the user)
   const context = core.di.createContext({ user, tenant });
   // retrieve the esClient service scoped to the created context
   const esClient = core.di.asScoped(context).getService('esClient');
   // ... do something with the scoped ES client
}

Easy implementation switch

Ideally, we should prepare to the possibility of splitting the Kibana monolith into a smaller service. For this, we think it would be great if we could think of the possibility to easily switch from local code execution to remote API calls.

We're not thinking of automagically wrapping dependencies around remote call wrappers (we will likely never be able to do this given the server-side runtime we're using), but just to be able to switch one service's implementation easily in the system.

E.g it would be great if the system could help us easily switch from

This on prem:

Screenshot 2023-09-22 at 10 31 16

To this for serverless:

Screenshot 2023-09-22 at 10 31 22

Work well (enough) with the current system

The transition will take time (years). We need a way for the two systems to cohabitate.

As we did for the KP migration, we could have it in a way that the old DI can see the new DI service but not the other way around.

We need to discuss more about this point once we're aligned on the overall design.

API proposal

TBD later. Ideally something sexy, developer-friendly, and powerful (famous last words I know). Decorator-based if possible (even if decorators are so terribly less powerful than annotations for that kind of feature)?

Open questions

What's the priority on this?

We need to define the priority on this.

Note that this is a requirement for multitenancy Kibana, so it needs to be taken into account.

How do we manage the cohabitation with the existing system?

We can't afford to do a big bang for that kind of low-level change impacting every part of the Kibana codebase. We need a well-defined strategy on how the current DI system and the new one would be able to cohabitate during the transition (and we're talking years here).

How many changes in security/spaces will be required for the `user`/ `tenant` / `request` scopes?

Having these scopes means that the user, user principal, tenant, and tenant ID are first-class citizens of Core. How many architectural changes do we need to have that information at Core's DI level?

Do we even have the concept/information of tenantId anywhere in the platform atm?

elasticmachine commented 1 year ago

Pinging @elastic/kibana-core (Team:Core)

vadimkibana commented 1 year ago

cc @dokmic as he had ideas and PoC done on dependency injection in Kibana

jloleysens commented 1 year ago

Easy implementation switch...

I'm not sure we want to put this responsibility on the DI system. IMO the biggest responsibilities should be:

eliminating or greatly reducing refactoring efforts when changing dependencies
facilitate unlocking value across Kibana
make it hard to write spaghetti code

This is bc I think the kinds of "splits" for serverless Kibana we want to make are more along the lines of node roles and maybe a small number of specialised (micro)services. I'm not sure we'd want plugins talking to each over the wire in RPC style that doesn't involve going through ES in some way.

Just wanted to share my 2c on overall architecture convo 🙈 :

An issue in Kibana is that we have the idea of "services" and "apps" in one container: the plugin. By "apps" I mean code not intended for reuse (highly domain-specific business logic, non-reusable front end rendering etc). Apps should not depend on each other. Services (or stateful modules), OTOH, could depend on any other service, and apps could depend on any service. This might mean we have foo "app" AND foo "service", but I think that's OK: they encapsulate different things.

I think this is what you (@pgayvallet) are already getting at, but just wanted to add the "app" v. "service" distinction to get your thoughts.

jasonrhodes commented 1 year ago

Thanks for writing this up, @pgayvallet! I'm excited to collaborate on a solution.

elastic / kibana