elastic / cloud-on-k8s

Elastic Cloud on Kubernetes

Ability to store References in Remote Operators for Reuse #5318

Open BenB196 opened 2 years ago

BenB196 commented 2 years ago

Proposal

I would like to propose the ability to store references to remote Elastic resources in remote operators, so that they can be reused within the different components of the stack.

Context

Currently, if an ECK operator manages an Elasticsearch cluster, Kibana, and/or Fleet server, the operator offers the ability to use references within other parts of the stack which it manages (elasticsearchRef, kibanaRef, fleetServerRef).

These refs are nice because they simplify the configuration of parts of the stack: only the operator needs to be aware of the connection details, and it handles the rest of the configuration.

Enhancement

I'd like remote ECK operators (which don't directly manage the above resources) to be configurable (probably via a Kubernetes secret of some sort) so that they can offer similar references to the things that they (the remote operators) manage.

Use Case

If I have Kubernetes Cluster K-A, which is home to an Elasticsearch cluster E-A, and is managed by Operator O-A, I can easily add other resources to connect to this cluster if they are managed by O-A using references.

However, if I have Kubernetes Cluster K-B, which is just a regular cluster with an operator O-B to manage components, and I want things like Beats/Elastic Agent/other stack components on it to connect to E-A, I need to manually configure every output of every component on cluster K-B so that it can talk to E-A.

I'd like to just be able to tell operator O-B, "Hey, here is this cluster E-A, and here are some credentials to connect to it so you can aid in managing it; use these for components you manage." Operator O-B would then be able to use references like elasticsearchRef within the components it manages, so that I don't need to manually configure things like outputs.

Note

I'm not exactly sure how feasible something like this would be, but I couldn't find any existing issues on the topic, so I figured I'd open this one.

pebrc commented 2 years ago

I wonder if the change proposed in https://github.com/elastic/cloud-on-k8s/pull/5240 would not also solve the use case you describe.

BenB196 commented 2 years ago

@pebrc looking over the PR + original ticket, something that isn't very clear to me is: would the secret that is defined be used by the remote ECK operator to provision API keys and such for the components to use (like the "local" ECK operator does today)? Or would I need to manually create the API keys and store them as secrets per use case?

If I want to manage beats with a remote ECK operator:

  1. Would I need a secret with the ability to create API keys and such,
  2. or would I need a secret that the beats could use directly for ingestion?

I'd prefer option 1 if possible, but I'm not sure if that is the intent of the PR.

thbkrkr commented 2 years ago

PR #5240 doesn't exactly address your proposal, but it will help to get there. It simplifies how to connect an external Elastic resource to n local Elastic resources. You just have to create one secret with the information needed to reach the external resource and then reference it in the refs of all the local associated resources. There is no longer any need to configure each output with all the details. However, there is no mechanism to automatically provision this secret. You have to manage it yourself (manually, or as an automated task in a CronJob). Note that using a certificate signed by a well-known CA will save you from updating the secret each time the CA is rotated.
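To make the "one secret, many refs" idea concrete, here is a minimal sketch that builds the two manifests as plain Python dicts and prints them as JSON, which kubectl accepts as well as YAML. It assumes the secret uses the keys `url`, `username`, `password`, and optionally `ca.crt`, and that the ref is set through a `secretName` field; check these against the ECK documentation for your operator version. The helper names, resource names, URL, credentials, and Beat version are all hypothetical, and a real Beat manifest also needs its `config` and a `deployment` or `daemonSet` section.

```python
import json


def external_es_secret(name, namespace, url, username, password, ca_crt=None):
    """Build a Secret manifest describing how to reach an external Elasticsearch.

    The key names (url, username, password, ca.crt) are an assumption about
    what the operator expects; verify them for your ECK version.
    """
    data = {"url": url, "username": username, "password": password}
    if ca_crt is not None:
        # Only needed when the certificate is not signed by a well-known CA.
        data["ca.crt"] = ca_crt
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "stringData": data,
    }


def beat_referencing(name, namespace, secret_name, beat_type="filebeat", version="8.5.0"):
    """Build a trimmed Beat manifest whose elasticsearchRef names the Secret.

    The version is a placeholder; config and deployment/daemonSet are omitted.
    """
    return {
        "apiVersion": "beat.k8s.elastic.co/v1beta1",
        "kind": "Beat",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "type": beat_type,
            "version": version,
            "elasticsearchRef": {"secretName": secret_name},
        },
    }


# Hypothetical values for cluster E-A as seen from cluster K-B.
secret = external_es_secret(
    "e-a-ref", "default", "https://e-a.example.com:9200", "beats_writer", "changeme"
)
beats = [beat_referencing(n, "default", "e-a-ref") for n in ("logs", "audit")]

for manifest in [secret, *beats]:
    print(json.dumps(manifest, indent=2))
```

Every Beat built this way points at the same secret, so rotating the credentials means updating one object rather than every output.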

I like your idea, though I wonder if this is in the scope of the operator or if it should be something performed by some kind of Secret Sync Operator.

barkbay commented 2 years ago

What is described here sounds like a request for a "federation" feature: a layer for coordinating multiple ECK deployments, potentially running in multiple Kubernetes clusters. Assuming ECK operators are aware of each other, it would then be possible to reference resources running in multiple distinct clusters.

It is an interesting but very challenging feature. Among many other things, it would require understanding how operators are interconnected, how services are accessible in different clusters, and how operators trust each other; it would also require a form of synchronization between them, I guess...

I'm afraid there is no way to do that easily at the moment. We can, however, keep this issue open to track the request.