k8ssandra / k8ssandra-operator

The Kubernetes operator for K8ssandra
https://k8ssandra.io/
Apache License 2.0

K8SSAND-1533 ⁃ Modular Secrets Backend #556

Open sseidman opened 2 years ago

sseidman commented 2 years ago

What is missing?

The only supported option for secret storage of authentication credentials is through Kubernetes Secrets. It would be great if there was modular support for different/external secret backends (e.g. Vault) that would allow for the current secret storage to work as is, but also provide alternative options if desired.

Why do we need it?

Kubernetes Secrets are unencrypted by default and carry a number of potential security risks. Introducing a modular secrets backend would let users configure a secret storage system that meets their security needs, while still providing the default out-of-the-box option.

Environment

All major cloud environments

K8ssandra Operator version:

`k8ssandra-operator:v1.1.1`

**Anything else we need to know?**

The interface would ideally be a drop-in interface within the ReconcileSecret() function (k8ssandra-operator/replicated.go at 013df82bd7e4f50c8ee733b0418a9f3807545055 · k8ssandra/k8ssandra-operator), which is called during K8ssandraCluster reconciliation from:

These user/pass secrets are mounted as environment variables within the medusa container and reaper container and therefore need to be mounted/injected from the external secret store. If the credentials need to be injected as something other than an environment variable (such as through mounted volume) these secrets should not be created:

CQL users are specified in the CassandraDatacenter config by secret reference, and encryption keys are likewise referenced by secret. The cass-operator would need to be aware of the configured secrets backend and where to retrieve the users/certs from instead.


mikelococo commented 2 years ago

FYI, we're still working a bit on honing this proposal. In particular I think there are some open questions around how to "mount" the secrets (i.e. provide access to them within each cassandra pod) without using k8s Secrets. My own intuition is that we would want to:

mikelococo commented 2 years ago

Does vault support "watch" APIs for secrets? I'm thinking of credential rotation and I believe that https://www.vaultproject.io/docs/secrets/databases/cassandra supports automatically rotating credentials. If a backend is doing automatic rotation, it may not be enough to fetch the secrets on operator startup, we may need to watch them... update them... and then possibly do a rolling restart on any pods that mount them?

Edit: It looks like as of 2018 Vault did not have the ability to watch a path (without interacting with the storage backend... which I don't think is viable here, as Vault itself has many modular backends and probably few organizations would want operators mucking about with direct access to them): https://github.com/hashicorp/vault/issues/616

Edit2: I could still imagine there being room for a watch-oriented API in the modular abstraction, even if some secret backends implement that watch via a periodic polling mechanism.

jsanda commented 2 years ago

Thanks for creating the issue :)

> Kubernetes Secrets are unencrypted and have a list of potential security risks.

Would it be possible to encrypt the secret contents and then provide the key to consumers of the secret in a secure way? (credit to @jeffbanks for the question 🙂 )

Can you give a high level explanation of how using credentials from an external provider works? I'm struggling to grok this as it's something I haven't done before. Will each container need to make a call to Vault to get the credentials and store them in environment variables?

> Replicate the "mounting" machinery that comes built-in to k8s secrets, presumably by doing something like mounting them to an appropriately permissions-restricted file and then using an init-container to load the contents of that file into environment variables.

Would those environment variables be visible to other containers?

Something else to keep in mind: all of the secrets under consideration are created in the control plane cluster and replicated to the data plane clusters. The secrets may be used in the local cluster, but they might also be used in remote clusters. By default, any secret created in the namespace that the operator is watching and that has the k8ssandra.io/cluster-name and k8ssandra.io/cluster-namespace labels will get replicated to data plane clusters. Check out secret_controller.go.
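For reference, a secret opted in to that replication machinery would look something like this. The two labels are the ones named above; the name, namespace, and credential values are placeholders:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: demo-superuser            # placeholder name
  namespace: k8ssandra-operator   # must be a namespace the operator watches
  labels:
    k8ssandra.io/cluster-name: demo
    k8ssandra.io/cluster-namespace: k8ssandra-operator
type: Opaque
stringData:
  username: demo-superuser
  password: changeme              # placeholder only
```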

sseidman commented 2 years ago

What we are proposing?

Why we’re proposing this

As mentioned in the original issue, Kubernetes Secrets do not provide a secure way of storing secrets that meets our current needs. As an organization, we require a secret storage backend that provides a fine-grained audit trail detailing all requests and responses to the system. Our secret storage system goes through a demanding security review and must meet strict compliance standards, which requires a substantial amount of operational overhead. We can avoid implementing the same security standards for Kubernetes Secrets if we do not use them as a secondary source of secret storage within our systems.

Additionally, we require consistent mechanisms and policies around credential rotation. It’s one thing to have these features in the operator, and many unopinionated orgs will appreciate that. But opinionated orgs will want to apply their “standard” rotation policies using their standard rotation mechanisms, and those mechanisms will integrate strongly with the previously mentioned auditability and compliance systems. As such, simply encrypting k8s Secrets doesn’t really move the needle on why we’re doing this. It also doesn’t solve the secret-storage problem: you’re left with a secret-encrypting-secret that has the same problems as the original handful of secrets.

What it looks like

Config option

Similar to the auth flag at the top level of the K8ssandraCluster spec, there would need to be an additional flag such as externally_injected_secrets, which would default to false. When false, the operator will create secrets for all superusers and replicate them across the different clusters just as it does now. When enabled, the operator will not create the secrets for any superuser and it will be the user’s responsibility to provide those secrets.
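On a K8ssandraCluster manifest, that might look roughly like this. The flag is only proposed here, not an existing field, and its exact name and casing are assumptions:

```yaml
apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: demo
spec:
  auth: true
  # Proposed flag (spelled externally_injected_secrets in this discussion);
  # hypothetical, not part of the current CRD. Defaults to false.
  externallyInjectedSecrets: true
```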

Where will the secrets be

Since the operator will not be creating the secrets, it should expect that the superuser credentials are already available to it and accessible as needed. This means there needs to be a default (or configurable) file location, such as /etc/cassandra/superuser_credentials, where the secrets are written on initialization and to which the operator has read access. The operator is then agnostic to where the secrets are retrieved from and only requires that they end up in that common location.

Reconciliation

Currently, if auth is enabled, the operators will generate superuser credentials, if they don’t already exist, and apply them as Kubernetes Secrets. The secret names are stored within the CassandraDatacenter spec as references that can be used to retrieve the secrets when actually applying the CQL commands to create the superusers. Instead, the operator will check the externally injected secrets configuration flag and, if enabled, look to read the secrets from the file. If the secrets do not exist or do not have the expected form within the file, the operator should return an error and re-queue the reconciliation task until the secrets file has been populated. If the secrets file is available, the operator should continue to set the CQL superuser password to match the provided credential string, as it does today.

Generation/Population

Since the credentials are expected to be mounted to the operator as a file, the user will require some mechanism to populate this file. By giving users access to init-containers within the operators, a user can use their own custom image that contains the necessary arguments and logic to authenticate and interact with their own custom secrets backend, with the stipulation that the retrieved secrets need to be mounted to the operator as a file in the location the operator is expecting and in the proper format within the file.

Downstream storage

The medusa and reaper pods both expect that the superuser credentials will be available locally as environment variables within each of the pods. This allows the credentials to be loaded into the application configuration at runtime. This should still be expected, and the user again will need to customize their init-containers so that the credentials are injected into the pod and populate the requisite environment variables. This is actually a little tricky and may require some additional thought. The proper way to do this with an init-container is to write the secrets to a script that exports the environment variables. The main container would then need an additional command to source the script created by the init-container so that the environment variables are populated. A possible alternative would be to mount the k8s secrets as a file to each of the containers instead of as environment variables. In this case, the application-level code would require some changes, since it would expect the credentials in a file instead of environment variables.
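A rough pod-level sketch of that init-container pattern. The fetcher image, the fetch-secret command, and the entrypoint path are all placeholders for the user's own secret-store tooling; only the write-script-then-source-it flow is the point here:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: reaper-example
spec:
  volumes:
    - name: creds
      emptyDir:
        medium: Memory   # keep credentials off disk
  initContainers:
    - name: fetch-credentials
      image: example.com/secret-fetcher:latest   # user-supplied image
      command: ["/bin/sh", "-c"]
      # Write a script that exports the credentials as env vars.
      # fetch-secret is a placeholder for the user's backend client.
      args:
        - |
          echo "export CASS_USER=$(fetch-secret superuser/username)" > /creds/env.sh
          echo "export CASS_PASS=$(fetch-secret superuser/password)" >> /creds/env.sh
      volumeMounts:
        - name: creds
          mountPath: /creds
  containers:
    - name: reaper
      image: thelastpickle/cassandra-reaper:latest
      # The main container sources the script before starting, so the
      # credentials land in environment variables as the app expects.
      # The entrypoint path is illustrative.
      command: ["/bin/sh", "-c", ". /creds/env.sh && exec /usr/local/bin/entrypoint"]
      volumeMounts:
        - name: creds
          mountPath: /creds
```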

Validation

The validation steps that check for the existence of secrets (cass-operator/handler.go at 942bd735ca59dabc95d3169787ceaeba8d415cc1 · k8ssandra/cass-operator) will need to be disabled so that the CRD specs are considered valid and do not prevent the reconciliation loops from completing successfully.

Known constraints

When secrets are disabled in the configuration, the user will have to take responsibility for replicating the secrets across the Kubernetes clusters. Since the user will be using their own logic to inject the credentials into the pod, they’ll also have the responsibility of provisioning and replicating those secrets as necessary within their secrets management system before deploying the k8ssandra-operator and creating a K8ssandraCluster. Additionally, whatever file/env-var interfaces we expose become public config interfaces and shouldn’t churn unnecessarily across k8ssandra versions.

Additional Proposals

A modular secrets backend was our original intention with this issue, but the implementation of such a solution quickly gets complex and would result in additional configurations for the operator, additional backends to support, and potentially leaky abstractions. This proposal would require the operator to gain the ability to fetch/set secrets in other storage backends. The supported backends would also need to support some form of replication and therefore the operator would need access to replicate those secrets within the storage system. Finally, there would need to be a supported way to mount/inject the secrets that could be used as a drop-in replacement to the current mechanism of mounting the secrets as environment variables without relying on Kubernetes Secrets to do so.

Instead, allowing the user to selectively disable the creation of secrets moves the responsibility of provisioning credentials and injecting them into the pod from the operator to the user. This lets a user use their external backend of choice and only requires some mechanism to inject those credentials into the operator and dependent pods.

mikelococo commented 2 years ago

There was some good discord chat about this, which I'll briefly summarize here...

mikelococo commented 2 years ago

This has evolved into a proposal documented at https://docs.google.com/document/d/1zSwkWhylXMk7mDmjkq4ArvNmwDRs-CEjRuYu42KcXg8/edit.

The crux of the proposal is that we'll introduce the notion of an "internal secrets provider" that retains all our current secrets-management behaviors, and an "external secrets provider" with a much higher barrier to entry (requiring you to handle your own secrets creation/rotation, and to use the vault-agent or create your own mutating webhook to inject secrets from whatever enterprise secret store you're using). The external provider leverages Kubernetes dynamic-admission-control/mutating webhooks to inject the secrets into the containers that need them in a way that is mostly transparent to the operator, which just needs to annotate the various containers with metadata that the mutating webhook can read to know what to inject. This is all significantly inspired by the HashiCorp Vault Agent for Kubernetes.
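As a concrete reference point, the Vault Agent injector works through pod annotations roughly like these. The Vault role name and secret path below are illustrative, not something this proposal defines:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: reaper-example
  annotations:
    vault.hashicorp.com/agent-inject: "true"
    vault.hashicorp.com/role: "k8ssandra"   # illustrative Vault role
    # Renders the secret at this path to a file inside the pod.
    vault.hashicorp.com/agent-inject-secret-superuser: "secret/data/k8ssandra/superuser"
spec:
  containers:
    - name: reaper
      image: thelastpickle/cassandra-reaper:latest
```

Under this model the operator's job shrinks to writing equivalent annotations onto the containers it manages and letting the webhook do the injection.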

This will involve changes to k8ssandra, reaper, medusa, and possibly cass-operator, and there are probably some small design/interface details to be worked out as we start learning from implementation, but we have the broad shape of the work planned out.

Some of the folks working on this will be first-time contributors, but all quite familiar with Cassandra and Kubernetes so hopefully they'll be able to make good independent progress.

adejanovski commented 2 years ago

Thanks for the update @mikelococo !

FTR, work on this will be tracked in this epic.