Status	Open for comments 💬
Author(s)	@iameskild
Date Created	2023-01-15
Date Last updated	2023-02-06
Decision deadline	2023-02-13

Summary

See relevant discussion:

https://github.com/nebari-dev/nebari/issues/787
- SOPS discussion starting at https://github.com/nebari-dev/nebari/issues/787#issuecomment-1370480743
https://github.com/nebari-dev/nebari/issues/1547

Design Proposal

SOPS is a command-line tool for encrypting and decrypting secrets on your local machine.

In the context of Nebari, SOPS can potentially solve the following high-level issues:

- allow Nebari administrators to manage sensitive secrets
  - this includes the ability to store these secrets in git as part of a GitOps workflow
- create (shared) Kubernetes secrets that can be mounted to JupyterLab pods and other Kubernetes resources
  - this requires some additional work but should be worth the effort

Workflow

Starting point: a Nebari admin has a new secret that some of their users may need (such as credentials for an external data source). They have the appropriate cloud credentials available.

1. Generate a KMS (or PGP) key - only needs to be performed once
2. Encrypt the secret locally
3. Add the encrypted secret to the Nebari infrastructure folder
4. Redeploy Nebari in order to create Kubernetes secrets and associate those secrets with the resources that need them

Handling secrets locally

Items 1 and 2 from the workflow outlined above can be performed directly using the cloud provider CLI (aws kms create-key) and the SOPS CLI (sops --encrypt <file>).

To make it easier for Nebari admins, I propose we add a new CLI command, nebari secret, to handle items 1 and 2. This might look something like:

# requires cloud credentials
nebari secret create-kms-key -c nebari-config.yaml --name <kms-name>

This command would call the cloud provider API and generate the necessary KMS key. In the process, this command could also generate the .sops.yaml configuration file to store the KMS key and creation_rules.
It looks like SOPS doesn't have support for a DO KMS (or DO doesn't have a KMS product?), so DO deployments will likely need to rely on another method such as PGP / age keys.
Local deployments should also rely on PGP / age keys.
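
As a rough illustration of what the generated .sops.yaml could contain, here is a minimal Python sketch. The `creation_rules`, `path_regex`, `kms` and `age` keys belong to the SOPS configuration format; the helper name `render_sops_config`, the path pattern and any key identifiers passed in are assumptions made only for this example and are not part of Nebari.

```python
from typing import Optional


def render_sops_config(
    path_regex: str,
    kms_arn: Optional[str] = None,
    age_recipient: Optional[str] = None,
) -> str:
    """Render a single-rule .sops.yaml as text (hypothetical helper).

    Exactly one key source must be given: a cloud KMS key for cloud
    deployments, or an age recipient for DO and local deployments.
    """
    if (kms_arn is None) == (age_recipient is None):
        raise ValueError("provide exactly one of kms_arn or age_recipient")
    key_line = f"kms: '{kms_arn}'" if kms_arn else f"age: '{age_recipient}'"
    # Single-quoted YAML scalars keep backslashes in the regex literal.
    return (
        "creation_rules:\n"
        f"  - path_regex: '{path_regex}'\n"
        f"    {key_line}\n"
    )
```

Calling `render_sops_config(r"secrets\.yaml$", kms_arn=<key-arn>)` would produce a file that tells SOPS which key to use for secrets.yaml, so admins would not need to pass key flags on every invocation.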

# encrypt secrets stored as a file
nebari secret encrypt --name <secret-name> --file <path/to/file>
# or from a literal string
nebari secret encrypt --name <secret-name> --literal <tOkeN>

# a decrypt command can be included as well
nebari secret decrypt --name <secret-name>

The encrypt command encrypts the secret and stores the encrypted secret in the designated location in the repo (./secrets.yaml).
The decrypt command decrypts the secret and prints it to stdout.
Anyone performing this command on their local machine must have a cloud user that can use that KMS key.
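
A minimal sketch of how the proposed encrypt and decrypt subcommands might wrap the SOPS CLI is shown below. It assumes sops is installed and that a .sops.yaml (or equivalent key configuration) is already in place; `sops_command` and `run_sops` are hypothetical names, and only the `--encrypt` and `--decrypt` flags of the real CLI are used.

```python
import subprocess
from pathlib import Path
from typing import List


def sops_command(action: str, path: Path) -> List[str]:
    """Build the sops argv for 'encrypt' or 'decrypt' (hypothetical helper)."""
    if action not in ("encrypt", "decrypt"):
        raise ValueError(f"unsupported action: {action}")
    return ["sops", f"--{action}", str(path)]


def run_sops(action: str, path: Path) -> str:
    """Run sops and return what it writes to stdout.

    Raises subprocess.CalledProcessError if sops fails, for example when
    the caller's cloud user is not allowed to use the KMS key.
    """
    result = subprocess.run(
        sops_command(action, path),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Keeping the wrapper this thin would also make it straightforward to document the equivalent raw sops commands for admins who prefer not to go through the Nebari CLI.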

Include these secrets in the Nebari cluster

Items 3 and 4 from the workflow outlined above refer to how these secrets get included in the Nebari cluster so that they can be used by those who need them.

There is a SOPS Terraform provider that can decrypt these encrypted secrets during the deployment. To grab these secrets and use them, we can create a secrets module in stages/07-kubernetes-services that returns the output (i.e. the secret), which can then be used to create kubernetes_secrets as follows:

1. Read/decrypt the data from secrets.yaml:

data "sops_file" "secrets" {
  source_file = "/path/to/secrets.yaml"
}

output "my-password" {
  # the reference must match the data source name declared above
  value     = data.sops_file.secrets.data["password"]
  sensitive = true
}


2. Consume the above output to create a Kubernetes secret (in the parent module):

resource "kubernetes_secret" "k8s-secret" {
  metadata {
    name = "sops-demo-secret"
  }

  data = {
    password = module.sops.my-password
  }
}


At this point, the kubernetes secrets exist (encoded, NOT encrypted) on the Nebari cluster.
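
To make the "encoded, NOT encrypted" point concrete: Kubernetes stores Secret values base64-encoded, which anyone with read access to the Secret object can reverse. A minimal sketch, using a made-up password value:

```python
import base64

# Value as it would appear under `data:` when the Secret object is read back
stored = base64.b64encode(b"s3cr3t-password").decode("ascii")

# Anyone who can read the Secret object can reverse the encoding without a key
recovered = base64.b64decode(stored).decode("utf-8")
```

This is why access to the secrets on the cluster still has to be restricted through Kubernetes RBAC, independently of how they are encrypted in git.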

#### Including the secrets in the user's environment

Including secrets in the [KubeSpawner's `c.extra_pod_config`](https://jupyterhub-kubespawner.readthedocs.io/en/latest/spawner.html#extra_pod_config:~:text=label%2Dnames.-,extra_pod_config,-c.KubeSpawner.extra_pod_config) (in [`03-profiles.py`](https://github.com/nebari-dev/nebari/blob/develop/nebari/template/stages/07-kubernetes-services/modules/kubernetes/services/jupyterhub/files/jupyterhub/03-profiles.py)) will allow us to mount those secrets to the user's JupyterLab pod, thereby making them usable by the users who need them.

c.extra_pod_config = {
    # as environment variables
    "containers": [
        {"env": []},
    ],
    # to pull images from private registries
    "image_pull_secrets": [],
    # as mounted files
    "volumes": [
        {"secret": {}},
    ],
}


How these secrets are configured on the pod (as a file, env var, etc.), and which Keycloak groups have access to these secrets (if we want to add some basic "role-based access"), can be configured in the `nebari-config.yaml`.

Something like this:

secrets:
  - name:
    type: file
    keycloak_group_access:
      - admin
  - name:
    type: image_pull_secret
    ...

To accomplish this, we will need to add another callable that is used in the `c.kube_spawner_overrides` in `03-profiles.py:render_profiles`.
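
The callable described above could look roughly like the following sketch. It is not existing Nebari code: the function name `secret_overrides`, the `env` secret type, the `/etc/secrets/<name>` mount path and the deny-by-default behaviour when `keycloak_group_access` is missing are all assumptions. The returned keys (`extra_container_config`, `volumes`, `volume_mounts`, `image_pull_secrets`) are KubeSpawner options, but how they would be merged into the existing profile overrides is left open.

```python
from typing import Any, Dict, Iterable, List


def secret_overrides(
    secrets: List[Dict[str, Any]], user_groups: Iterable[str]
) -> Dict[str, Any]:
    """Build spawner overrides for the secrets a user's groups may access."""
    groups = set(user_groups)
    env_from, volumes, mounts, pull_secrets = [], [], [], []
    for secret in secrets:
        allowed = set(secret.get("keycloak_group_access", []))
        if not groups & allowed:
            continue  # user is in none of the groups granted access
        name, kind = secret["name"], secret["type"]
        if kind == "env":
            # expose every key of the Kubernetes secret as an env var
            env_from.append({"secretRef": {"name": name}})
        elif kind == "file":
            volumes.append({"name": name, "secret": {"secretName": name}})
            mounts.append(
                {"name": name, "mountPath": f"/etc/secrets/{name}", "readOnly": True}
            )
        elif kind == "image_pull_secret":
            pull_secrets.append(name)
        else:
            raise ValueError(f"unknown secret type: {kind}")
    overrides: Dict[str, Any] = {}
    if env_from:
        overrides["extra_container_config"] = {"envFrom": env_from}
    if volumes:
        overrides["volumes"] = volumes
        overrides["volume_mounts"] = mounts
    if pull_secrets:
        overrides["image_pull_secrets"] = pull_secrets
    return overrides
```

Mounting file secrets read-only keeps users from modifying them inside the pod, and a user whose groups match no entry simply gets an empty override.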

Alternatives or approaches considered (if any)

There are many specifics that can be modified, such as how users are granted access or how the secrets are consumed by the deployment.

As for a different usage of SOPS, I can think of one more. That would be to create the kubernetes secret from the encrypted file directly and then have the users decrypt the secret in their JupyterLab pod. This would eliminate the need for the sops-terraform-provider above.

It might be possible to create tiered secret files that are then associated with the Keycloak groups. This would introduce multiple KMS keys.

The question that's hard to answer then becomes how to safely and conveniently disperse the KMS key to those who need to access the secrets.

Best practices

User impact

Users gain access to the secrets they need in order to reach job-specific resources.

Unresolved questions

Given that SOPS is a GitOps tool, it's important to ensure that admins don't accidentally commit plain text secret files in their repos. Adding more strict filters in the .gitignore will help a little but there's always a chance for mistakes.
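
One possible safeguard beyond .gitignore is a pre-commit style check that refuses secrets files that do not look SOPS-encrypted. The sketch below is a heuristic only: it relies on SOPS-encrypted YAML carrying a top-level `sops:` metadata block and `ENC[AES256_GCM,...]` values, and the function names are hypothetical.

```python
import re
import sys
from pathlib import Path
from typing import Iterable

# SOPS appends a top-level `sops:` metadata mapping to encrypted YAML files
SOPS_METADATA = re.compile(r"^sops:\s*$", re.MULTILINE)
# Encrypted values are wrapped as ENC[AES256_GCM,data:...,iv:...,tag:...]
ENCRYPTED_VALUE = re.compile(r"ENC\[AES256_GCM,")


def looks_sops_encrypted(text: str) -> bool:
    """Heuristic: True if the text has SOPS metadata and an encrypted value."""
    return bool(SOPS_METADATA.search(text) and ENCRYPTED_VALUE.search(text))


def check_files(paths: Iterable[str]) -> int:
    """Return 1 (and report on stderr) if any file appears unencrypted."""
    offenders = [p for p in paths if not looks_sops_encrypted(Path(p).read_text())]
    for path in offenders:
        print(f"refusing to commit unencrypted secrets file: {path}", file=sys.stderr)
    return 1 if offenders else 0
```

A hook could call `check_files` on the staged secrets files and abort the commit on a non-zero result; it would not catch plain-text secrets saved under unexpected file names.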

Thanks for taking the time to write such a thorough RFD @iameskild Thoughts below:

> To make it easier for Nebari admins, I propose we add a new CLI command, nebari secret

I have no objection here - though since this will be a wrapper around some other CLIs, we need to make sure our docs explain in depth how to complete these steps without our CLI (because some folks are pretty cautious, especially around access and security)

> This command would call the cloud provider API and generate the necessary KMS. In the process, this command could also generate the .sops.yaml configuration file

It seems reasonable to me; we would need to decide on what that .sops.yaml will look like (i.e. key/values and sane defaults)

+1 on using SOPS terraform provider

> Including secrets in the KubeSpawner's c.extra_pod_config (in 03-profiles.py) will allow us to mount those secrets to the JupyterLab's user pod, thereby making them useable by the people. How these secrets are configured on the pod (as a file, env var, etc.), and which Keycloak groups have access to these secrets (if we want to add some basic "role-based access"), can be configured in the nebari-config.yaml.

This makes sense and would help with access to external data sources or other resources. To ensure I am aligned with your proposal: the idea would be that any secret can be made available to specific groups in Keycloak and thus to all users within those groups. At the same time, the users would not be able to, say, delete or modify the secrets themselves. In such a case, it would be best to expose these as some sort of env variable.

> That would be to create the Kubernetes secret from the encrypted file directly and then have the users decrypt the secret in their JupyterLab pod

-1 on this as it introduces another workflow/tool that the end-users have to deal with and familiarise themselves with. I prefer abstracting this as much as possible on the backend side to minimise end-user overhead.

> Given that SOPS is a GitOps tool, it's important to ensure that admins don't accidentally commit plain text secret files in their repos. Adding more strict filters in the .gitignore will help a little but there's always a chance for mistakes.

We must do our due diligence, provide in-depth documentation, and suggest best practices.

Since this has been open for a while I am moving this RFD towards decision/voting. We are using a consent decision-making process for RFD approval therefore:

[ ] 07-08 Feb 2023: Read through the proposal and ask any clarifying questions (everyone makes sure to read and understand the proposal and ask any questions needed to understand this better). Also, note your overall feeling about the proposal (i.e. support, do not support)
[ ] 07-08 Feb 2023: Allow @iameskild as the author to reply to questions and make any amendments
[ ] 09-10 Feb 2023: decision-making round: @trallard will kick off the consent decision-making round, integrate objections and resolve objections
[ ] 13 Feb 2023: Ratify decision

cc @nebari-dev/maintainers

@trallard I should have raised this earlier but didn't get to try out sops until around 5 days ago. I've looked at the encryption methods in sops, and the one we would most likely push on our users is age, because age is the only one that supports SSH keys (alongside its own public/private key pairs). However, sops does not support multiple keys with age (see https://github.com/mozilla/sops/issues/1078).

From a user usability perspective age with ssh IMO is the way to go. We cannot expect our users to create gpg keys, nor cloud keys. I see the following workflow:

1. nebari init ..... which GitHub users would you like to have access to secrets?
2. Nebari fetches https://github.com/<username>.keys for each user and adds the keys to an age recipients file
3. age creates a public/private key pair only for the CI and adds its public key to the recipients file
4. Nebari then creates a file and uses age to encrypt all secrets using the public keys in the recipients file
5. Done

To decrypt, a user simply runs age -d <filename> and can optionally supply -i <path-to-ssh-github-key>.
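
The recipients-file workflow above could be sketched as follows. The helper names are hypothetical, the network fetch of each `.keys` URL is left out so the example stays offline, and the only age options used are `-R` (recipients file) and `-o` (output path).

```python
from typing import Dict, List


def github_keys_url(username: str) -> str:
    """URL where GitHub publishes a user's public SSH keys."""
    return f"https://github.com/{username}.keys"


def build_recipients(keys_by_user: Dict[str, str], ci_public_key: str) -> str:
    """Assemble an age recipients file: one key per line, '#' lines are comments.

    keys_by_user maps a GitHub username to the text fetched from its .keys URL.
    """
    lines: List[str] = []
    for user, keys_text in sorted(keys_by_user.items()):
        lines.append(f"# {user}")
        lines.extend(k.strip() for k in keys_text.splitlines() if k.strip())
    lines.append("# ci")
    lines.append(ci_public_key)
    return "\n".join(lines) + "\n"


def age_encrypt_command(recipients_file: str, source: str, output: str) -> List[str]:
    """Build the age argv that encrypts `source` for every listed recipient."""
    return ["age", "-R", recipients_file, "-o", output, source]
```

Removing a user from the secrets would then mean deleting their lines from the recipients file and re-encrypting; anything already decrypted by that user would of course need to be rotated.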

Why can we not just directly use age? The key benefit I see is that we can specify a recipients file with multiple keys and can add/remove users from access to the secrets.

To my knowledge, sops does not support this workflow. The one thing we would lose with age is partial encryption: age encrypts the entire file, whereas sops leaves the keys readable and encrypts only the values (personally, I've never understood the appeal of that).

User Story 1: I'm a software engineer doing client work. I have ever-changing client credentials to various web services (most notably s3 and gcs). I am looking for a more secure way to store my creds rather than in an .env file on my machine/pod/NFS mount. I want to be assured that no one else can access these secrets. I shouldn't need to redeploy the entire nebari platform to set my secret. I shouldn't need a platform admin to set up my secrets.

User Story 2: I want to use Argo for my workflows, but the only way to pass anything out of Argo is to store credentials - either in the workflow pod as env vars or in the docker image. Neither one is an effective long-term solution. When the creds are stored as env vars, they are visible to anyone who inspects the pod spec via k9s. One argument may be that whoever has access to the pod spec would also have access to the k8s secrets (one of the suggested solutions). There are some nuances here that others may need to expand on (@costrouc @Adam-D-Lewis).

I'm cleaning up the RFDs and I'm going to close this for now due to no activity, but feel free to re-open if needed!

nebari-dev / governance

RFD - Include SOPS for secret management #29