Open MickinSahni opened 7 years ago
So before I dive head-first into working on this, I want to make sure I've got the requirements right...
Let's use the example of some generic API key that an app developer wants to keep private. Right now, the process would be that the developer would add that key as a configuration variable (e.g. via the configurations view), saving it to the soapbox database config_vars table in plain text. The ideal state would be that the variable would be encrypted and then saved to an S3 resource, with the soapbox database only storing a pointer to that S3 resource.
As for using KMS... KMS is not a standalone service, but an additional tool of other services such as S3. From the AWS developer guide:
"There are two ways to use AWS KMS with Amazon S3. You can use server-side encryption to protect your data with a customer master key or you can use a AWS KMS customer master key with the Amazon S3 encryption client to protect your data on the client side."
If I understand that correctly, it essentially provides the option to either encrypt the data at rest on S3 or to encrypt the data using a client in the AWS SDK before uploading. Since the configuration data we want to send to and receive from S3 is sensitive, we'd take the latter approach.
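Some questions I have:
1. Should all configuration variables be encrypted using KMS? Or should we be able to distinguish between private (sensitive) and public (non-sensitive) keys? Depending on the answer to this, we'll need to change the config_vars table schema, perhaps adding a column to denote the sensitivity of the config.
2. Where would be the best place to store the data we want encrypted? Looking at the current pattern we're using in S3, would it make sense to create another root bucket called soapbox-app-secrets, which would contain a subdirectory for each application like the soapbox-app-images and soapbox-app-tf-state buckets do? These subdirectories could contain one file encrypted by KMS which would contain all configurations.
Lastly, can you expand a bit more on: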
Ideally, app gets secrets from a service at boot-time, and secrets are not exposed on the running instance.
Namely, I'd like to get an idea of how this would work... how do we want to expose the configurations to the application after they've been fetched from S3 and decrypted?
EDIT: Just realized... since configurations are based on environments, the project subdirectories under the proposed soapbox-app-secrets bucket would need to be split up further by environments like the soapbox-app-tf-state bucket structure, e.g.:
soapbox-app-secrets
├── project_1
│ ├── environment_1
│ │ ├── configuration_v1
│ │ └── configuration_v2
│ └── environment_2
└── project_2
@LouisFettet:
Some questions I have:
Should all configuration variables be encrypted using KMS? Or should we be able to distinguish between private (sensitive) and public (non-sensitive) keys? Depending on the answer to this, we'll need to change the config_vars table schema, perhaps adding a column to denote the sensitivity of the config.
I think the answer is yes, all configuration variables should be encrypted. Treating everything as secret is simpler than having two different schemes for secret and non-secret values. Two schemes would complicate the implementation, since we'd have to handle both, and would put a cognitive burden on the end user, who'd have to make a judgment call about each variable's sensitivity (and could choose wrongly).
Where would be the best place to store the data we want encrypted? Looking at the current pattern we're using in S3, would it make sense to create another root bucket called soapbox-app-secrets, which would contain a subdirectory for each application like the soapbox-app-images and soapbox-app-tf-state buckets do? These subdirectories could contain one file encrypted by KMS which would contain all configurations.
Yes, I think an S3 bucket for app secrets makes the most sense. Like you noted in your edit, the bucket should be hierarchically structured, by app slug, then by environment slug. And adding the version number to the file is correct -- we'll want to do that in a parse-safe way.
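To make "parse-safe" concrete, here is a minimal sketch of building and parsing such object keys. The exact layout, the slug character set, the six-digit zero padding, and the names `build_key` and `parse_key` are assumptions for illustration, not something Soapbox defines.

```python
import re

# Assumed slug alphabet: lowercase letters, digits, hyphen, underscore.
# "/" is excluded, so it can only ever appear as a path separator.
_SLUG = r"[a-z0-9_-]+"
_KEY_RE = re.compile(
    rf"(?P<app>{_SLUG})/(?P<env>{_SLUG})/configuration_v(?P<version>\d{{6}})"
)


def build_key(app_slug: str, env_slug: str, version: int) -> str:
    """Return '<app>/<env>/configuration_v<6-digit version>'."""
    for slug in (app_slug, env_slug):
        if not re.fullmatch(_SLUG, slug):
            raise ValueError(f"invalid slug: {slug!r}")
    if not 1 <= version <= 999_999:
        raise ValueError(f"version out of range: {version}")
    # Zero padding keeps lexicographic key order equal to numeric order.
    return f"{app_slug}/{env_slug}/configuration_v{version:06d}"


def parse_key(key: str) -> tuple[str, str, int]:
    """Inverse of build_key; raises ValueError on anything unexpected."""
    match = _KEY_RE.fullmatch(key)
    if match is None:
        raise ValueError(f"unrecognized key: {key!r}")
    return match["app"], match["env"], int(match["version"])
```

Because the version is a fixed-width field and the slugs cannot contain `/`, a key can be split back into its parts unambiguously, and listing a prefix returns versions in ascending order.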
Lastly, can you expand a bit more on:
Ideally, app gets secrets from a service at boot-time, and secrets are not exposed on the running instance. Namely, I'd like to get an idea of how this would work... how do we want to expose the configurations to the application after they've been fetched from S3 and decrypted?
I think we should do this in two steps. The first is to implement the secure storage with S3 + KMS and transition the app over to using it. Soapbox should download, decrypt, and populate the runit env dir exactly as it does now. This will leave the secrets in plaintext on disk, but that's an okay intermediate state.
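A rough sketch of that first step, under some assumptions: the stored object is a single KMS-encrypted JSON document mapping names to string values, and `s3` and `kms` are boto3-style clients passed in by the caller (for example `boto3.client("s3")` and `boto3.client("kms")`). The function names are hypothetical. The env dir format is the one `chpst -e` reads: one file per variable, with the value on the first line.

```python
import json
import os


def fetch_config(s3, kms, bucket: str, key: str) -> dict:
    """Download one encrypted config object and return it as a dict."""
    ciphertext = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    plaintext = kms.decrypt(CiphertextBlob=ciphertext)["Plaintext"]
    # Assumption: the plaintext is a JSON object of name -> string value.
    return json.loads(plaintext)


def write_envdir(env_dir: str, config: dict) -> None:
    """Write each variable as a file named after it, readable only by the owner."""
    os.makedirs(env_dir, mode=0o700, exist_ok=True)
    for name, value in config.items():
        if not name or "=" in name or "/" in name or "\0" in name:
            raise ValueError(f"invalid variable name: {name!r}")
        if "\n" in value:
            # Only the first line of the file is used as the value.
            raise ValueError(f"value for {name} contains a newline")
        path = os.path.join(env_dir, name)
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        with os.fdopen(fd, "w") as handle:
            handle.write(value + "\n")
```

Passing the clients in keeps the functions easy to exercise with fakes, and the restrictive file modes limit who can read the plaintext during this intermediate state.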
Then, second, we should come up with a means of fetching, decrypting, and injecting the config vars into the application process's environment at startup time, without having them be on disk -- this might mean introducing a wrapper program that does this work and then invokes the main application. But we should open and track a separate ticket for this, after the first step is done.
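One possible shape for such a wrapper, as a sketch only: merge the decrypted secrets into a copy of the current environment, then replace the wrapper process with the application so nothing is written to disk. How the secrets are fetched is left out here, and `merged_env` and `exec_with_secrets` are hypothetical names.

```python
import os


def merged_env(base_env, secrets: dict) -> dict:
    """Return a copy of base_env with the secrets layered on top."""
    env = dict(base_env)
    env.update(secrets)
    return env


def exec_with_secrets(argv: list, secrets: dict) -> None:
    """Replace the current process with argv, passing the secrets as
    environment variables. Does not return if the exec succeeds."""
    os.execvpe(argv[0], argv, merged_env(os.environ, secrets))
```

A wrapper would call something like `exec_with_secrets(["myapp", "--port", "8080"], secrets)` after fetching and decrypting. Note that on Linux the environment of a running process can still be read through `/proc/<pid>/environ` by the same user or root, so this avoids the on-disk copy without fully hiding the secrets on the instance.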
Another point, maybe obvious but worth stating explicitly: we should generate a KMS master key when a new application is created. Down the road, we might want to add some UI that lets a user invalidate and/or rotate their key (re-keying all extant secrets). We might also want to use a derived key, i.e., a separately generated random key that does the actual encryption and is itself encrypted with the KMS master key and stored alongside the secrets. This would let a user encrypt more than 4 KB worth of secrets, but for now, straight master key usage is fine.
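For the straight master key case, a minimal sketch might look like the following. It assumes the config is serialized as one JSON document and that `kms` and `s3` are boto3-style clients supplied by the caller; `encrypt_and_store` is a hypothetical name. The 4096-byte check mirrors the plaintext limit of a single KMS Encrypt call with a symmetric key.

```python
import json

# Maximum plaintext size for one KMS Encrypt call with a symmetric key.
KMS_DIRECT_LIMIT = 4096


def encrypt_and_store(kms, s3, key_id: str, bucket: str, object_key: str,
                      config: dict) -> int:
    """Encrypt the whole config directly under the master key and upload it.

    Returns the plaintext size in bytes.
    """
    plaintext = json.dumps(config, sort_keys=True).encode("utf-8")
    if len(plaintext) > KMS_DIRECT_LIMIT:
        # Past this point a derived (data) key scheme would be needed.
        raise ValueError(
            f"config is {len(plaintext)} bytes; direct KMS encryption "
            f"allows at most {KMS_DIRECT_LIMIT}"
        )
    ciphertext = kms.encrypt(KeyId=key_id, Plaintext=plaintext)["CiphertextBlob"]
    s3.put_object(Bucket=bucket, Key=object_key, Body=ciphertext)
    return len(plaintext)
```

Failing loudly at the size limit, rather than truncating or splitting, makes it obvious when an application has outgrown direct encryption.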
Hey @LouisFettet, is this ticket in a good place for someone else to pick up?
@MickinSahni yup! Just updated the PR, so I think someone else could probably take it from here. Next step is to be able to download & decrypt the configurations from S3. I think I was pretty thorough with documenting the PR, but I'm more than happy to make myself available for Q&A.