alexkalish commented 3 years ago

User Story

As an app/workload SRE I want to provide secrets to my app’s environment using Summon, without modifying my app’s container image
So that I can easily use Summon without modifying my app or its image.

Additional Details

To start, a spike is required to see if this story is even feasible. If it is, additional user stories will be needed to refine the user value.
Please include security architect in review of spike results.

Implementation/Solution Notes

@doodlesbykumbi already has some notes exploring how this would work.

What remains to do is:

How can we get Summon into volume mounts in a scalable way? (eg to avoid having 10,000 Summon binaries for 10,000 volume mounts) - 2
If we standardize on providing configuration for secrets in ConfigMap, is there a way to standardize on this regardless of configuration (Secretless, Secrets Provider, Authn-K8s + Summon, Authn-K8s + Client Libraries)? - 2
What guidance can we provide around ensuring clients can trust the Summon executable that's available in the cluster? For example, what role might a dedicated Summon init container image play (or how would this work if the Summon binary install was included in the Authn-K8s client image when used as an init container or sidecar)? - 3
Can we create architecture diagrams / flow charts to show how Summon is used currently in the Kubernetes flows vs how it will work going forward? - 1
Given the research done, and in view of it all taken together, review with the security champion with the goal of determining that we can move forward / the plan meets our security standards. - 2

doodlesbykumbi commented 3 years ago

Zero-change Summon deployment proposal

The current guidance for deploying applications with Summon in Kubernetes requires the user to add a layer to each application container image that provides the Summon binary and the secrets.yml file. The secrets.yml file is the configuration Summon uses to determine which secrets to inject into the application. This additional layer couples Summon with the application and forces the user to incorporate Summon into their build pipeline. Such overhead can be a dealbreaker.

Here we propose a lighter alternative method for deploying applications with Summon in Kubernetes. We dub this method zero-change Summon deployment. At the heart of the proposal is the Kubernetes Volume. At its core, a volume is just a directory, possibly with some data in it, which is accessible to the containers in a pod. How that directory comes to be, the medium that backs it, and its contents are determined by the particular volume type used. The proposal is to use volumes (one or more) as the repositories for the Summon binary and the secrets.yml file. This moves the necessary changes out of the container image and into the application's Kubernetes manifest alone. No change to the application container image is required. This restores the promise of Summon being altogether transparent to the application.

How does it work?

The Summon binary and the secrets.yml configuration are made available to the application container through volume mounts. The application container's command is modified, as usual, by prepending Summon to it. The main process then runs as a child of the Summon process, which injects the secrets into the main process's environment.

The application manifest below is an example implementation of the approach. As mentioned above, the medium that backs the volume is arbitrary; all that matters is that the containers in the pod can access the volume, which is achieved by using volume mounts to mount volumes at paths on the container file system.

In this example:

The secrets.yml is stored in a ConfigMap (a Volume type).
The Summon binary is stored in an emptyDir Volume. An emptyDir volume is created when a Pod is assigned to a node and exists only as long as that Pod is running on that node. As the name says, the emptyDir volume is initially empty, so with this volume type an init container is needed to populate the volume with the Summon binary.

---
apiVersion: v1
kind: Pod
metadata:
  ...
spec:
  initContainers:
    - name: get-summon
      ...
  containers:
    - name: app-container
      ...
      command: [ '/utils/summon', '-f', '/secrets.yml' ...]
      volumeMounts:
        - name: utils-volume
          mountPath: /utils
          readOnly: true
        - name: secretsyml
          mountPath: /secrets.yml
          subPath: secrets.yml
          readOnly: true
  volumes:
    - name: utils-volume
      emptyDir: { }
    - name: secretsyml
      configMap:
        name: secretsyml
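
The manifest above leaves the `get-summon` init container elided. One possible shape for its command, offered only as a sketch, is a small copy step. It assumes the init container image already ships the binary at a hypothetical path `/usr/local/bin/summon` and mounts `utils-volume` read-write at `/utils`; neither path nor the function name `populate_utils` is fixed by this proposal.

```shell
# Copy the Summon binary from the init container image into the shared volume.
# Both default paths are assumptions for illustration only.
populate_utils() {
  src=${1:-/usr/local/bin/summon}
  dest_dir=${2:-/utils}
  # Make the copy read-only and executable; the app container mounts it readOnly anyway.
  cp "${src}" "${dest_dir}/summon" && chmod 0555 "${dest_dir}/summon"
}

# populate_utils   # would run as the init container's command
```

Copying from the image rather than downloading at runtime keeps the init container free of any network dependency.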

Note that the secrets.yml file and Summon binary can be stored in any volume type. Some volume types are more appropriate than others. For example:

It's a bad idea to store the Summon binary in a ConfigMap, since the binary would have to be stored in etcd and transported via the API server, whose default maximum payload size (3 MiB, per the error below) is smaller than the Summon binary (~3.5 MB).
```
➜  ~ kubectl create configmap utils --from-file=summon=summon
Error from server (RequestEntityTooLarge): Request entity too large: limit is 3145728
```
It's a good idea to store the secrets.yml file in a ConfigMap since this is the Kubernetes resource most typically associated with handling configuration. Using an AWS EBS volume might be less appropriate.

In our proposal we recommend:

ConfigMaps for storing secrets.yml files
Ephemeral or persistent Volumes for storing Summon binaries. The sections below go into more detail about the pros and cons of each volume type category. The simpler option, at least for exploration purposes, would be to use ephemeral storage with an init container fetching the binary from GitHub, as shown in the linked gist.

How does it scale?

Scaling here means keeping the impact of using Summon to a minimum as the number of applications grows, e.g. beyond 1,000. The potential impacts of using Summon with this approach are:

Consumption of storage. This depends on the Volume type used to store the Summon binary. Ephemeral volumes are the worst because each pod has its own Summon binary. Persistent storage is better because it allows Pods running on a particular node to share binaries.

The Summon binary consumes 3.5 MB per application instance. The recommended maximum is 110 Pods per node. In that situation the worst case impact on storage of using Summon is 385 MB per node. This isn't very much.

To keep storage consumption to a minimum persistent storage is an option. An example is to use the hostPath volume type. A hostPath volume mounts a file or directory from the host node's filesystem into your Pod. Every node would need to have the directory containing the Summon binaries e.g. /etc/summon-utils. This would reduce the storage impact to just 3.5 MB per node.
Increase in pod startup time. This really only applies when the Summon binary is stored in an ephemeral volume and an init container populates that volume. The increase in startup time depends entirely on the mechanism for procuring the Summon binary. Fetching from GitHub from the init container will be slower than copying from an init container image that already has the binary.

Mounting persistent volumes has minimal impact.
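
The storage figures in this section are simple multiplication. A minimal shell sketch, assuming the ~3.5 MB binary size and the 110 Pods per node quoted above (the function name `worst_case_mb` is just for illustration):

```shell
# Worst-case storage in MB when every Pod carries its own copy of the binary.
# The size is passed in tenths of a MB because POSIX shell arithmetic is integer-only.
worst_case_mb() {
  size_tenths_mb=$1   # e.g. 35 for ~3.5 MB
  pods_per_node=$2    # e.g. 110
  echo $(( size_tenths_mb * pods_per_node / 10 ))
}

worst_case_mb 35 110   # prints 385
```

With a shared hostPath copy the multiplier drops to one binary per node, which is where the 3.5 MB figure comes from.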

Retrieving and trusting Summon binaries

How the Summon binary is retrieved depends on the volume type used.

Ephemeral volume types like emptyDir need an init container to populate the volume. There are two ways in which this could be achieved.
1. The init container downloads the Summon binary at runtime. This assumes workloads have access and connectivity to the download URL, which is not guaranteed.
2. The init container uses a trusted container image that contains the binaries. The user can build their own or we might start providing these as part of Summon releases.
Persistent volume types require out-of-band population of the volume. For example, when using the hostPath volume type you can populate the hostPath with the Summon binaries at node provision time. This might just be a bash script that pulls the release from GitHub and performs an integrity check.
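
A node-provisioning script of that kind could look roughly like the sketch below. The URL and checksum are the v0.8.3 values quoted later in this thread, and `/etc/summon-utils` is the example hostPath directory mentioned earlier. The function names are hypothetical, and the script assumes that `wget`, `sha256sum` and `tar` are present on the node and that the release tarball unpacks to an executable named `summon`.

```shell
# v0.8.3 values quoted in this thread; pin whichever release you actually use.
SUMMON_URL="https://github.com/cyberark/summon/releases/download/v0.8.3/summon-linux-amd64.tar.gz"
SUMMON_SHA256="fc0e0feaf6ef4fb641a41762a2c76d1a282fec3f852e1141af6e3f8ab24f074f"
SUMMON_DIR="/etc/summon-utils"   # example hostPath directory

# Succeeds only if the file's SHA-256 digest matches the expected value.
verify_sha256() {
  file=$1
  expected=$2
  echo "${expected}  ${file}" | sha256sum -c - >/dev/null 2>&1
}

# Download, verify, then unpack into the hostPath directory.
# Assumption: the tarball contains an executable named `summon`.
install_summon() {
  tmp=$(mktemp -d) || return 1
  wget -q -O "${tmp}/summon.tar.gz" "${SUMMON_URL}" || { rm -rf "${tmp}"; return 1; }
  if ! verify_sha256 "${tmp}/summon.tar.gz" "${SUMMON_SHA256}"; then
    echo "checksum mismatch, refusing to install" >&2
    rm -rf "${tmp}"
    return 1
  fi
  mkdir -p "${SUMMON_DIR}" &&
    tar -xzf "${tmp}/summon.tar.gz" -C "${SUMMON_DIR}" &&
    chmod 0755 "${SUMMON_DIR}/summon"
  status=$?
  rm -rf "${tmp}"
  return "${status}"
}

# install_summon   # run at node provision time (needs network access and write access to SUMMON_DIR)
```

Verifying before unpacking means a tampered or truncated download never reaches the directory that Pods mount.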

Trust

The considerations here are the same as those for incorporating the Summon binary into the application container image. Hopefully anyone using Summon right now is already doing what we cover below.

There must be an integrity check on the Summon binary before using it. We provide checksum values (for example, see SHA256SUMS.txt at https://github.com/cyberark/summon/releases/tag/v0.8.3) with Summon releases that can be used to establish trust of any given binary we provide. For example:

wget https://github.com/cyberark/summon/releases/download/v0.8.3/summon-linux-amd64.tar.gz
echo "fc0e0feaf6ef4fb641a41762a2c76d1a282fec3f852e1141af6e3f8ab24f074f summon-linux-amd64.tar.gz"  | sha256sum -c -

It should be noted that Summon is entirely open source so users are also able to build from source, which offers the highest confidence in the code you run.

We currently only provide releases of Summon binaries. To complement this proposal we can start providing trusted container images containing Summon binaries to cater to the mechanism of populating ephemeral volumes using an init container. This would likely make use of content trust for container images, which would require us to sign our images, see https://docs.docker.com/engine/security/trust/.

andytinkham commented 3 years ago

Security-wise, I don't see any major flaws in this plan. I'd love to see a way of trust that makes it an explicitly set choice to take a less secure route rather than requiring the user to have to do an extra step to be secure, but that's not a dealbreaker here. There might be value in spelling out more detail around the type of volumes for the binary - "ephemeral or persistent volumes" doesn't seem to narrow it down any. Quickly skimming the list of potential volume types, nothing immediately jumps out as being drastically different for most volume types, so maybe there's nothing here.