moby / moby

The Moby Project - a collaborative project for the container ecosystem to assemble container-based systems
https://mobyproject.org/
Apache License 2.0
69.75k stars · 18.75k forks

Secrets: write-up best practices, do's and don'ts, roadmap #13490

Open thaJeztah opened 9 years ago

thaJeztah commented 9 years ago

Handling secrets (passwords, keys and related) in Docker is a recurring topic. Many pull-requests have been 'hijacked' by people wanting to (mis)use a specific feature for handling secrets.

So far, we only discourage people from using those features, because they're either provably insecure, or not designed for handling secrets and hence "possibly" insecure. We don't offer them real alternatives; at least not for all situations, and where we do, not with a practical example.

I just think "secrets" is something that has been left lingering for too long. This results in users (mis)using features that are not designed for this (with the side effect that discussions get polluted with feature requests in this area) and making them jump through hoops just to be able to work with secrets.

Features / hacks that are (mis)used for secrets

This list is probably incomplete, but worth a mention

The above should be written / designed with both build-time and run-time secrets in mind

@calavera created a quick-and-dirty proof-of-concept on how the new Volume-Drivers (https://github.com/docker/docker/pull/13161) could be used for this; https://github.com/calavera/docker-volume-keywhiz-fs

Note: Environment variables are used as the de-facto standard to pass configuration/settings, including secrets to containers. This includes official images on Docker Hub (e.g. MySQL, WordPress, PostgreSQL). These images should adopt the new 'best practices' when written/implemented.
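To make the leak concrete without involving docker at all: every child process inherits exported variables, and any process of the same user can read them back out of /proc on Linux. This is the same mechanism by which `docker inspect` exposes a container's env vars. A minimal sketch (the variable name is made up):

```shell
#!/bin/sh
# Demonstrate that an exported "secret" is inherited by children and readable
# via /proc -- the same mechanism that makes `docker inspect` show env vars.
export DB_PASSWORD=s3cret

sleep 2 &                 # stand-in for any child process
CHILD=$!

# The child's full environment is visible to any process of the same user:
LEAK="$(tr '\0' '\n' < "/proc/$CHILD/environ" | grep '^DB_PASSWORD=')"
echo "$LEAK"              # prints: DB_PASSWORD=s3cret
wait "$CHILD"
```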

In good tradition, here are some older proposals for handling secrets;

thaJeztah commented 9 years ago

ping @ewindisch @diogomonica @NathanMcCauley This is just a quick write-up. Feel free to modify/update the description if you think that's necessary :)

dreamcat4 commented 9 years ago

This is useful info:

https://github.com/hashicorp/vault/issues/165

As is this:

https://github.com/hashicorp/vault/issues/164

thaJeztah commented 9 years ago

@dreamcat4 there are some plans to implement a generic "secrets API", which would allow you to use either Vault, or Keywiz or you-name-it with Docker, but all in the same way. It's just an early thought, so it will require additional research.

dreamcat4 commented 9 years ago

@thaJeztah Yep, sorry, I don't want to detract from those efforts / discussion in any way. I'm more thinking it may be a useful exercise (as part of that longer process, and while we are waiting) to see how far we can get right now. Then the limits and deficiencies of the current process show up more clearly to others, along with which underlying pieces are missing and most needed to improve secrets handling.

Also, it's worth considering the different situations of run-time secrets vs. build-time secrets, for which there is also an area of overlap.

And perhaps (for docker) it may also be worth considering the limitations (pros/cons) of solutions that handle secrets "in-memory", as opposed to more heavily file-based methods or network-based ones (e.g. a local secrets server), which are the current hacks on the table (until a proper secrets API exists). This can help us understand the unique value (for example, stronger security) added by a docker secrets API that could not otherwise be achieved by hacks on top of the current docker feature set. However, I am not a security expert, so I cannot comment on those things with great certainty.

thaJeztah commented 9 years ago

@dreamcat4 yes, you're right; for the short term, those links are indeed useful.

Also, it's worth considering the different situations of run-time secrets vs. build-time secrets, for which there is also an area of overlap.

Thanks! I think I had that in my original description; it must have gotten lost in the process. I will add a bullet.

However I am not a security expert.

Neither am I, that's why I "pinged" the security maintainers; IMO, this should be something written by them 😇

diogomonica commented 9 years ago

@thaJeztah great summary. I'll try to poke at this whenever I find some time.

thaJeztah commented 9 years ago

@diogomonica although not directly related, there's a long-open feature request for forwarding the SSH key agent during build; https://github.com/docker/docker/issues/6396 Given the number of comments, it would be good to give that some thought too (if only to decide whether or not it can/should be implemented).

ebuchman commented 9 years ago

Assuming you could mount volumes as a user other than root (I know it's impossible, but humour me), would that be a favourable approach to getting secrets into containers?

If so, I'd advocate for an alternative to -v host_dir:image_dir that expects the use of a data-only container and might look like -vc host_dir:image_dir (i.e. volume-copy), wherein the contents of host_dir are copied into the image_dir volume on the data-only container.

We could then emphasize a secure-data-only-containers paradigm and allow those volumes to be encrypted.

kepkin commented 9 years ago

I've recently read a good article about this from @jrslv, where he proposes building a special docker image with secrets in it just to build your app, and then building another image for distribution using the results from running the build image.

So you have two Dockerfiles:
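(The Dockerfile examples didn't survive in this thread; a minimal sketch of the pattern, with made-up project and file names, might look like this.)

```dockerfile
# Dockerfile.build -- sees the secrets; this image is never pushed anywhere.
FROM debian:jessie
COPY id_rsa /root/.ssh/id_rsa              # hypothetical build-time secret
RUN git clone git@github.com:example/hello-world.git /src && make -C /src
CMD tar -C /src -czf - hello-world         # emit only the build artifacts

# Dockerfile.dist -- the image you actually distribute; contains no secrets.
FROM debian:jessie
ADD build.tar.gz /usr/local/bin/
CMD ["/usr/local/bin/hello-world"]
```

Running the build image streams the artifacts out as a tarball on stdout, and only that tarball ends up in the distribution image's layers.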

Now we can build our distribution like that:

#!/bin/sh
docker build -t hello-world-build -f Dockerfile.build .
docker run hello-world-build > build.tar.gz
docker build -t hello-world -f Dockerfile.dist .

Your secrets are safe, as you never push hello-world-build image.

I recommend reading @jrslv's article for more details: http://resources.codeship.com/ebooks/continuous-integration-continuous-delivery-with-docker

lamroger commented 9 years ago

Thanks for sharing @kepkin ! Just finished reading the article. Really concise!

I like the idea of exporting the files and loading them in through a separate Dockerfile. It feels like squashing without the "intermediate layers being in the build cache" issue.

However, I'm nervous that it'll complicate development and might require a third Dockerfile for simplicity.

TomasTomecek commented 9 years ago

@kepkin no offense but that doesn't make any sense. Secrets are definitely not safe, since they are in the tarball and the tarball is being ADDed to production image -- even if you remove the tarball, without squashing, it will leak in some layer.

thaJeztah commented 9 years ago

@TomasTomecek if I understand the example correctly, the tarball is not the image-layers, but just the binary that was built inside the build container. See for example; https://github.com/docker-library/hello-world/blob/master/update.sh (no secrets involved here, but just a simple example of a build container)

kepkin commented 9 years ago

@TomasTomecek I'm talking about secrets for building Docker image. For instance, you need to pass ssh key to checkout source code from your private GitHub repository. And the tarball contains only build artifacts but doesn't contain GitHub key.

TomasTomecek commented 9 years ago

@kepkin right, now I read your post again and can see it. Sorry about that. Unfortunately it doesn't solve the issue when you need secrets during deployment/building the distribution image (e.g. fetching artifacts and authenticating with artifact service). But it's definitely a good solution for separation between build process and release process.

kepkin commented 9 years ago

@TomasTomecek that's exactly how I fetch artifacts actually.

In the Dockerfile.build image I download some binary dependencies from Amazon S3, which requires an AWS key & secret. After retrieving and building, I create a tarball with everything I need.

jacobdr commented 9 years ago

Is there a canonical "best practices" article -- the "Do"s as opposed to the "Don'ts" -- that y'all would recommend reading?

afeld commented 9 years ago

Worth noting (for anyone else like me that is stumbling upon this) that Docker Compose has support for an env_file option.

https://docs.docker.com/compose/compose-file/#env-file

thaJeztah commented 9 years ago

@afeld docker itself has this feature as well, see http://docs.docker.com/engine/reference/commandline/run/#set-environment-variables-e-env-env-file but those env vars will still show up in the same places, so it doesn't make a difference w.r.t. "leaking".

sebastian-philipp commented 9 years ago

I've stumbled across this cheat sheet: http://container-solutions.com/content/uploads/2015/06/15.06.15_DockerCheatSheet_A2.pdf

hmalphettes commented 9 years ago

@kepkin this is how I pass an ssh-key to docker build:

# Serve the ssh private key once over http on a private port.
if command -v ncat >/dev/null 2>&1; then
  ncat -lp 8000 < "$HOME/.ssh/id_rsa" &
else
  nc -lp 8000 < "$HOME/.ssh/id_rsa" &
fi
nc_pid=$!
docker build --no-cache -t bob/app .
kill $nc_pid || true

and inside the Dockerfile where 172.17.0.1 is the docker gateway IP:

RUN \
  mkdir -p /root/.ssh && \
  curl -s http://172.17.0.1:8000 > /root/.ssh/id_rsa && \
  chmod 600 /root/.ssh/id_rsa && chmod 700 /root/.ssh && \
  ssh-keyscan -t rsa,dsa github.com > /root/.ssh/known_hosts && \
  git clone --depth 1 --single-branch --branch prod git@github.com:bob/app.git . && \
  npm i --production && \
  ... && \
  rm -rf /root/.npm /root/.node-gyp /root/.ssh

If someone has something simpler let us know.

jdmarshall commented 9 years ago

So what's the current status of this?

All summer there were long conversational chains, indicating just how widespread this concern is. This was filed in May, and it's still open. For instance, how would I set the password for Postgres?

blaggacao commented 9 years ago

@thaJeztah What can be done to move this forward? I guess many eyes throughout different downstream projects are on this issue, e.g. https://github.com/rancher/rancher/issues/1269

demarant commented 9 years ago

I guess what is being done here is kept secret :D

Chili-Man commented 9 years ago

This is the biggest pain point for us in integrating Docker into our production stack. Is there a roadmap or another doc somewhere that points to any progress on this?

gtirloni commented 9 years ago

Some relevant content on this topic from k8s.

CameronGo commented 9 years ago

What do you think of this as a potential way of addressing run-time secrets? https://github.com/docker/docker/issues/19508

jdmarshall commented 9 years ago

I feel like this issue would be best addressed by concentrating on a few scenarios that need to be supported, and making sure there's a set of instructions for each one. How they get implemented is less important than whether at the end of the process there's a coherent set of features that can be combined to fill the need.

A few that I've seen referred to that seem to be pretty legitimate concerns include:

Run-time Credentials

When I say 'easy' I mean that there is an ergonomically sane approach to handling these variables that protects the user from accidentally doing the wrong thing and triggering a security bulletin. The stress of the experience often becomes associated with (read: blamed on) the tools involved in the mistake.

Build-time Credentials

1st Edit:

Documentation of what is and is not 'leaked' into a typical image / container

I feel like I'm missing a couple of big ones here. Anybody got something I forgot?

dreamcat4 commented 9 years ago

API Keys for whatever json services.

For example (and this is my real use-case), Docker build compiles a program, the API Key is necessary to authenticate me and upload the build product(s) to Bintray.com.

jdmarshall commented 9 years ago

@dreamcat4 I could be way off from what you're saying, but here goes:

Are you talking about using docker images for Continuous Deployment builds, and pushing the build artifacts to an archive at the end of a successful build? Personally I prefer doing this further upstream (e.g., a post-build script in Jenkins), but if you're cross-compiling that might be a bit trickier.

In my world the build agent just builds binaries/archives and retains them as 'artifacts' of the build process, and something else pushes those out to infrastructure, tags the git repository, etc. That gives me an emergency backup of the artifacts if I have a production issue and, say, my npm, docker, or Artifactory repository is down for upgrades, or the network is glitching on me.

dreamcat4 commented 9 years ago

The point I was trying to make was about the usage of API keys in general. There are many different and varied online JSON / REST services which a container may need to interact with (either at build time or run time), and which require API keys. It does not have to be specifically build-related.

jdmarshall commented 9 years ago

@dreamcat4 oh, so auth tokens for REST endpoints? Do you think those are handled substantially differently than, say, your postgres password in a conf file, or would you handle them similarly?

dreamcat4 commented 9 years ago

Yeah, I think those two types should be considered differently in terms of evaluating their base minimum level of security.

API Auth tokens tend to often be:

Passwords tend to be / are often:

So that does not necessarily mean the secrets solution must be different for those 2 types. Just that the acceptable minimum baseline level of security may be a little bit lower for API keys.

This minimum level matters if having strong security is more complex / problematic to setup. (which may be true here in the case of docker secrets, or not depending how feasible / elegant the solution).

And occasionally API keys or passwords can have stronger / weaker security; it's just that one-size-fits-all is not possible.

For example - my bintray API key: that is held in the same .git repo as my Dockerfile. So to keep it secure, it is held in a PRIVATE git repo (accessed via SSH), and access to the API key is relatively well protected there. However, without docker having any built-in secrets functionality / protections of its own, the built docker image always includes the API key in plain text. Therefore the resulting docker build image must be kept private, like the git repository, which has a knock-on (undesirable) effect: nobody else can publicly view / see the build logs / build status there.

Now that is not ideal in many ways. But the overall solution is simple enough and actually works (as in: yesterday). If there were a better mechanism in future, I would consider switching to it, but not if that mechanism were significantly more costly / complex to set up than the current solution I have already made. So extra-strong security (although welcome) might be overkill in the case of just one API key, which merely needs to be kept out of docker's image-layer cache with some kind of new NOCACHE option / Dockerfile command.

Whereas a password needs something like vault or ansible-vault, and to be encrypted with yet another password or some other strongly secure authentication mechanism. (Which may turn out to be a complex thing to set up, though we hope not.)

blaggacao commented 9 years ago

I think a client/server model (like in vault) for managing and streamlining (read: auditing, break-glass) all the secrets-related stuff would be good practice and would cover most of the use cases, if the implementation was done thoughtfully. I, personally, am not a fan of adopting a non-holistic approach, because this is an opportunity to raise the bar in best practices.

This implies a long running client (responsibility of the person that deploys an image) and/or a build-time client (responsibility of the builder). Maybe the former one could be transferred to the docker daemon somehow which provides authorized secrets at run time.

iangelov commented 9 years ago

Indeed - I wholeheartedly agree with the previous comment. Not that I don't admire the creative ways in which people are solving the problem, but I don't think this is how it needs to be - let's try and think of a solution that could be used both during CI/D and runtime, as well as taking into account that containers might be orchestrated by Mesos/Kubernetes, etc.

jdmarshall commented 9 years ago

Well, I think a bit of documentation would still be useful here, since Docker presents a few extra kinks in the problem space.

It looks like maybe the Vault guys are also looking at this from their end. I think this ticket is the one to watch:

https://github.com/hashicorp/vault/issues/165

Maybe this is something that could be collaborated upon.

blaggacao commented 9 years ago

@jdmarshall

Maybe this is something that could be collaborated upon.

+1

thalesfsp commented 9 years ago

+1 Docker + HashiCorp Vault

gittycat commented 9 years ago

Sorry, but I don't like how the solutions are getting more complex as more people pitch in. HashiCorp Vault, for instance, is a full client-server solution with encrypted back-end storage. That adds considerably more moving parts. I'm sure some use cases demand this level of complexity, but I doubt most do. If the competing solution is to use host environment variables, I'm fairly sure which one will end up being used by the majority of developers.

I'm looking at a solution that covers development (eg: github keys) and deployment (eg: nginx cert keys, db credentials). I don't want to pollute the host with env vars or build tools and of course no secrets should end up in github (unencrypted) or a docker image directory, even a private one.

dreamcat4 commented 9 years ago

@gittycat I agree with you in the sense that there are probably several distinct use-cases. Whereby some of the solutions should be simpler than other ones.

We certainly should want to avoid resorting to ENV vars though.

My own preference leans towards the idea that simple key storage could be achieved with something akin to ansible's "vault" mechanism, where you have an encrypted text file held within the build context (or sourced outside / alongside the build context). An unlock key can then unlock whatever plaintext passwords or API keys etc. are in that file.

I'm just saying that after using ansible's own "vault" solution, which is relatively painless / simple: HashiCorp's vault is more secure, but it is also harder to set up and just generally more complex. Although I don't know of any technical reason why you couldn't still ultimately use it underneath as the backend (hide it / simplify it behind a docker-oriented command-line tool).

I would suggest local file storage because it avoids needing to set up some complex and potentially unreliable HTTP key-storage server. Secrets storage is very much a security matter, so it should be available to all users, not just enterprises. Just my 2 cents.
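A rough sketch of that encrypted-file-in-the-build-context idea, using openssl in place of ansible-vault (the file names and the unlock key are made up; `-pbkdf2` needs OpenSSL 1.1.1 or later):

```shell
#!/bin/sh
# Encrypted secrets file kept alongside the build context, ansible-vault style.
cd "$(mktemp -d)"

printf 'AWS_SECRET=abc123\n' > secrets.env
openssl enc -aes-256-cbc -pbkdf2 -salt -pass pass:unlock-key \
  -in secrets.env -out secrets.env.enc
rm secrets.env                  # only secrets.env.enc gets committed

# Whoever holds the unlock key recovers the plaintext at build/run time:
openssl enc -d -aes-256-cbc -pbkdf2 -pass pass:unlock-key \
  -in secrets.env.enc -out secrets.env
grep AWS_SECRET secrets.env     # prints: AWS_SECRET=abc123
```

The only secret that has to live outside the repo is the unlock key itself.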

blaggacao commented 9 years ago

+1 to a local file storage backend; for more advanced use cases, I would however prefer the full power of a HashiCorp Vault-like solution. When we are talking about deployment in an organisation, the argument is that the people who provide and control secrets are different people from those who use them. This is a common security measure to keep the circle of people with controlling power limited to very trusted security engineers...

gtmtech commented 9 years ago

Don't know if this is any use or would work, but here's a bit of a leftfield suggestion for solving the case where I want to inject a secret into a container at runtime (e.g. a postgres password).

If I could override the entrypoint at docker run time and set it to a script of my choosing, e.g. /sbin/get_secrets, then after getting secrets from a mechanism of my choosing (e.g. KMS) it would exec the original entrypoint, thus becoming a mere wrapper whose sole purpose is to set up environment variables with secrets in them INSIDE the container. Such a script could be supplied at runtime via a volume mount. Such a mechanism would not involve secrets ever being written to disk (one of my pet hates), or being leaked by docker (not part of docker inspect), but would ensure they only exist inside the environment of process 1 inside the container, which keeps the 12-factorness.

You can already do this (I believe) if entrypoint is not used in the image metadata but only cmd is, as the entrypoint then wraps the command. As mentioned, the wrapper could then be mounted at runtime via a volume mount. If entrypoint is already used in the image metadata, then I think you cannot accomplish this at present, unless it is possible to see what the original entrypoint was from inside the container (not the cmdline override); not sure whether you can do that or not.

Finally, I think it would even be possible to supply an encrypted one-time key via traditional env-var injection, which the external /sbin/get_secrets could use to request the actual secrets (e.g. the postgres password), thus adding an extra safeguard against docker leaking the one-time key.

I can't work out if this is just layers on layers, or whether it potentially solves the issue. Apologies if it's just the first.
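A minimal, docker-free sketch of that wrapper idea (all names are illustrative; a plain file stands in for the KMS/Vault-style backend, and the final command stands in for the image's original entrypoint):

```shell
#!/bin/sh
# get_secrets sketch: load secrets into the environment of the wrapped process
# only, then exec the real entrypoint. Nothing is written to disk inside the
# container and nothing shows up in `docker inspect`.
SECRETS_DIR="$(mktemp -d)"
export SECRETS_DIR
printf 's3cret' > "$SECRETS_DIR/db_password"      # demo "backend"

cat > "$SECRETS_DIR/get_secrets" <<'EOF'
#!/bin/sh
DB_PASSWORD="$(cat "$SECRETS_DIR/db_password")"   # fetch from the backend
export DB_PASSWORD
exec "$@"                                         # become the real entrypoint
EOF
chmod +x "$SECRETS_DIR/get_secrets"

# Simulates: docker run -v .../get_secrets:/sbin/get_secrets \
#            --entrypoint /sbin/get_secrets image original-entrypoint
OUT="$("$SECRETS_DIR/get_secrets" sh -c 'echo "$DB_PASSWORD"')"
echo "$OUT"                                       # prints: s3cret
```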

gtmtech commented 9 years ago

@thaJeztah - I can confirm the solution I pose above works: secrets are manifested without being leaked by docker; they exist only in-memory for process 1 via environment variables, which is perfectly 12-factor compliant, but they DO NOT show up in the docker API under docker inspect, or anywhere else, because they are specific to process 1. Zero work is required in the image for this to work. In my case I compiled a golang static binary to do the fetching of the secrets, so it could be volume-mounted; I overrode the entrypoint with it, and the binary issues a sys exec to transfer control to the image-defined entrypoint when finished.

kaos commented 9 years ago

@gtmtech Interesting. I'd be interested in how you found out what the original entrypoint was from your get-secrets binary.

dreamcat4 commented 9 years ago

Maybe an example code folder would make the approach a bit easier to demonstrate / understand.

gtmtech commented 9 years ago

Example code and working scenarios here, @dreamcat4 @kaos:

https://github.com/gtmtechltd/secret-squirrel

asokani commented 9 years ago

I may be wrong, but why these complicated methods? I rely on standard unix file permissions. Hand over all secrets to docker with -v /etc/secrets/docker1:/etc/secrets readable only by root and then there's a script running at container startup as root, which passes the secrets to appropriate places for relevant programs (for example apache config). These programs drop root permissions at startup so if hacked, they cannot read the root-owned secret later. Is this method I use somehow flawed?
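A self-contained sketch of the file-permission mechanics described above (a temp directory stands in for the root-owned /etc/secrets, so the sketch runs without root; the docker command in the comment shows the shape of the real invocation):

```shell
#!/bin/sh
# Host side (illustrative): docker run -v /etc/secrets/docker1:/etc/secrets:ro app
SECRETS="$(mktemp -d)"                  # stands in for /etc/secrets
chmod 700 "$SECRETS"                    # only the owner (root) may enter
printf 'dbpass' > "$SECRETS/db_password"
chmod 600 "$SECRETS/db_password"        # only the owner may read

# A root-run startup script would now copy values into place (e.g. the apache
# config), and the service then drops root, losing all access to $SECRETS.
PERMS="$(stat -c %a "$SECRETS/db_password")"
echo "db_password mode: $PERMS"         # prints: db_password mode: 600
```

The security of this scheme rests entirely on the service actually dropping root after startup; anything still running as root inside the container can read the mount.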

kaos commented 9 years ago

Thanks @gtmtech :) Unfortunately, we have no standard entrypoint, nor am I able to run docker inspect prior to the docker run in a controlled manner.. But I like your approach.

dreamcat4 commented 9 years ago

I may be wrong, but why these complicated methods? I rely on standard unix file permissions. Hand over all secrets to docker with -v /etc/secrets/docker1:/etc/secrets readable only by root and then there's a script running at container startup as root, which passes the secrets to appropriate places for relevant programs (for example apache config). These programs drop root permissions at startup so if hacked, they cannot read the root-owned secret later. Is this method I use somehow flawed?

Hi, I agree and think this approach ^^ should be generally recommended as the best way for RUNTIME secrets, unless anybody else here has a strong objection to it. After which we can then also list any remaining corner cases (at RUNTIME) which are not covered by it.

Unfortunately I can't see secret squirrel taking off, because it's simply too complicated for most regular non-technical people to learn and adopt as a popular strategy.

So then that leaves (you've probably guessed it already)... Build-time secrets!

But I think that's progress! After a long time of not really getting anywhere, maybe that cuts things in half and solves approx. 45-50% of the total problem.

And if there are still remaining problems around secrets, at least they will be more specific / focussed ones that we can keep progressing on / tackle afterwards.

gtmtech commented 9 years ago

Yep, I won't go into too much detail, but these approaches would never work for a situation I am currently working with, because I need a higher level of security than they provide. E.g.: no secrets unencrypted on disk, no valid decryption keys once they've been decrypted in the target process, regular encryption rotation, and a single repository for encrypted secrets (not spread across servers). So it's more for people who need that level of security that I've suggested a possible approach.

secret_squirrel is anyway a hack in a space where I can't see any viable solutions yet, given that docker doesn't yet provide a secrets API or a pluggable secrets driver (which hopefully it will at some point). But perhaps it serves to illustrate that setting ENV vars inside the container before process exec, but not as part of the docker create process (or metadata), is a secure way of being 12-factor compliant with secrets, and maybe the docker development community can use that idea when they start to build out a secrets API/driver, if they think it's a good one!

Happy dockering!

mdub commented 9 years ago

We've been using the kind of approach that @gtmtech describes, with great success. We inject KMS-encrypted secrets via environment variables, then let code inside the container decrypt them as required.

Typically that involves a simple shim entrypoint in front of the application. We're currently implementing that shim with a combination of shell and a small Golang binary (https://github.com/realestate-com-au/shush), but I like the sound of the pure-Go approach.