Open thaJeztah opened 10 years ago
@gtmtech @mdub I would definitely be pleased to see more of this. @dreamcat4 I think the definition of "complicated" might be path dependent, which is obviously quite OK. Yet it probably cannot be an abstract judgment. Even so, a security wrapper within the docker container doesn't seem overly complicated to me at the design level. Another aspect is best practices: those need to be looked at not from a developer-only perspective but from an operations perspective. My 2 cents.
Vault +1
Vault -1. Vault has some operational characteristics (unsealing) that make it really undesirable for a lot of people.
Having a pluggable API would make the most sense.
There's also Ansible's vault, though that is rather a different beast.
@gtmtech thanks for the suggestion, it inspired me to write this entrypoint:
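#!/bin/bash
# Turn each file in /var/secrets into an exported, upper-cased environment variable.
if [ -d "/var/secrets" ]; then
  tmpfile="$(mktemp)"
  for file in /var/secrets/*
  do
    if [ -f "$file" ]; then
      file_contents=$(cat "$file")
      filename=$(basename "$file")
      # my-secret -> my_secret -> MY_SECRET
      underscored_filename="${filename//-/_}"
      capitalized_filename="${underscored_filename^^}"
      # %q shell-quotes the value so spaces and special characters survive sourcing
      printf 'export %s=%q\n' "$capitalized_filename" "$file_contents" >> "$tmpfile"
    fi
  done
  source "$tmpfile"
  rm -f "$tmpfile"
fi
# Hand over to the container's real command, which inherits the exported variables
exec "$@"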
I just add it to the Dockerfile like this (don't forget to chmod +x it):
ENTRYPOINT ["/app/docker-entrypoint.sh"]
And voila. ENV vars available at runtime. Good enough :)
If I understand correctly, the /var/secrets dir should be mounted through volumes, right?
Also, given the comments about secrets not being written to disk, how bad is it to write them to disk and then delete them?
Nice one! You should use shred to safely delete the file though.
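For anyone worried about the temporary file, here is a minimal sketch of the same idea that exports each value directly, so nothing is written back to disk. The function name load_secrets is hypothetical, and the sketch assumes every file name in the directory becomes a valid variable name once dashes are replaced and letters are upper-cased.

```shell
#!/bin/bash
# load_secrets DIR (hypothetical helper): export every regular file in DIR as
# an environment variable. The variable name is the file name with dashes
# turned into underscores and letters upper-cased.
load_secrets() {
  local dir="$1" file name
  [ -d "$dir" ] || return 0
  for file in "$dir"/*; do
    [ -f "$file" ] || continue
    name="$(basename "$file" | tr '[:lower:]' '[:upper:]' | tr '-' '_')"
    # Export NAME=value directly; no temp file is created.
    # Note: $(cat ...) drops trailing newlines from the file contents.
    export "$name=$(cat "$file")"
  done
}

# In an entrypoint this would be followed by:
#   load_secrets /var/secrets
#   exec "$@"
```

The trade-off is the same as before: the values still end up in the environment of the main process, so this only avoids the extra copy on disk.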
Inspired by @gtmtech's "secret-squirrel", I've extended my secret-management tool "shush" to make it usable as an image entry-point:
ADD shush_linux_amd64 /usr/local/bin/shush
ENTRYPOINT ["/usr/local/bin/shush", "exec", "--"]
This decrypts any KMS_ENCRYPTED_xxx environment variables, and injects the results back into the environment.
https://github.com/realestate-com-au/shush#use-as-a-command-shim
So the thread begins with DO NOT DO ANY OF THESE THINGS.....
... but I don't see any PLEASE DO THESE THINGS INSTEAD...only various proposals/hacks that have mostly been rejected/closed.
What IS the official best-practice for now? As a docker user it's somewhat frustrating to see a long list of things we shouldn't do but then have no official alternatives offered up. Am I missing something? Does one not exist? I'm sure things are happening behind-the-scenes and that this is something that the docker team is working on, but as of right now, how do we best handle secret management until a canonical solution is presented?
@alexkolson As far as I understood, if you need secrets in runtime, you should either use volumes (filesystem secrets) or some services like HashiCorp Vault (network secrets).
For build-time secrets, it's more complicated. Volumes are not supported at build time, so you should use containers to execute commands that modify filesystem, and use docker commit.
So what's missing is the ability to manage secrets at build time using nothing except a Dockerfile, without the need to use docker commit.
Some people even say that using the filesystem for secrets is not secure, and that the docker daemon should provide some API to provide secrets securely (using network/firewall/automounted volume?). But nobody even has an idea of what this API would look like and how one would use it.
When I think of short comings of env vars, I think of non-docker specific issues such as:
The weaknesses presented at the top of this thread:
Accessible by any process in the container, thus easily "leaked"
Cross apply 1 & 2 from above. Legit but addressed with being careful right? Plus, your docker container runs far fewer processes than a full stack web server.
What about config in env vars, but secret env vars have encrypted values and the app has the key in code? This is just obfuscation, because the key is in code, but would require exploits to gain access to both the key and env vars. Maybe use configuration management to manage the key on the docker host rather than in the app code. May help with rogue processes and accidental leaks but obviously not injection attacks from someone who has the key.
Preserved in intermediate layers of an image, and visible in docker inspect
Are people baking env vars into docker images rather than setting them at run time, or am I misunderstanding this one? Never bake secrets into artifacts, right? Yes, sudo docker inspect container_name gives the env var, but if you're on my production server then I've already lost. sudo docker inspect image_name does not have access to my env vars set at run time.
Shared with any container linked to the container
How about don't use links and the new networking instead?
The only issue that seems like a docker issue and not universal is links...
Put me in the camp of folk who need a good way to handle secrets during docker build. We use composer for some PHP projects and reference some private GitHub repos for dependencies. This means that if we want to build everything inside containers, the build needs ssh keys to access these private repos.
I've not found a good and sensible way to handle this predicament without defeating some of the other things that I find beneficial about docker (see: docker squash).
I've now had to regress to building parts of the application outside of the container and using COPY to bring the final product into the container. Meh.
I think docker build needs some functionality to handle ephemeral data like secrets so that they don't find their way into the final shipping container.
I think docker build needs some functionality to handle ephemeral data like secrets
This is a philosophical rather than a technical problem. Such ephemeral data would defeat docker's essential benefit: reproducibility.
Docker's philosophy is that your Dockerfile along with a context is enough to build an image. If you need a context to be outside of resulting image, you should fetch it from network and skip writing to filesystem. Because every Dockerfile line results in a filesystem snapshot.
If secrets should not be part of an image, you could run an ephemeral container, which would mirror/proxy all your secret-protected resources and provide secret-less access. Mirroring, btw has another rationale: https://developers.slashdot.org/story/16/03/23/0652204/how-one-dev-broke-node-and-thousands-of-projects-in-11-lines-of-javascript
You can share ssh key itself as well, but you wouldn't be able to control its usage.
@bhamilton-idexx if you make sure that the authentication to your private repositories works with a short-lived token, you don't have to worry about the secret being persisted in the docker image. You have the build system generate a token with a TTL of 1 hour and make this available as an environment variable to the docker build. Your build can fetch the required build details, but the secret times out shortly after your build completes, closing that attack vector.
Been reading a bunch of these threads now, and one feature that would solve some use cases here and would have use cases outside of secrets is a --add flag for docker run that copies a file into the container, just like the ADD statement in Dockerfiles.
this article is A+ http://elasticcompute.io/2016/01/21/runtime-secrets-with-docker-containers/
That is indeed a great article. Very good read. And exactly the sort of thing we have been hoping to see.
BTW:
Also found a couple of other secrets tools which seem to have been missed in the article. Sorry for any repetition / duplication. I didn't notice them mentioned here yet either:
Build time secrets:
https://github.com/defunctzombie/docket
Run time secrets:
https://github.com/ehazlett/docker-volume-libsecret
What do people think? Many thanks.
For me:
These newer tools ^^ look very good now. And they certainly didn't exist when we first started this ticket. BUT the main thing I feel is still missing the most:
Having a better capability for build-time secrets on the DockerHub. It is poor there and forces an either-or choice: we must forgo the benefits of one solution for the benefits of the other, depending on which overall set of features is more important. Local building is definitely better for keeping secrets safe, but understandably worse than the DockerHub in other ways.
We've written another tool, similar to docket, that uses the new image format:
https://github.com/AngryBytes/docker-surgery
Our implementation first creates a layer containing secrets commented SECRETS, then creates a copy of the Dockerfile with a modified FROM, builds, and finally removes all SECRETS layers from the resulting image.
There are always caveats to hacking this, and it'd be swell if docker had a rebasing or layer-splicing functionality built in. Removing intermediate layers right now is slow, because all solutions have to do a docker save / docker load dance behind the scenes.
Furthermore, build caching is broken. Right now, we use docker commit to create the commented secrets layer, but keeping a proper cache of these layers is still a bunch of work, which we're unlikely to do. Using a Dockerfile to create the secrets layer may solve this, but there's no means of commenting the layer, making it difficult to pinpoint what to remove afterwards.
@Vanuan [Dockerfile] can't have reproducibility. The RUN command guarantees that you and I cannot reasonably expect to get the exact same image out of two runs. Why? Because most of the time people use RUN to access network resources. If you want the same image as me you need to create your own image 'FROM' mine. No other arrangement will give us the same images. No other arrangement can give us the same images. All durable reproducibility comes from Docker Hub, not Dockerfile.
If the only defense for why we can't have ephemeral data is because Docker thinks they can remove all of the ephemeral data, then you have to deprecate the RUN instruction.
@stephank I've implemented a docker build tool at work that takes a slightly different approach. My main concern was not for build time secrets, but it takes care of that as well (keeping the secrets out of the built image, that is, how you get hold of the secrets in the first place is still up to you).
And that is by running a "build manager" with the project code in a VOLUME. The manager then runs any number of build tools in separate containers that mount the project code using volumes from the manager. So any built artifacts and other produced files are kept in the manager volume and follows along the build pipeline for each build step. At the end, the manager can build a final production image using the produced build result. Any secrets needed along the way have been available in the manager and/or the build containers, but not the final image. No docker image wizardry used, and build caches work as expected.
What the build pipeline looks like is entirely up to the project using a spec file configuring the build requirements.
As a matter of fact, I'm rather hyped about this tool, I'm just waiting for us to be able to release it as open source (pending company policy guidelines to be adopted)..
@kaos On the one hand, I didn't want to deviate from the stock Docker tooling. On the other hand, I feel like there should really be some more competition among image build tools. So that sounds interesting! 😀
@thaJeztah for environment (12-factor) secrets, we're locking down the Docker daemon via Twistlock (+Scalock) to prevent leakage of environment variables via inspect. Would be great if we had a Docker-native ability to not leak as much privileged info via inspect to make this a more proper reality.
@alexkolson I think the key to this thread is "DON'T DO THIS" unless you have mitigated X, Y, Z. It's clearly an engineering thread - there will always be "solutions" to common problems. That said, education on what not to do and why is important so the real workarounds can begin. The devil is always in the defaults - so we need to make sure new users understand what is at risk.
Maybe some of you guys can help me, because I don't have that much experience with docker yet. I used HashiCorp's Vault to fetch my secrets.
What I basically did was pass a token as a build argument; the token can be used to fetch sensitive information from Vault. This happens at build time and can only succeed if the Vault is in the "unsealed" (open for fetching data) state. After building, the used token is revoked.
But I think I'm still facing a few common issues.
It's possible to find the used token with docker inspect, but it cannot be used anymore. I made the choice to seal and unseal HashiCorp's Vault only at build time to limit access to the secrets store as much as possible. I also didn't see an option to keep secrets safe when fetching data at runtime.
So how badly did I do (it's OK to say if I messed up big time ;) )? Does anyone have tips and tricks for me to make things more secure?
@weemen AFAIK storing secrets in your image is also not a good idea. Your image should have no credentials baked in (including Vault tokens). Instead, use Vault's app-id auth backend for your containers to get secrets on load time. Store them in the container's memory somehow, depending on the app stack you're using.
Also, Vault is working on an AWS auth backend that will prove useful in the future if you're using AWS as a cloud provider.
@jaredm4 Can you please clarify this statement?:
"Instead, use Vault's app-id auth backend for your containers to get secrets on load time. Store them in the container's memory somehow, depending on the app stack you're using."
I'm not yet clear on when/where to retrieve the secrets from Vault (or Keywhiz, etc). Is this done before the docker run and passed to the run command? Is this happening at some point during container initialization (if so, any examples)? Should my application retrieve these when needed? For example, my rails app needs Google API keys, do I write something inside rails to call to vault when the keys are needed?
I think I'm clear on the need for using something like Vault, and clear on how to configure it; I'm just not clear on how to consume the service and get my yml files updated and ready when Rails boots.
Any guidance here would be appreciated. Thanks
Sure @mcmatthew, though I must preface by saying I'm also still trying to master Vault so my experience is pretty light.
The way I have been trying to code it is that the only info you pass to the container is something needed for your code to be able to authenticate with Vault. If you're using the app-id backend, that would be the app-id itself, and the address of your Vault.
On container boot, your Rails app will notice it doesn't have secrets yet, and must fetch them from Vault. It has the provided app-id, and will need to somehow generate its user-id. This user-id generation will need to be determined by you, but their documentation hints that "it is generally a value unique to a machine, such as a MAC address or instance ID, or a value hashed from these unique values."
Once your Rails app has the app-id and user-id ready, it can then use Vault's API to /login. From there you can then make API calls to get your needed secrets.
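The flow described above can be sketched roughly as follows. This is not taken from the Vault documentation or from anyone's production setup: the helper names derive_user_id and login_payload are hypothetical, hashing a MAC address is just one of the options the quoted documentation mentions, and the login path in the usage comment assumes the app-id backend is mounted at its default location.

```shell
#!/bin/bash
# Hypothetical helpers; these names are illustrative and not part of Vault.

# Derive a user-id by hashing a machine-unique value (for example a MAC address).
derive_user_id() {
  printf '%s' "$1" | sha256sum | cut -d' ' -f1
}

# Build the JSON body for an app-id login request.
# Assumes neither value contains characters that need JSON escaping.
login_payload() {
  printf '{"app_id":"%s","user_id":"%s"}' "$1" "$2"
}

# Usage (needs a reachable Vault; APP_ID and VAULT_ADDR are supplied by your
# deployment, and eth0 is only an example interface name):
#   user_id="$(derive_user_id "$(cat /sys/class/net/eth0/address)")"
#   curl -s -X POST -d "$(login_payload "$APP_ID" "$user_id")" \
#     "$VAULT_ADDR/v1/auth/app-id/login"
```

The response to the login call carries a client token, which is then sent with later requests for the actual secrets. The same steps would be done in application code (Rails in this case) if you follow the advice to keep this logic out of the entrypoint.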
Now to clarify what I meant about storing them in memory -- this varies depending on the type of app you're using, but with Rails there should be a way to store your secrets in a userland variable cache that will allow Rails to access the secrets from memory every request instead of getting them from Vault over and over (which as you can imagine would be slow). Take a look at this guide about caching in Rails. Namely, section 2.0, but ensuring it's using memory_cache and not disk.
Lastly, make sure that however you code it, you do it in Rails and not with a special Docker entrypoint script or similar. Rails should check for secrets in memory and, if they don't exist, fetch them.
I hope that helps. I know, a little high level, but this is how we've planned to tackle it.
What's not clear is what should be kept secret, app-id, user-id or both.
OK, the answer is both: https://www.vaultproject.io/docs/auth/app-id.html. But it's still not clear why it is any more secure than just plain firewalled access. Maybe it's that each host secret should be tied to an application (policy) secret? I.e. if you have access to a host's secret, you'd be able to access certain applications if you know their secret names?
Now we need to store 2 tokens somewhere?
@Vanuan They should both be kept as secret as possible, yes.
The app-id's main purpose is to restrict access to certain secrets inside Vault via Policies. Anyone with access to the app-id gains access to that app-id's policies' secrets. The app-id should be provided by your deployment strategy. For example, if using Chef, you could set it in the parameter bags (or CustomJSON for OpsWorks). However, on its own, it won't allow anyone access to Vault. So someone who gained access to Chef wouldn't then be able to then go access Vault.
The user-id is NOT provided by Chef, and should be tied to specific machines. If your app is redundantly scaled across instances, each instance should have its own user-id. It doesn't really matter where this user-id originates from (though they give suggestions), but it should not come from the same place that deployed the app-id (i.e., Chef). As they said, it can be scripted, just through other means. Whatever software you use to scale instances could supply user-ids to the instances/docker containers and authorize the user-id to the app-id. It can also be done by hand if you don't dynamically scale your instances. Every time a human adds a new instance, they create a new user-id, authorize it to the app-id, and supply it to the instance via whatever means best suits them.
Is this better than firewalling instances? Guess that depends. Firewalling doesn't restrict access to secrets in Vault (afaik), and if someone gained access to your instances, they could easily enter your Vault.
This way, it's hard for them to get all the pieces of the puzzle. To take it one step further, app-id also allows for CIDR blocks which you should use. If someone somehow got the app-id and user-id, they still couldn't access Vault without being on that network.
(Again, this is my interpretation after grokking the documentation the best I could)
@Vanuan @mcmatthew great questions! @jaredm4 thanks a lot for this clarification, it will certainly help me. This is very useful for everyone who is looking for a more practical implementation! If I have time somewhere in the upcoming two weeks then I'll try again!
@thaJeztah:
Accessible by any process in the container, thus easily "leaked"
Can you support this claim? Non-privileged processes cannot access the environment variables of non-parent processes. See https://help.ubuntu.com/community/EnvironmentVariables#Process_locality.
Environment variables set for the container (via --env or --env-file) are accessible by any process in the container.
Of course, since they are children of the entry point process. It's the job of that process, or you in case it's e.g. a shell, to unset the secret environment variables as soon as possible.
What is more relevant is whether processes with a different user ID other than 0 can access these environment variables inside and/or outside the container. This shouldn't be the case either, when the software you use inside the container properly drops privileges.
I know it's off topic but has anyone else noticed that this issue has been active for almost a full year now! Tomorrow is its anniversary. 👏
Would it be possible for a container process to read env variables in process memory and then to un-set them (in the environment) ? Does this fix most of run-time security concerns ?
@davibe the problem with that is that if the container or its process(es) restarts, those env vars are then gone, with no way to recover them.
I tried but it looks like env vars are still there after relaunch.
dade@choo:~/work/grocerest(master)$ cat test.js
console.log("FOO value: " + process.env.FOO);
delete(process.env.FOO);
console.log("FOO value after delete: " + process.env.FOO);
dade@choo:~/work/grocerest(master)$ docker run --name test -it -e FOO=BAR -v $(pwd):/data/ node node /data/test.js
FOO value: BAR
FOO value after delete: undefined
dade@choo:~/work/grocerest(master)$ docker restart test
test
dade@choo:~/work/grocerest(master)$ docker logs test
FOO value: BAR
FOO value after delete: undefined
FOO value: BAR
FOO value after delete: undefined
maybe docker-run is executing my thing as a child of bash ? I think it should not..
@davibe:
unset 'SECRET_ENV_VAR'
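To make the read-then-unset idea concrete, here is a minimal sketch for a shell entrypoint. The function name take_secret and the variable SECRET_VALUE are hypothetical. It only keeps the value away from child processes started afterwards: docker inspect still shows what was passed with -e, and on Linux /proc/<pid>/environ keeps reporting the initial environment of a process.

```shell
#!/bin/bash
# take_secret NAME (hypothetical helper): copy an exported variable into the
# non-exported shell variable SECRET_VALUE, then remove it from the environment.
take_secret() {
  # printenv prints the value of the exported variable named by $1
  SECRET_VALUE="$(printenv "$1")"
  # After unset, processes started from this shell no longer inherit it
  unset "$1"
}

# Usage inside an entrypoint:
#   take_secret SECRET_ENV_VAR
#   ... use "$SECRET_VALUE" here ...
#   exec "$@"
```

As the restart test above shows, the variable comes back whenever the container is restarted, because Docker re-applies the container's configured environment on each start.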
I think the main problem/feature in all this is that you log into Docker as root, thus anything you put inside a container can be inspected, be it a token, a volume, a variable, an encryption key... anything.
So one idea would be to remove sudo and su from your container and add a USER command before any ENTRYPOINT or CMD. Anybody running your container should now get no chance to run as root (if I'm not wrong) and thus you could now actually hide something from him.
Another idea (best IMHO) would be to add the notion of users and groups to the Docker socket and to the containers, so that you could tell GROUP-A has access to containers with TAG-B, and USER-C belongs to GROUP-A so it has access to those containers. It could even be a permission per operation (GROUP-A has access to start/stop for TAG-B, GROUP-B has access to exec, GROUP-C has access to rm/inspect, and so on).
After researching this for a few hours, I cannot believe that there seems to be no officially recommended solution or workaround for build-time secrets, and something like https://github.com/dockito/vault seems to be the only viable option for build-time secrets (short of squashing the whole resulting image or building it manually in the first place). Unfortunately https://github.com/dockito/vault is quite specific to ssh keys, so off I go to try to adapt it for hosting git https credential store files as well...
After what seems like forever (originally I heard it was slated for Q4 2015 release), AWS ECS seems to have finally come thru on their promise to bring IAM roles to docker apps. Here is the blog post as well.
Seems like this combined with some KMS goodness is a viable near term solution. In theory you just have to make the secrets bound to certain principals/IAM roles to keep non-auth roles from asking for something they shouldn't and leave safe storage to KMS.
Haven't tried it yet, but it's on my short list...
Kubernetes also seems to have some secrets handling that reminds me a lot of Chef encrypted databags.
I understand this isn't the platform-independent OSS way that is the whole point of this thread, but wanted to throw those two options out there for people playing in those infrastructure spaces who need something NOW.
I just ran across something that might help in this regard: https://github.com/docker/docker/pull/13587
This looks like it is available starting with docker v1.10.0, but I hadn't noticed it till now. I think the solution I'm leaning toward at this point is using https://www.vaultproject.io/ to store and retrieve the secrets, storing them inside the container in a tmpfs file system mounted to /secrets or something of that nature. With the new ECS feature enabling IAM roles on containers, I believe I should be able to use vault's AWS EC2 auth to secure the authorization to the secrets themselves. (For platform independent I might be inclined to go with their App ID auth.)
In any case, the missing piece for me was where to securely put the secrets once they were retrieved. The tmpfs option seems like a good one to me. The only thing missing is that ECS doesn't seem to support this parameter yet, which is why I submitted this today: https://github.com/aws/amazon-ecs-agent/issues/469
All together that seems like a pretty comprehensive solution IMHO.
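As a rough illustration of the "retrieve, then store in tmpfs" step, the sketch below writes a fetched value into a directory with owner-only permissions. The helper name write_secret is hypothetical, the /secrets path is the example used above, and how the value is fetched (Vault or anything else) is left out on purpose.

```shell
#!/bin/bash
# write_secret DIR NAME VALUE (hypothetical helper): store VALUE in DIR/NAME,
# readable and writable by the owner only.
write_secret() {
  local dir="$1" name="$2" value="$3"
  (
    # Run in a subshell so the stricter umask does not leak to the caller
    umask 077
    printf '%s' "$value" > "$dir/$name"
  )
}

# Usage, assuming the container was started with a tmpfs mount such as
#   docker run --tmpfs /secrets ...
# so that /secrets lives in memory rather than on the container's disk:
#   write_secret /secrets api_key "$value_fetched_from_vault"
```

The tmpfs mount is what keeps the data off disk; the helper only makes sure other users inside the container cannot read the file.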
@CameronGo, thanks for the pointer. If I understand correctly, this can't be used at build time though, or can it?
@NikolausDemmel sorry yes, you are correct. This is only a solution for run time secrets, not build time. In our environment, build time secrets are only used to retrieve code from Git. Jenkins handles this for us and stores the credentials for Git access. I'm not sure the same solution addresses the needs of everyone here, but I'm unclear on other use cases for build time secrets.
Jenkins handles this for us and stores the credentials for Git access.
How does that work with docker? Or do you not git clone inside the container itself?
After reading through this issue in full, I believe it would benefit immensely from being split into separate issues for "build-time" and "run-time" secrets, which have very different requirements.
If you are like me and you come here trying to decide what to do right now, then FWIW I'll describe the solution I settled on, until something better comes around.
For run-time secrets I decided to use http://kubernetes.io/docs/user-guide/secrets/. This only works if you use kubernetes. Otherwise vault looks ok. Anything secret either in generated image or temporary layer is a bad idea.
Regarding build-time secrets - I can't think of other build-time secrets use case other than distributing private code. At this point, I don't see better solution than relying on performing anything "secret" on the host side, and ADD the generated package/jar/wheel/repo/etc. to the image. Saving one LOC generating the package on the host side is not worth risking exposing ssh keys or complexity of running proxy server as suggested in some comments.
Maybe adding a "-v" flag to the docker build, similar to docker run flag could work well? It would temporarily share a directory between host and image, but also ensure it would appear empty in cache or in the generated image.
I am currently working on a solution using Vault:
It is important that the secrets are removed within the same command, so when docker caches the given layer there are no leftovers. (This of course only applies to build-time secrets.)
I haven't built this yet, but I'm working on it.
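The "remove within the same command" rule can be sketched as a small wrapper. This is not the tool described above, only an illustration: with_secret_file is a hypothetical name, and the secret value is assumed to have been fetched already by whatever means you use.

```shell
#!/bin/bash
# with_secret_file PATH VALUE COMMAND... (hypothetical helper): write VALUE to
# PATH, run COMMAND, and delete PATH again before returning, whatever the
# command's exit status was.
with_secret_file() {
  local path="$1" value="$2" status=0
  shift 2
  printf '%s' "$value" > "$path"
  "$@" || status=$?
  rm -f "$path"
  return "$status"
}

# Usage, called from a single RUN instruction so that the layer snapshot is
# taken only after the file has been removed (paths are examples):
#   with_secret_file /tmp/token "$token" ./fetch-private-deps.sh
```

If the write, the use, and the delete were split across several RUN instructions, the file would survive in an intermediate layer, which is exactly the leftover the comment above warns about.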
Somewhat related to @kozikow 's comment: "Regarding build-time secrets - I can't think of other build-time secrets use case other than distributing private code."
Maybe not a build time secret specifically, but I have a use-case need for (securing) a password during build-time in a Dockerfile in order to allow for an already-built artifact to be downloaded via a RUN curl command. The build-time download requires user credentials to authenticate in order to grab the artifact - so we pass the password as an environment variable in the Dockerfile right now (we're still in Dev). Builds are happening behind the scenes automatically, as we use OpenShift, and environment variables in the Dockerfile are output to logs during the build, like any docker build command. This makes the password visible to anyone that has access to the logs, including our developers. I've been desperately trying to figure out a way to send the password so that it can be used during the docker build, but then not have the password output to logs or end up being in any layers.
I also second what @wpalmer said about breaking this thread into run-time and build-time.
Handling secrets (passwords, keys and related) in Docker is a recurring topic. Many pull-requests have been 'hijacked' by people wanting to (mis)use a specific feature for handling secrets.
So far, we only discourage people from using those features, because they're either demonstrably insecure, or not designed for handling secrets, hence "possibly" insecure. We don't offer them real alternatives, at least not for all situations, and where we do, without a practical example.
I just think "secrets" is something that has been left lingering for too long. This results in users (mis)using features that are not designed for this (with the side effect that discussions get polluted with feature requests in this area) and making them jump through hoops just to be able to work with secrets.
Features / hacks that are (mis)used for secrets
This list is probably incomplete, but worth a mention
curl-ing the secrets and removing them afterwards, all in a single layer (also see https://github.com/dockito/vault).
So, what's needed?
The above should be written / designed with both build-time and run-time secrets in mind
@calavera created a quick-and-dirty proof-of-concept on how the new Volume-Drivers (https://github.com/docker/docker/pull/13161) could be used for this; https://github.com/calavera/docker-volume-keywhiz-fs
In good tradition, here are some older proposals for handling secrets;
docker secret storage feature" https://github.com/docker/docker/pull/6697