coreos / vault-operator

Run and manage Vault on Kubernetes simply and securely
https://coreos.com/blog/introducing-vault-operator-project
Apache License 2.0

is the IPC_LOCK capability really needed? #311

Open raffaelespazzoli opened 6 years ago

raffaelespazzoli commented 6 years ago

Vault operator creates a Vault deployment that requests the IPC_LOCK capability. But in Kubernetes swap must be disabled (the kubelet now doesn't start if swap is active). So, if Vault can be set to run with disable_mlock=true, then IPC_LOCK can probably be removed. This makes deployment simpler in those organizations where pod security contexts (Kubernetes) or SCCs (OpenShift) are closely scrutinized.

ncorrare commented 6 years ago

At the same time, Vault is ultimately a security product, so the idea of potentially swapping secrets that should only be available in memory is a considerable threat vector.

raffaelespazzoli commented 6 years ago

@ncorrare I did not understand your comment. In Kubernetes/OpenShift swap is disabled. It cannot be enabled, or else the cluster doesn't start. By requiring that capability you are just making the operator more difficult to deploy.

ncorrare commented 6 years ago

Yes, but Vault is not only deployed on Kubernetes/OpenShift. It is primarily deployed on single-tenant systems, as suggested by its own production hardening guide (https://www.vaultproject.io/guides/operations/production.html). If the controller project wants to disable mlock by default within the build, that is an option, but not one that HashiCorp Vault should support by default.

In most organisations where HashiCorp Vault is deployed in production, it is offered as a capability to a Kubernetes cluster and does not run within the cluster. Also, the most commonly used backend is Consul.

raffaelespazzoli commented 6 years ago

@ncorrare I am not asking to change the way Vault works, just the way the operator installs it. The operator will always install Vault in Kubernetes/OpenShift, so it makes sense not to require IPC_LOCK. So I'm asking to start Vault with disable_mlock=true, which removes the need to ask for the additional IPC_LOCK capability. In this example I show how to install Vault without IPC_LOCK: https://github.com/raffaelespazzoli/credscontroller/tree/master/examples/spring-native-example

Right now the experience of installing Vault in OpenShift with the operator is bad (it basically doesn't work unless you create a custom SCC or run in privileged mode). The same problem will appear in Kubernetes as soon as pod security policies (https://kubernetes.io/docs/concepts/policy/pod-security-policy/) get out of alpha and are enforced.

hasbro17 commented 6 years ago

@raffaelespazzoli I agree, right now using the vault-operator on OpenShift or any k8s cluster with pod security policies is painful due to the IPC_LOCK requirement.

You're correct that by default the kubelet won't start if swap is enabled. That said, running k8s with swap enabled still seems to be a common use case that people have workarounds for: https://github.com/kubernetes/kubernetes/issues/53533#issuecomment-355526636
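Since some clusters do run with swap enabled, it can help to check a node before relying on disable_mlock=true. Below is a minimal sketch, assuming a Linux node where /proc/swaps holds a header line followed by one line per active swap area. The function names are hypothetical and are not part of vault-operator or Kubernetes.

```python
def swap_is_active(proc_swaps_text: str) -> bool:
    """Return True if the text of /proc/swaps lists at least one swap area."""
    lines = [line for line in proc_swaps_text.splitlines() if line.strip()]
    # The first non-empty line is the column header; any further line is a swap area.
    return len(lines) > 1


def node_swap_is_active(path: str = "/proc/swaps") -> bool:
    """Read the swap table on a Linux host (assumption: a missing file means no swap)."""
    try:
        with open(path, encoding="utf-8") as handle:
            return swap_is_active(handle.read())
    except FileNotFoundError:
        return False
```

Calling node_swap_is_active() on the node itself (not inside an unrelated container) reports whether swap is in use there; if it returns True, keeping mlock and IPC_LOCK is the safer choice.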

I think we can set disable_mlock=true and remove the IPC_LOCK capability by default, since the default setting for k8s is having swap disabled. The official Vault docs do state: "Disabling mlock is not recommended unless the systems running Vault only use encrypted swap or do not use swap at all."

For the less common case where users have swap enabled, we can let them set disable_mlock=false via the Custom Resource spec. In that case the Vault pods would require the IPC_LOCK capability as they do now, and it's up to the users to create a custom SCC or bypass the pod security policy.
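The proposal above could be sketched roughly as follows. This is only an illustration in Python, not the operator's actual code: the `disable_mlock` spec key and the `vault_settings` function are hypothetical, and the real Custom Resource field name may differ. The `capabilities.add` structure mirrors a Kubernetes container securityContext.

```python
def vault_settings(spec: dict) -> dict:
    """Derive the Vault config line and container securityContext from a CR spec.

    Assumption: the spec carries a hypothetical boolean key "disable_mlock"
    that defaults to True (swap disabled, the Kubernetes default).
    """
    disable_mlock = bool(spec.get("disable_mlock", True))

    # Vault's configuration file accepts a top-level disable_mlock setting.
    config_line = f"disable_mlock = {'true' if disable_mlock else 'false'}"

    # IPC_LOCK is only requested when Vault will actually call mlock.
    if disable_mlock:
        security_context = {}
    else:
        security_context = {"capabilities": {"add": ["IPC_LOCK"]}}

    return {"config_line": config_line, "security_context": security_context}
```

With an empty spec the pod needs no extra capability; with `{"disable_mlock": False}` it requests IPC_LOCK as it does today, and the cluster must allow that through a custom SCC or pod security policy.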

nthienan commented 5 years ago

Hmm, I'm experiencing the same difficulty when deploying Vault on OpenShift. Does anyone know how to add a custom SCC in OpenShift?

riuvshin commented 5 years ago

@nthienan adding a new SCC is not hard, see https://docs.openshift.com/container-platform/3.11/admin_guide/manage_scc.html. The catch is that it requires cluster admin.

As @raffaelespazzoli mentioned, you may just ignore this if swap is disabled.