bottlerocket-os / bottlerocket

An operating system designed for hosting containers
https://bottlerocket.dev

kubelet: Support running under a separate role. #1624

Open nairb774 opened 3 years ago

nairb774 commented 3 years ago

What I'd like:

To be able to run the kubelet with a different role than the role attached to the node. Through a few small changes, we have been able to remove all permissions[1] directly attached to the instance role of our EKS nodes. I'll try to give an overview of the changes that we've made, and hopefully this is something that can be supported. I will note that we have this working with our hardened EKS 1.18 node (modifications are based on https://github.com/aws-samples/amazon-eks-custom-amis), but it likely only makes sense to actually support this on 1.20+ given https://github.com/aws/eks-distro/issues/129. The following changes would likely be necessary:

  1. A new configuration option for the kubelet role would be necessary.
  2. An AWS config file with contents looking something like:
    [profile kubelet]
    credential_source = "Ec2InstanceMetadata"
    role_arn = "${kubelet_role_arn}"
  3. The following environment variables need to be set on the kubelet: AWS_CONFIG_FILE pointing at the config file above, and AWS_PROFILE set to the profile listed in the config file.
  4. The environment for invoking aws-iam-authenticator needs to have the AWS_PROFILE set back to default so it will use the EC2 identity for authenticating with the cluster. This is needed to match the default settings applied to the kube-system/aws-auth ConfigMap.
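
To make the list above concrete, here is a minimal sketch in Python of the three moving parts: rendering the config file, building the kubelet's environment, and building the environment for aws-iam-authenticator. The function names, the config path, and the example role ARN are all hypothetical and not part of Bottlerocket or our AMI. The sketch writes the values unquoted, which is the form the AWS documentation uses in its examples; the snippet above quotes them.

```python
import configparser
import io


def render_kubelet_aws_config(kubelet_role_arn: str, profile: str = "kubelet") -> str:
    """Render an AWS shared config file containing one assume-role profile.

    Hypothetical helper: the profile sources credentials from the EC2
    instance metadata and assumes the role given by kubelet_role_arn.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser[f"profile {profile}"] = {
        "credential_source": "Ec2InstanceMetadata",
        "role_arn": kubelet_role_arn,
    }
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


def kubelet_env(base_env: dict, config_path: str, profile: str = "kubelet") -> dict:
    """Return a copy of base_env with the two variables from step 3 set."""
    env = dict(base_env)
    env["AWS_CONFIG_FILE"] = config_path
    env["AWS_PROFILE"] = profile
    return env


def authenticator_env(kubelet_environment: dict) -> dict:
    """Return a copy of the kubelet environment with AWS_PROFILE reset (step 4).

    With the profile set back to default, aws-iam-authenticator is meant to
    authenticate as the EC2 instance identity rather than the kubelet role.
    """
    env = dict(kubelet_environment)
    env["AWS_PROFILE"] = "default"
    return env


if __name__ == "__main__":
    # Placeholder ARN and path, for illustration only.
    print(render_kubelet_aws_config("arn:aws:iam::111122223333:role/example-kubelet"))
    print(kubelet_env({}, "/etc/kubernetes/aws-config"))
```

How the two environments actually get applied (systemd unit, kubeconfig exec settings, or something else) is left open here; the sketch only shows which variables differ between the kubelet and the authenticator.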

It might be possible to do all of the assume-role configuration with a different setup, but this was what we were able to get to work.

With all of this in place, it should be possible to run the Bottlerocket VM with no permissions attached to the instance role, and instead have a second role that the kubelet uses for all of its needs. This extra layer of indirection helps to provide a little better permission isolation, and potentially some level of additional security[2].
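
For the second role to be usable, it has to trust the instance role. As a rough sketch (the ARN below is a placeholder, the helper name is hypothetical, and this is not necessarily the exact policy we run), the trust policy on the kubelet role could be generated like this:

```python
import json


def kubelet_role_trust_policy(instance_role_arn: str) -> dict:
    """Build a trust policy letting the node's instance role assume the kubelet role.

    Hypothetical helper; the returned document would be attached to the
    kubelet role as its assume-role (trust) policy.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": instance_role_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder ARN, for illustration only.
    policy = kubelet_role_trust_policy(
        "arn:aws:iam::111122223333:role/example-node-instance-role"
    )
    print(json.dumps(policy, indent=2))
```

The permissions the kubelet needs would then be attached to the kubelet role instead of the instance role. Whether the instance role also needs an identity policy allowing sts:AssumeRole depends on the account setup and is not covered by this sketch.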

We've had a hard time removing access to IMDS from pods (Datadog agent likes to give us trouble), so pulling the permissions to a different role prevents accidental permission use by pods in the cluster.

[1] Not exactly true, given that there is a role the kubelet assumes for operation, but this at least eliminates any easy-to-access permissions.

[2] I will concede that it may only be security through obscurity, but if my understanding is right it would require access to the IMDS as well as discovering or knowing the role to assume.

Any alternatives you've considered:

Any other options for being able to run the node with zero permissions directly attached to the instance role would be welcome.

nairb774 commented 3 years ago

I can understand if implementing something like this would be a lower priority. If I get time, I would be happy to poke at this as an excuse to learn the system, but I would really only start on it if there is an indication that such a change/PR would be in line with the larger goals of the project. Any input on whether this feature would be accepted if a PR were to show up?

jhaynes commented 3 years ago

Hi @nairb774 and thanks for the follow-up and for opening this issue. You’re describing an interesting use-case that we haven’t considered before. Can you tell us a bit more about what permissions you’re assigning to the instance role and what issues you’ve had restricting access to IMDS? Our expectation is that most folks configure the instance role to be fairly locked down (e.g. the policy shown here).

One of the things you mention is access to IMDS from pods. There is some discussion on that topic on the EKS Roadmap that has some interesting options (limiting the hop count to 1 seems to do the trick). EKS also supports both pod execution roles and IAM roles for service accounts, one or more of which may at least partially address your ask.

nairb774 commented 3 years ago

> Our expectation is that most folks configure the instance role to be fairly locked down (e.g. the policy shown here).

You can lock it down more. For example, these are all of the policies attached to our instance role:

[Image: worker]

As you can see, none of the policies you listed above are on the instance role. This role will have an empty set of policies once we have https://github.com/DataDog/datadog-agent/issues/7225 rolled out. It is also the Datadog agent that makes locking down IMDS to a hop count of 1 a little tricky: it scrapes metadata from IMDS to label logs, metrics, and traces[1]. Dropping that metadata is not preferred, so leaving IMDS available to the pods is the best we can do for now.

> There is some discussion on that topic on the EKS Roadmap that has some interesting options (limiting the hop count to 1 seems to do the trick).

Yep, limiting to 1 hop would be nice, but even in that case I'd still want this, to dramatically limit what is immediately available to processes that have access to the host network[2].

> EKS also supports both pod execution roles and IAM roles for service accounts one or more of which may at least partially address your ask.

Yep, we had moved arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy to IRSA before the EKS user guide had it documented. Small additional tweaks to our custom AMI have allowed us to achieve the nearly empty policy list for our instance roles. I'd like to get off of our custom AMI and on to something which is likely more secure from an OS perspective, while at the same time preserving some of the cloud "security"[3] we've been able to achieve. Being able to tell the kubelet to assume a different role while running would likely enable us to do this.

I'm happy to both dive into additional details, and/or receive feedback on the approach.

[1] Or, at least it did last time I looked earlier this year.

[2] Yes, we forbid host network for pods, but sandbox escapes are still a thing.

[3] Like I noted above - we haven't eliminated the permissions, just put them on a different role. This is in many ways security through obscurity, but just that extra hop allows us to turn a well-trodden path into a trap.

zmrow commented 3 years ago

Thanks for the additional details! We'll look into this.