jtblin / kube2iam

kube2iam provides different AWS IAM roles for pods running on Kubernetes
BSD 3-Clause "New" or "Revised" License

Recommendation to get kube2iam to start before other pods #94

Open cknowles opened 7 years ago

cknowles commented 7 years ago

We're using kube2iam in the incubator project kube-aws, and I wondered whether there is any recommendation for getting kube2iam to start before other pods in the case of cluster scale-up. We are tracking the issue in https://github.com/kubernetes-incubator/kube-aws/issues/891.

jtblin commented 7 years ago

Sorry for the lag. Yes, kube2iam should definitely be started before any pod scheduling is done. I will update the readme to add a note about that.

cknowles commented 7 years ago

@jtblin thanks! Do you have any recommendations for achieving that? I was wondering if you were already doing something similar in your own projects. Perhaps we could work on a PR together, either here and/or in the Helm chart repo.

danopia commented 7 years ago

I configured all my services to terminate at launch if they don't already have proper IAM access. So when a new machine comes up, the app containers restart once or so and by then kube2iam is ready to go.

That's served me well enough in production, but a proper setup would obviously be ideal. Perhaps some automatic taint would help.

jrnt30 commented 7 years ago

Not really an issue with Kube2IAM or IAM roles in general; I'm just curious how you determine whether a service has "proper IAM access"?

I typically deploy Kube2IAM in one "pass" of critical system-level resources prior to deploying any "application"-level resources, because it would be difficult to determine, at the pod level, whether the permissions I was receiving were from the node's IAM role, kube2iam's default role, or the pod's role itself.

danopia commented 7 years ago

My instance roles have almost no permissions beyond what kube2iam needs, and I don't have a default role, so the applications can't work anyway without proper IAM.

That being said, getting the IAM STS credentials from the metadata service requires getting the role name first. If you know what role name you expect, or at least a regex pattern it should match, it's pretty straightforward to make sure you got a role you expect. Relevant docs: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html#instancedata-data-categories
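
For illustration, here is a minimal sketch of that check as a Kubernetes init container. The image, the expected role name, and the exact shell logic are placeholders layered on top of what's described above, not something kube2iam ships:

```yaml
# Pod spec fragment (hypothetical): fail fast unless the metadata service
# returns the role name this pod expects to receive from kube2iam.
initContainers:
  - name: verify-iam-role
    image: curlimages/curl:8.8.0   # placeholder; any image with sh and curl works
    command:
      - sh
      - -c
      - |
        set -e
        role=$(curl -sf http://169.254.169.254/latest/meta-data/iam/security-credentials/)
        echo "metadata service returned role: ${role}"
        # "my-app-role" is a placeholder for the role the pod is annotated with
        [ "${role}" = "my-app-role" ]
```

With the default restart policy, a failing init container just keeps the pod in Init and restarting until kube2iam is up and answering with the expected role, which is essentially the terminate-at-launch behaviour described above.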

cknowles commented 7 years ago

@danopia I agree that's how most services should work, and a lot of what we have does restart as appropriate. I have seen nodes where enough services try to start before kube2iam on scale-up that they fall into other issues like restart backoff. That by itself is not a problem, but it does cause scale-up to take quite a lot longer than needed. Hopefully whatever we come up with for kube-aws can be contributed back to this project.

jrnt30 commented 7 years ago

@c-knowles Reading through the docs a bit more thoroughly to prepare for the CKA exam, I ran across this little tidbit in https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/#how-daemon-pods-are-scheduled

DaemonSet controller can make pods even when the scheduler has not been started, which can help cluster bootstrap.

I wonder if this may be useful for your use case. I did not see an example process for doing this immediately, but admittedly I did not look very hard. I believe Kops manages a few of these components in a way more akin to https://kubernetes.io/docs/tasks/administer-cluster/static-pod/, but as noted, there are some drawbacks there.

I'm very interested in hearing how you end up approaching this situation so keep us informed!

rabbitfang commented 7 years ago

I accomplished this by moving the iptables configuration to the host (outside of kubelet). If kube2iam isn't started and pods try to load their IAM role (or access any other metadata), they just get a connection refused error. You can then have the pods fetch the IAM role as part of their readiness check, so a pod won't be considered ready unless kube2iam is set up. You can also use an init container that fetches the IAM role and verifies that it gets what it expects.
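
A rough sketch of the readiness-check half of that approach, assuming a curl-based exec probe (the timings are arbitrary). The probe keeps failing with connection refused until kube2iam is listening behind the host-level DNAT rule:

```yaml
# Pod spec fragment (hypothetical): the container is only marked Ready once a
# metadata request for the IAM role succeeds, i.e. once kube2iam is up on the node.
readinessProbe:
  exec:
    command:
      - sh
      - -c
      - curl -sf http://169.254.169.254/latest/meta-data/iam/security-credentials/
  initialDelaySeconds: 2
  periodSeconds: 5
  failureThreshold: 12
```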

cknowles commented 7 years ago

@jrnt30 thanks, it could be useful if there is some hook for that. Changing the DaemonSet to static pods could be another way of solving this issue, assuming they always boot fully prior to scheduling.

timm088 commented 5 years ago

@rabbitfang would you have an example of the iptables rule you moved to your host config?

Did you run into issues with init container usage for services that don’t require an IAM role?

TwiN commented 4 years ago

Would setting priorityClassName: system-cluster-critical on the kube2iam DaemonSet be sufficient?
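
For reference, that would be a one-field change on the DaemonSet's pod template, roughly like this (rest of the spec elided):

```yaml
# DaemonSet fragment: system-cluster-critical gives kube2iam scheduling
# priority (and preemption) over ordinary workloads.
spec:
  template:
    spec:
      priorityClassName: system-cluster-critical
      # containers, hostNetwork, tolerations, etc. unchanged
```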

korjek commented 3 years ago

We started using a nodetaint controller to make sure kube2iam (and some other critical DaemonSets) are running before other pods are scheduled on the node.
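
For anyone following the same route, the general pattern is roughly: register new nodes with a startup taint, let the critical DaemonSets tolerate it, and have the controller remove the taint once those pods are Ready. The taint key below is a placeholder; use whatever key your taint controller is configured to watch:

```yaml
# 1. New nodes register with a startup taint, e.g. via the kubelet flag:
#    --register-with-taints=example.com/critical-daemonsets-not-ready=true:NoSchedule
#
# 2. kube2iam (and other critical DaemonSets) tolerate that taint so they can
#    still land on the freshly tainted node:
spec:
  template:
    spec:
      tolerations:
        - key: example.com/critical-daemonsets-not-ready
          operator: Exists
          effect: NoSchedule
# 3. The taint controller removes the taint once the tolerating DaemonSet pods
#    are Ready; only then do ordinary pods get scheduled onto the node.
```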