Open cknowles opened 7 years ago
Sorry for the delay. Yes, kube2iam should definitely be started before any pod scheduling is done. I will update the README to add a note about that.
@jtblin thanks! Do you have any recommendations for achieving that? I was wondering if you were already doing something similar in your own projects? Perhaps we could work on a PR together, either here and/or in the Helm chart repo.
I configured all my services to terminate at launch if they don't already have proper IAM access. So when a new machine comes up, the app containers restart once or so and by then kube2iam is ready to go.
That's treated me well enough in production, but a proper setup would obviously be ideal. Perhaps some automatic taint handling would help.
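For the taint idea, here is a minimal sketch of what that could look like, assuming a hypothetical taint (e.g. `example.com/kube2iam-not-ready`) that node bootstrap applies via the kubelet's `--register-with-taints` flag and that some script or controller removes once kube2iam is up. Only the kube2iam DaemonSet tolerates the taint, so nothing else gets scheduled onto a fresh node before it:

```yaml
# Hypothetical taint applied at node registration, e.g.:
#   kubelet --register-with-taints=example.com/kube2iam-not-ready=true:NoSchedule
# A bootstrap script or controller would remove it once kube2iam is ready.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube2iam
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: kube2iam
  template:
    metadata:
      labels:
        name: kube2iam
    spec:
      hostNetwork: true
      tolerations:
        # Tolerate the bootstrap taint so kube2iam can start before everything else.
        - key: example.com/kube2iam-not-ready
          operator: Exists
          effect: NoSchedule
      containers:
        - name: kube2iam
          image: jtblin/kube2iam:latest
          # args omitted; see the kube2iam README for the flags your cluster needs
```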
This isn't really an issue with Kube2IAM or IAM roles in general, but I'm just curious: how do you determine whether a service has "proper IAM access"?
I typically deploy Kube2IAM in one "pass" of critical system-level resources prior to deploying any "application"-level resources, because it would be difficult to determine, at the pod level, whether the permissions I was receiving came from the node's IAM role, kube2iam's default role, or the pod's own role.
My instance roles have almost no permissions beyond what kube2iam needs, and I don't have a default role, so the applications can't work anyway without proper IAM.
That being said, getting the IAM STS credentials from the metadata service requires getting the role name first. If you know what role name you expect, or maybe a common regex pattern, it's pretty straightforward to make sure you got a role you expect. Relevant docs: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html#instancedata-data-categories
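To make that concrete, here is a rough sketch of such a check as an init container. The role name `my-app-role` and the images are placeholders, and the probe relies on the `iam.amazonaws.com/role` annotation that kube2iam reads:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app
  annotations:
    # Role kube2iam should assume for this pod (placeholder name).
    iam.amazonaws.com/role: my-app-role
spec:
  initContainers:
    - name: verify-iam-role
      image: curlimages/curl:8.7.1
      command:
        - sh
        - -c
        - |
          # The security-credentials path lists the role name the caller sees;
          # fail (and let Kubernetes retry the init container) until it matches.
          ROLE="$(curl -sf http://169.254.169.254/latest/meta-data/iam/security-credentials/)"
          echo "metadata service returned role: ${ROLE}"
          test "${ROLE}" = "my-app-role"
  containers:
    - name: app
      image: nginx:stable
```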
@danopia I agree that's how most services should work, and a lot of what we have does restart as appropriate. I have seen nodes on scale up where enough services try to start before kube2iam that they fall into other issues like restart backoff. That by itself is not a problem, but it does cause scale up to take quite a lot longer than needed. Hopefully whatever we come up with for kube-aws can be contributed back to this project.
@c-knowles Reading through the docs a bit more thoroughly to prepare for the CKA exam, I ran across this little tidbit in https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/#how-daemon-pods-are-scheduled:
"DaemonSet controller can make pods even when the scheduler has not been started, which can help cluster bootstrap."
I wonder if this may be useful for your use case. I did not immediately see an example process for doing this, but admittedly I did not look very hard. I believe Kops manages a few of its processes in a manner more akin to https://kubernetes.io/docs/tasks/administer-cluster/static-pod/ but, as noted, there are some drawbacks there.
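As a rough sketch of the static pod route (not taken from this thread): a manifest dropped where the kubelet's `--pod-manifest-path` points, e.g. `/etc/kubernetes/manifests/`, is started by the kubelet itself before the scheduler places anything, so kube2iam would be up before ordinary workloads land on the node. The flags below are illustrative only and should be checked against the kube2iam README:

```yaml
# /etc/kubernetes/manifests/kube2iam.yaml
# Static pods are started directly by the kubelet, independent of the scheduler.
apiVersion: v1
kind: Pod
metadata:
  name: kube2iam
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
    - name: kube2iam
      image: jtblin/kube2iam:latest
      args:
        # Illustrative flags only; the interface name depends on your CNI.
        - --iptables=true
        - --host-ip=$(HOST_IP)
        - --host-interface=cni0
      env:
        - name: HOST_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
      securityContext:
        privileged: true
```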
I'm very interested in hearing how you end up approaching this situation so keep us informed!
I accomplished this by moving the iptables configuration to the host (outside of the kubelet). If kube2iam isn't started and pods try to load their IAM role (or access any other metadata), they just get a connection refused error. You can then have the pods fetch the IAM role as part of their readiness check, so a pod won't be considered ready unless kube2iam is set up. You can also use an init container that fetches the IAM role and verifies that it gets what it expects.
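A sketch of the readiness-check half of this, as a fragment of a pod spec (the host-side rule itself isn't shown; presumably it is the same 169.254.169.254 DNAT redirect to the kube2iam port that kube2iam's own `--iptables` option would install, just applied by the node's bootstrap instead). The role name and image are placeholders, and the probe assumes `curl` is available in the application image:

```yaml
containers:
  - name: app
    image: my-registry/my-app:latest   # placeholder image; must contain curl for this probe
    readinessProbe:
      exec:
        command:
          - sh
          - -c
          # Not ready until the metadata service (via kube2iam) hands back the expected role.
          - curl -sf http://169.254.169.254/latest/meta-data/iam/security-credentials/ | grep -q '^my-app-role$'
      initialDelaySeconds: 5
      periodSeconds: 10
```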
@jrnt30 thanks, it could be useful if there is some hook for that. Changing the DaemonSet to static pods could be another way to solve this issue, assuming they always boot fully prior to scheduling.
@rabbitfang would you have an example of the iptables rule you moved to your host config?
Did you run into issues with init container usage for services that don’t require an IAM role?
Would setting `priorityClassName: system-cluster-critical` on the kube2iam DaemonSet be sufficient?
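For reference, a sketch of where that field would sit in the DaemonSet's pod template:

```yaml
# Fragment of the kube2iam DaemonSet pod template. Priority only affects
# scheduling and preemption order, so it may still need to be combined with
# one of the other approaches to guarantee kube2iam is ready first.
spec:
  template:
    spec:
      priorityClassName: system-cluster-critical
      containers:
        - name: kube2iam
          image: jtblin/kube2iam:latest
```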
We're using the kube2iam project in the incubator project kube-aws and I wondered if there is any recommendation to get kube2iam to start before other pods in the case of cluster scale up? We are tracking the issue in https://github.com/kubernetes-incubator/kube-aws/issues/891.