tasdikrahman closed this issue 6 years ago
You can definitely deploy without bringing the whole cluster down, depending on your deployment model of course: assuming you're using DaemonSets, you can deploy without restarting nodes.
Having said that, in our experience there are often problems with the applications that use the metadata API. Some AWS client libraries have much stricter expectations that the API (and thus either the kube2iam or kiam agent processes) is available at all times. When an API call fails (even if it's a background refresh thread etc.) they'll often disregard whatever credentials had been issued and start to error. How much of a problem this is depends on your apps' error handling. We've somewhat forced improvements into apps as a result of kiam upgrades ;)
I'd suggest that you deploy the server processes first, giving them a chance to fill their caches. Once those have started you could modify the existing kube2iam DaemonSet so it runs kiam instead. As long as it's set to a rolling deploy you should be able to update the agents over time. There'll be a few seconds of downtime as the new images are pulled, processes started etc.
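The rolling replacement of the agents described above could look roughly like the fragment below. This is a sketch under assumptions, not the project's manifest: the `kube2iam` name, the `kube-system` namespace, and the container name are placeholders for whatever the existing DaemonSet uses, and the image reference is intentionally left unfilled. Some older DaemonSet API versions default to the `OnDelete` strategy, so it may be worth setting `RollingUpdate` explicitly.

```yaml
# Fragment of the existing agent DaemonSet, edited in place (sketch only).
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube2iam            # assumed: name of the existing DaemonSet
  namespace: kube-system    # assumed namespace
spec:
  updateStrategy:
    type: RollingUpdate     # replace agent pods node by node
    rollingUpdate:
      maxUnavailable: 1     # at most one node without an agent at a time
  template:
    spec:
      containers:
      - name: agent         # hypothetical container name
        # Swap the kube2iam image for the kiam agent image here, using the
        # same version as the kiam servers that are already running.
        image: <kiam image>:<same tag as the server>
```

With `maxUnavailable: 1`, only the pods on the node currently being updated lose access to the metadata proxy while the new image is pulled and started.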
@pingles thanks for the quick reply. Yes we are running kube2iam as a DS. Was following the docs in the repo and after generating the certs by following https://github.com/uswitch/kiam/blob/master/docs/TLS.md and storing the secrets, I was looking at
for deploying the server first as you suggested. However, I am not sure about the above mount. I am sure it's something dumb that I am missing; I would appreciate your help. Thanks
No worries.
We run CoreOS Container Linux so that hostPath
is where all the trusted CA certificates are (not sure if it's the same across all Linux distros?). The Docker image doesn't have any CA certificates included. This mounts the host's certificates so that it can interact with the AWS API over TLS.
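For readers on other distributions, the mount being discussed has roughly the shape below. This is a sketch, not the manifest from the repo: the volume and container names are hypothetical, the host path is deliberately left as a placeholder because it differs between distributions, and `/etc/ssl/certs` as the mount target is an assumption (it is a common location that TLS libraries search for trusted roots).

```yaml
# Pod spec fragment (sketch). Replace the placeholder with the directory
# where your host keeps its trusted CA certificates.
volumes:
- name: ssl-certs                 # hypothetical volume name
  hostPath:
    path: <host CA certificate directory>
containers:
- name: kiam                      # hypothetical container name
  volumeMounts:
  - name: ssl-certs
    mountPath: /etc/ssl/certs     # assumed target inside the container
    readOnly: true                # the process only needs to read the CAs
```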
Got it, thanks! We are running CoreOS too.
While editing the existing DS for kube2iam and then manually going in and deleting one particular pod to test kiam out, I was getting this error on the particular agent pod for kiam. We are running v2.4, our cluster has RBAC enabled, and the following object is created for the pod:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app: kiam
  name: kiam
rules:
- apiGroups:
  - ""
  resources:
  - namespaces
  - pods
  verbs:
  - list
  - watch
The error we were getting was:
{"addr":"10.2.5.18:55838","level":"error","method":"GET","msg":"error processing request: error checking assume role permissions: rpc error: code = Unimplemented desc = unknown method IsAllowedAssumeRole","path":"/latest/meta-data/iam/security-credentials/<app-iam-role>","status":500,"time":"2018-01-04T16:36:35Z"}
It looks like you're running an older version of the server which doesn't support an API call that the v2.4 agent uses.
Are you using v2.4 for both server and agent?
My bad, there was a version mismatch between the server and agent. This fixed the issue. Thanks a ton for your help, awesome work by you guys 😄
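One way to avoid this kind of mismatch is to pin the server and agent DaemonSets to the same image tag. The fragments below are a sketch: the container names are hypothetical, and the `quay.io/uswitch/kiam` registry path is an assumption that should be checked against the manifests you actually deploy. Only the `v2.4` tag comes from this thread.

```yaml
# Server DaemonSet, container fragment (sketch)
containers:
- name: kiam-server                  # hypothetical name
  image: quay.io/uswitch/kiam:v2.4   # assumed registry path; tag from the thread
---
# Agent DaemonSet, container fragment (sketch)
containers:
- name: kiam-agent                   # hypothetical name
  image: quay.io/uswitch/kiam:v2.4   # keep identical to the server tag
```

An agent calling a gRPC method the server does not implement (here `IsAllowedAssumeRole`) shows up as the `Unimplemented` error above, so upgrading the servers before the agents keeps the two compatible during a rollout.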
Would it be possible to switch to kiam without actually bringing the cluster down? I am confused about the part where the DaemonSets are created from the example manifests. Since the annotation style used is backward compatible with kube2iam, I was unsure how you would suggest going about this.
Thanks