doc: switching from kube2iam to kiam

tasdikrahman commented 6 years ago

Would it be possible to switch to kiam without actually bringing the cluster down? I am confused about the part where the DaemonSets are created from the example manifests. Since the annotation style used is backward compatible with kube2iam, how would you suggest going about this?

Thanks

pingles commented 6 years ago

You can definitely deploy without bringing the whole cluster down (depending on your deployment model, of course; assuming you're using DaemonSets, you can deploy without restarting nodes).

Having said that, in our experience there are often problems with the applications that use the metadata API: some AWS client libraries have much stricter expectations that the API (and thus either the kube2iam or kiam agent processes) is available at all times. When an API call fails (even if it's a background refresh thread, etc.) they'll often disregard whatever credentials had been issued and start to error. How much of a problem this is depends on your apps' error handling. We've somewhat forced improvements into apps as a result of kiam upgrades ;)

I'd suggest that you deploy the server processes first, giving them a chance to fill their caches. Once those have started, you could modify the existing kube2iam DaemonSet so it runs kiam instead. As long as it's set to a rolling deploy, you should be able to update the agents over time. There'll be a few seconds of downtime on each node as the new images are pulled, processes started, etc.
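For illustration, a rolling deploy on the agent DaemonSet could be configured roughly as follows. This is a minimal sketch rather than the project's own manifest: the DaemonSet name and namespace are hypothetical, and older clusters may need a different `apiVersion` for DaemonSets.

```yaml
# Sketch only: name and namespace are hypothetical placeholders.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube2iam          # the existing DaemonSet being switched over to kiam
  namespace: kube-system
spec:
  updateStrategy:
    type: RollingUpdate   # replace agent pods automatically when the pod template changes
    rollingUpdate:
      maxUnavailable: 1   # take down at most one node's agent at a time
  # selector and pod template omitted; the container image and arguments
  # in the template are what would change from kube2iam to the kiam agent
```

With `type: OnDelete` instead, pods are only replaced when they are deleted manually, which allows testing the change on a single node before rolling it out further.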

tasdikrahman commented 6 years ago

@pingles thanks for the quick reply. Yes, we are running kube2iam as a DaemonSet. I was following the docs in the repo, and after generating the certs by following https://github.com/uswitch/kiam/blob/master/docs/TLS.md and storing the secrets, I was looking at

https://github.com/uswitch/kiam/blob/bd05368577a5c7fda41e8f8bf12d9f7d6a079527/deploy/server.yaml#L17-L19

for deploying the server first, as you suggested. However, I am not sure about the above mount. I am sure it's something simple that I am missing, and I would appreciate your help. Thanks

pingles commented 6 years ago

No worries.

We run CoreOS Container Linux, so that hostPath is where all the trusted CA certificates are (not sure if it's the same across all Linux distros?). The Docker image doesn't have any CA certificates included. This mounts the host's certificates so that the server can interact with the AWS API over TLS.
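As a rough sketch of what such a mount looks like in a pod template: both paths below are assumptions for illustration (the host directory holding CA certificates varies by distro, and the container path depends on where the process looks for trusted CAs), so check them against the linked manifest and your own nodes.

```yaml
# Sketch only: volume name and both paths are assumptions.
spec:
  template:
    spec:
      containers:
        - name: kiam
          volumeMounts:
            - name: ssl-certs
              mountPath: /etc/ssl/certs            # assumed location read inside the container
              readOnly: true
      volumes:
        - name: ssl-certs
          hostPath:
            path: /usr/share/ca-certificates       # assumed host location; differs on other distros
```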

tasdikrahman commented 6 years ago

Got it, thanks! We are running CoreOS too.

While editing the existing DaemonSet for kube2iam and then manually deleting one particular pod to test kiam out, I got an error on that pod's kiam agent. We are running v2.4, our cluster has RBAC enabled, and the following object is created for the pod:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app: kiam
  name: kiam
rules:
  - apiGroups:
      - ""
    resources:
      - namespaces
      - pods
    verbs:
      - list
      - watch

The error we were getting was:

{"addr":"10.2.5.18:55838","level":"error","method":"GET","msg":"error processing request: error checking assume role permissions: rpc error: code = Unimplemented desc = unknown method IsAllowedAssumeRole","path":"/latest/meta-data/iam/security-credentials/<app-iam-role>","status":500,"time":"2018-01-04T16:36:35Z"}

pingles commented 6 years ago

It looks like you're running an older version of the server, which doesn't support an API call that the v2.4 agent uses.

Are you using v2.4 for both server and agent?

tasdikrahman commented 6 years ago

My bad, there was a version mismatch between the server and agent. Aligning the versions fixed the issue. Thanks a ton for your help, awesome work by you guys 😄
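One way to guard against this kind of mismatch is to pin the server and agent manifests to the same image tag. The fragments below are a sketch under assumptions: the image repository and container names are illustrative and should be replaced with whatever your manifests actually use.

```yaml
# Sketch only: image repository and container names are assumptions.
# Server DaemonSet (pod template fragment)
spec:
  template:
    spec:
      containers:
        - name: kiam-server
          image: quay.io/uswitch/kiam:v2.4   # same tag as the agent
---
# Agent DaemonSet (pod template fragment)
spec:
  template:
    spec:
      containers:
        - name: kiam-agent
          image: quay.io/uswitch/kiam:v2.4   # same tag as the server
```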

doc: switching from kube2iam to kiam #21