ceph / ceph-helm

Curated applications for Kubernetes
Apache License 2.0

WIP: osd daemonset upgrade orderly #34

Closed rootfs closed 7 years ago

rootfs commented 7 years ago

In the osd device daemonset, add a preStop hook handler. In the hook handler, watch until the updated number == the current number.

In values.yaml, add a wait_for setting so that daemonset dev-sdd waits for daemonset dev-sdc to finish upgrading.
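To illustrate the check the hook would perform, here is a minimal sketch, not the code in this PR. It assumes "updated number" and "current number" refer to the DaemonSet status fields `updatedNumberScheduled` and `currentNumberScheduled`. The names `rollout_complete` and `wait_for_rollout` are hypothetical, and `get_status` stands in for whatever call fetches the status of the daemonset being waited on.

```python
import time
from typing import Callable, Mapping


def rollout_complete(status: Mapping[str, int]) -> bool:
    """Return True when the updated pod count equals the current pod count.

    Assumption: these two DaemonSet status fields are the numbers meant above.
    """
    updated = status.get("updatedNumberScheduled", 0)
    current = status.get("currentNumberScheduled", 0)
    return updated == current


def wait_for_rollout(
    get_status: Callable[[], Mapping[str, int]],
    timeout_s: float,
    poll_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the rollout completes; return False if the timeout expires first."""
    deadline = clock() + timeout_s
    while True:
        if rollout_complete(get_status()):
            return True
        if clock() >= deadline:
            # Report the timeout instead of silently treating it as success.
            return False
        sleep(poll_s)
```

Returning `False` on timeout lets the caller tell a finished upgrade apart from one that simply ran out of time.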

@dmick @alram

alram commented 7 years ago

My understanding of container termination is that after the grace period (30 seconds by default), the container is sent SIGKILL if SIGTERM didn't terminate it; the preStop hook just adds a 2-second grace. (https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods). Did you try this PR with large delays?

rootfs commented 7 years ago

Good point, maybe extend the grace period too.

alram commented 7 years ago

I don't think extending the grace period fixes it. A couple of things come to mind.

  1. If the upgrade fails, then once we reach the grace period it'll proceed to the next daemonset anyway.
  2. The grace period would need to be increased in proportion to the size of the cluster. If I have 3 storage nodes, updating a daemonset that spans them will take less time than updating a daemonset that spans 100 storage nodes.
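
The scaling in point 2 can be sketched with some rough arithmetic. This assumes a rolling update that replaces at most `max_unavailable` pods at a time and a fixed per-pod upgrade time; the function name and the 60-second figure are hypothetical, chosen only for illustration.

```python
import math


def estimated_rollout_seconds(
    nodes: int, per_pod_seconds: float, max_unavailable: int = 1
) -> float:
    """Rough rollout time: number of sequential batches times the per-pod time."""
    batches = math.ceil(nodes / max_unavailable)
    return batches * per_pod_seconds


# Hypothetical figure: 60 seconds to upgrade one OSD pod.
small = estimated_rollout_seconds(nodes=3, per_pod_seconds=60)
large = estimated_rollout_seconds(nodes=100, per_pod_seconds=60)
print(small, large)
```

Under these assumptions the 100-node daemonset needs about 33 times as long as the 3-node one, so a single fixed grace period cannot fit both.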

rootfs commented 7 years ago

Agreed, setting the right grace period is tricky.