bookingcom / shipper

Kubernetes native multi-cluster canary or blue-green rollouts using Helm
Apache License 2.0

Update capacity target in case of "In Progress" status #364

Closed hihilla closed 3 years ago

hihilla commented 3 years ago

328

At the beginning of a rollout, the installation controller installs the Deployment with 0 replicas. The capacity controller then patches the Deployment and updates the capacity target's status with the condition "Ready false, In Progress". When a new pod is created and starts spinning up its containers, the Deployment's status gets the condition "Available false" (since the pod is not available yet). At this point the capacity target condition is still "In Progress", because there is progress: the pod is starting, there are no sad pods, and the Deployment reports no fatal statuses.

When the pod starts failing, for example with a CrashLoop, the Deployment object does not get a new condition, so the capacity target keeps reporting "In Progress": the capacity controller listens to Deployment objects, and there was no update to the Deployment.

Solution: when Shipper marks the capacity target "In Progress", it must re-check the capacity target later and take it out of this temporary state. We do that by raising a retriable error while the CT is "In Progress", which re-enqueues the CT until the temporary status is updated.