GitOps is cool, but YOLOing 1000 replicas of a Deployment to a new image at once is madness.
Vanilla K8s Deployments support rolling updates, which is great, but maybe you want to roll out even slower (e.g., wait 10m between rollouts), or wait on some condition or signal before continuing rollout (metrics, smoke tests, approval).
- A webhook intercepts Deployment PodSpec image updates (e.g., from GitOps) and:
  - records the requested image in an annotation (rollout/requested-image: my/new-image)
  - sets the image back to its previous value, so no rollout starts immediately
  - creates a RolloutRun instance to describe this particular rollout and its status
- An async reconciler sees the new RolloutRun and:
  - updates some subset of the Deployment's underlying Pods with the new image
  - watches for the specified conditions (metrics, etc.), updates the RolloutRun and Deployment status to indicate why the rollout is paused, and emits events for rollbacks/rollforwards
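
A rough sketch of the two halves of that flow, in plain Python with no Kubernetes client. It assumes the webhook is a mutating admission webhook that receives both the old and new Deployment objects and answers with a JSON Patch. The function names, the RolloutRun API group (rollout.example.dev/v1alpha1), and the RolloutRun spec fields are all made up for illustration; only the annotation key comes from the notes above. It also assumes only one container image changes per update.

```python
ANNOTATION = "rollout/requested-image"  # annotation key from the notes above


def escape_pointer(token):
    """Escape a key for use in a JSON Pointer path (RFC 6901)."""
    return token.replace("~", "~0").replace("/", "~1")


def intercept_image_update(old, new):
    """Return (json_patch, rollout_run) for a Deployment update, or ([], None).

    `old` and `new` are Deployment objects as plain dicts, e.g. the
    oldObject/object fields of an AdmissionReview request.
    """
    old_ctrs = old["spec"]["template"]["spec"]["containers"]
    new_ctrs = new["spec"]["template"]["spec"]["containers"]
    old_images = {c["name"]: c["image"] for c in old_ctrs}

    for index, ctr in enumerate(new_ctrs):
        previous = old_images.get(ctr["name"])
        if previous is None or previous == ctr["image"]:
            continue  # new container or unchanged image: nothing to gate

        patch = []
        if not new["metadata"].get("annotations"):
            # "add" under a missing parent fails, so create the map first.
            patch.append({"op": "add", "path": "/metadata/annotations", "value": {}})
        # Remember what was asked for...
        patch.append({
            "op": "add",
            "path": "/metadata/annotations/" + escape_pointer(ANNOTATION),
            "value": ctr["image"],
        })
        # ...and put the previous image back so nothing rolls out yet.
        patch.append({
            "op": "replace",
            "path": f"/spec/template/spec/containers/{index}/image",
            "value": previous,
        })
        rollout_run = {
            "apiVersion": "rollout.example.dev/v1alpha1",  # hypothetical group
            "kind": "RolloutRun",
            "metadata": {"generateName": new["metadata"]["name"] + "-"},
            "spec": {  # hypothetical fields
                "deployment": new["metadata"]["name"],
                "container": ctr["name"],
                "fromImage": previous,
                "toImage": ctr["image"],
            },
        }
        return patch, rollout_run
    return [], None


def next_step(replicas, updated, batch_size, gates_ok):
    """Decide the reconciler's next action for one RolloutRun (hypothetical)."""
    if updated >= replicas:
        return {"phase": "Complete", "updateNow": 0}
    if not gates_ok:
        return {"phase": "Paused", "updateNow": 0, "reason": "gate not satisfied"}
    return {"phase": "Progressing", "updateNow": min(batch_size, replicas - updated)}
```

How the reconciler actually gets only `updateNow` Pods onto the new image is left open here; the sketch only covers the decision of whether to proceed and by how much.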
This is even easier for Knative Services, since you can literally just set rollout strategy to desired %age. Maybe this should just be Knative-specific at first?
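
For the Knative case, the "desired %age" would presumably map onto the Service's spec.traffic list. A minimal sketch, assuming the stable Revision is pinned by name and the newest Revision gets the canary share; the helper name and the example revision name are hypothetical.

```python
def traffic_split(stable_revision, canary_percent):
    """Build a Knative Service spec.traffic list for a percentage rollout.

    `stable_revision` is the name of the currently serving Revision;
    the latest Revision receives `canary_percent` of requests.
    """
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    return [
        {"revisionName": stable_revision, "percent": 100 - canary_percent},
        {"latestRevision": True, "percent": canary_percent},
    ]


# Example: send 10% of traffic to the newest revision.
traffic = traffic_split("my-service-00001", 10)
```

A reconciler could step `canary_percent` upward (say 10, 25, 50, 100) and patch the Service each time its gates pass, which avoids touching Pods directly.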