imjasonh / ideas

A place for me to file issues against myself for things I want to build when I'm bored

K8s slow rollout controller #91

Open imjasonh opened 3 years ago

imjasonh commented 3 years ago

GitOps is cool, but YOLOing 1000 replicas of a Deployment to a new image at once is madness.

Vanilla K8s Deployments support rolling updates, which is great, but maybe you want to roll out even slower (e.g., wait 10m between batches), or wait on some condition or signal before continuing the rollout (metrics, smoke tests, approval).
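For reference, here's roughly how far the built-in rolling update knobs go. This is a minimal sketch: the `app: my-app` label, container name, and image are made up for illustration, and the numbers are arbitrary.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment
spec:
  replicas: 1000
  minReadySeconds: 60          # a new pod must stay ready this long before it counts as available
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 10%            # at most 10% extra pods above the desired count
      maxUnavailable: 0        # never drop below the desired count during the update
  selector:
    matchLabels:
      app: my-app              # hypothetical label
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: app              # hypothetical container name
        image: example.com/my-app:v2   # hypothetical image
```

These settings bound the batch size and add a per-pod delay, but on their own they don't express "stop at 10%, wait 30m, check a metric, then continue". That's the gap the policy sketch that follows is aimed at.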

kind: RolloutPolicy
metadata:
  name: slow-rollout-policy
spec:
  stages:
  - deploy: 10%
  - wait: 30m
  - metrics:
      prometheus.something.something/error-rate: <1%
      onFailure: Rollback
  - deploy: 50%
  - wait: 30m
  - metrics:
      prometheus.something.something/error-rate: <1%
      onFailure: Rollback
  - approval: ... # make someone click a button
  - deploy: 100%
---
kind: Deployment
metadata:
  name: my-deployment
  annotations:
    rollout-policy: slow-rollout-policy
spec:
  ...

This is even easier for Knative Services, since you can literally just set the traffic split to the desired percentage. Maybe this should just be Knative-specific at first?
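As a rough sketch of the Knative version, a 10% stage could just be a traffic block like the one below. The service name, revision names, and image are made up for illustration; the `traffic` list with `revisionName` and `percent` is the standard Knative Serving v1 shape, and the percentages have to add up to 100.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-service                   # hypothetical service name
spec:
  template:
    metadata:
      name: my-service-v2            # hypothetical name for the new revision
    spec:
      containers:
      - image: example.com/my-app:v2 # hypothetical image
  traffic:
  - revisionName: my-service-v1      # previous revision keeps most of the traffic
    percent: 90
  - revisionName: my-service-v2      # new revision gets the first stage's share
    percent: 10
```

A controller would then presumably only need to rewrite the two `percent` values at each stage (10, then 50, then 100), and a rollback would be setting the old revision back to 100.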