kudobuilder / kudo

Kubernetes Universal Declarative Operator (KUDO)
https://kudo.dev
Apache License 2.0

Provide a way to abandon a plan #1764

Open simonvane opened 3 years ago

simonvane commented 3 years ago

What would you like to be added: Provide a way to abandon a plan.

  1. Using kubectl kudo...
  2. By editing the instance/using the k8s API.

Why is this needed: If a plan ends up with a step in an ERROR state, it is difficult to resolve. For example:

deploy:
  lastUpdatedTimestamp: "2021-01-27T18:37:56Z"
  name: deploy
  phases:
  - name: deploy-init
    status: FATAL_ERROR
    steps:
    - message: 'A transient error when executing task deploy.deploy-base-servers.deploy-appserver.appserver.
        Will retry. failed to patch a apps/v1, Kind=StatefulSet governance-im-dev/sdv-appserver:
        failed to execute patch: StatefulSet.apps "sdv-appserver" is invalid:
        spec: Forbidden: updates to statefulset spec for fields other than ''replicas'',
        ''template'', and ''updateStrategy'' are forbidden'
      name: deploy-database-init
      status: ERROR

In the case shown, recovery was not possible without upgrading to a new version of the operator. However, the instance could not be upgraded to the new version of the operator because the deploy plan was still in progress.

Additional details: To get out of this situation, we edited the instance and changed spec.planExecution.status to FATAL_ERROR. After that, we were able to upgrade and resolve the problem.
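
For illustration, the manual edit described above corresponds roughly to the following fragment of the Instance resource. This is only a sketch: the field path spec.planExecution.status and the value FATAL_ERROR are the ones named above, every other field of the instance is omitted, and whether this is a supported way to abandon a plan is exactly the open question in this issue.

```yaml
# Fragment of the KUDO Instance spec after the manual edit (sketch only).
# All other fields, including the rest of spec.planExecution, are left as they were.
spec:
  planExecution:
    status: FATAL_ERROR  # set by hand to mark the stuck plan as terminally failed
```

A change like this could presumably be made with kubectl edit on the instance or sent as a merge patch through the k8s API, but neither route is documented for this purpose as far as I can tell.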

It is unclear if this was a valid approach and I cannot find any documentation that describes what to do in this situation.

What would happen with no intervention? I am also unsure whether the failing step would be retried with no intervention.
