kudobuilder / kudo

Kubernetes Universal Declarative Operator (KUDO)
https://kudo.dev
Apache License 2.0
1.17k stars 103 forks source link

Operator with a dependency fails to upgrade when a new parameter is introduced #1776

Closed alembiewski closed 3 years ago

alembiewski commented 3 years ago

What happened: I developed a simple operator with a single dependency with the following structure:

operator-with-dependencies
├── dependency
│   ├── operator.yaml
│   ├── params.yaml
│   └── templates
├── operator.yaml
├── params.yaml
└── templates
    └── dependency-params.yaml

Dependency operator exposes one single parameter, that could be updated via parent instance. Both operators have deploy, update and upgrade plans defined, making it possible to implement a custom operational behavior for each operation. The Initial deployment goes well:

➜  k kudo install ./operator-with-dependencies
operator test/operator created
operator test/dependency created
operatorversion test/dependency-1.0.0-0.0.1 created
operatorversion test/operator-1.0.0-0.0.1 created
instance test/operator-instance created
➜  k get instance
NAME                           AGE
operator-instance              14s
operator-instance-dependency   14s
➜  k kudo plan status --instance operator-instance
Plan(s) for "operator-instance" in namespace "test":
.
└── operator-instance (Operator-Version: "operator-1.0.0-0.0.1" Active-Plan: "deploy")
    ├── Plan deploy (serial strategy) [COMPLETE], last updated 2021-02-25 16:35:27
    │   └── Phase deploy-dependency (serial strategy) [COMPLETE]
    │       └── Step deploy-dependency [COMPLETE]
    ├── Plan update (serial strategy) [NOT ACTIVE]
    │   └── Phase update-dependency (serial strategy) [NOT ACTIVE]
    │       └── Step update-dependency [NOT ACTIVE]
    └── Plan upgrade (serial strategy) [NOT ACTIVE]
        └── Phase upgrade-dependency (serial strategy) [NOT ACTIVE]
            └── Step upgrade-dependency [NOT ACTIVE]

➜  k kudo plan status --instance operator-instance-dependency
Plan(s) for "operator-instance-dependency" in namespace "test":
.
└── operator-instance-dependency (Operator-Version: "dependency-1.0.0-0.0.1" Active-Plan: "deploy")
    ├── Plan deploy (serial strategy) [COMPLETE], last updated 2021-02-25 16:35:27
    │   └── Phase dummy (serial strategy) [COMPLETE]
    │       └── Step dummy [COMPLETE]
    ├── Plan update (serial strategy) [NOT ACTIVE]
    │   └── Phase dummy (serial strategy) [NOT ACTIVE]
    │       └── Step dummy [NOT ACTIVE]
    └── Plan upgrade (serial strategy) [NOT ACTIVE]
        └── Phase dummy (serial strategy) [NOT ACTIVE]
            └── Step dummy [NOT ACTIVE]

Then let's say I want to extend my dependency and add another parameter and make it configurable via parent operator: https://github.com/alembiewski/operator-with-dependencies/commit/faa903742eaf1a1c76875973f37b0abb62087768

When I'm trying to do an upgrade, the plan fails with the following error:

➜  k kudo upgrade ./operator-with-dependencies --instance operator-instance
operatorversion test/dependency-1.0.0-0.0.2 created
operatorversion test/operator-1.0.0-0.0.2 created
instance test/operator-instance updated
➜  k kudo plan status --instance operator-instance
Plan(s) for "operator-instance" in namespace "test":
.
└── operator-instance (Operator-Version: "operator-1.0.0-0.0.2" Active-Plan: "upgrade")
    ├── Plan deploy (serial strategy) [NOT ACTIVE]
    │   └── Phase deploy-dependency (serial strategy) [NOT ACTIVE]
    │       └── Step deploy-dependency [NOT ACTIVE]
    ├── Plan update (serial strategy) [NOT ACTIVE]
    │   └── Phase update-dependency (serial strategy) [NOT ACTIVE]
    │       └── Step update-dependency [NOT ACTIVE]
    └── Plan upgrade (serial strategy) [IN_PROGRESS], last updated 2021-02-25 16:41:01
        └── Phase upgrade-dependency (serial strategy) [IN_PROGRESS]
            └── Step upgrade-dependency [ERROR] (A transient error when executing task upgrade.upgrade-dependency.upgrade-dependency.dependency. Will retry. admission webhook "instance-admission.kudo.dev" denied the request: failed to update Instance test/operator-instance-dependency: upgrade to new OperatorVersion {  dependency-1.0.0-0.0.2    } together with a parameter update triggering 'update' is not allowed)

What you expected to happen: Since I'm not changing any parameter value, but only adding a new one, it's not clear why the update plan gets triggered for the dependency in this case. I would expect KUDO to only trigger an upgrade plan. And how the upgrade is going to work in case when the value of the existing parameters actually changes between the versions? What is the recomended way to perform upgrades in such cases? How to reproduce it (as minimally and precisely as possible):

  1. Checkout https://github.com/alembiewski/operator-with-dependencies
  2. Install 0.0.1 version from https://github.com/alembiewski/operator-with-dependencies/commit/c84b6122d65ef28fca52c68f9b0da6223e9d5ce4
    k kudo install ./operator-with-dependencies
  3. Switch to the next version https://github.com/alembiewski/operator-with-dependencies/commit/faa903742eaf1a1c76875973f37b0abb62087768 and do the upgrade:
    k kudo upgrade ./operator-with-dependencies --instance operator-instance

    Anything else we need to know?:

Environment:

asekretenko commented 3 years ago

@alembiewski I confirm that I'm able to reproduce this; hopefully will come up with a fix proposal/PR in the near future.

asekretenko commented 3 years ago

@alembiewski After trying to come up with a simpler test, I realized that the underlying issue is more general, and affects not only dependent operators.

Currently, there is no way to upgrade an operator that has both dedicated upgrade and update plans, if the upgrade adds a required parameter without a default. One could imagine that such an update should be possible when a value for the new parameter is set together with update; however, If one tries to specify a value for the new parameter, this results in two conflicting plans being triggered. I fully agree with your suggestion that "update" should simply be not be triggered in this case, and hence this is a bug.

IMO, it is quite logical that your example in this issue is also affected by this bug: as the parameter value in your example is pulled from the parent instance, the upgrade results not only in adding a parameter to dependency's OV, but also in adding a parameter to the instance of the dependency.

Thus, my https://github.com/kudobuilder/kudo/pull/1780 tries to address the bug I outlined above.

And how the upgrade is going to work in case when the value of the existing parameters actually changes between the versions? What is the recomended way to perform upgrades in such cases?

That's an interesting point, which probably should be a new issue. Even after https://github.com/kudobuilder/kudo/pull/1780, an upgrade that changes both OVs and also the default in the parent will result in a similar failure if no value for this parameter is set manually. At this point, it is not clear to me what the desired behavior should be, because, from the point of view of the dependency, the update plan has to be triggered (the dependency should not care why its parameter changed, right?)