operator-framework / kubectl-operator

Manage Kubernetes Operators from the command line

kubectl operator upgrade refuses to upgrade cert-manager 1.4.0 to 1.4.3 #53

Closed wallrj closed 2 years ago

wallrj commented 3 years ago

If I install cert-manager-1.4.0 using kubectl operator, it tells me that an upgrade is available, but when I attempt to use kubectl operator upgrade, it refuses.

$ kubectl operator install cert-manager -n operators --channel stable --create-operator-group --version 1.4.0
operatorgroup "operators" created
subscription "cert-manager" created
operator "cert-manager" installed; installed csv is "cert-manager.v1.4.0"

$ kubectl operator list
PACKAGE       SUBSCRIPTION  INSTALLED CSV        CURRENT CSV          STATUS            AGE
cert-manager  cert-manager  cert-manager.v1.4.0  cert-manager.v1.4.0  UpgradeAvailable  40s

$ kubectl operator list-available cert-manager
NAME          CATALOG              CHANNEL  LATEST CSV           AGE
cert-manager  Community Operators  stable   cert-manager.v1.4.3  177m

$ kubectl operator upgrade cert-manager -n operators
failed to upgrade operator: operator is already at latest version

Uninstall

$ kubectl operator uninstall cert-manager -n operators --delete-all
subscription "cert-manager" deleted
customresourcedefinition "certificaterequests.cert-manager.io" deleted
customresourcedefinition "certificates.cert-manager.io" deleted
customresourcedefinition "challenges.acme.cert-manager.io" deleted
customresourcedefinition "clusterissuers.cert-manager.io" deleted
customresourcedefinition "issuers.cert-manager.io" deleted
customresourcedefinition "orders.acme.cert-manager.io" deleted
clusterserviceversion "cert-manager.v1.4.3" deleted
operatorgroup "operators" deleted
operator "cert-manager" uninstalled

If I install cert-manager-1.4.2, then the kubectl operator upgrade command succeeds.

$ kubectl operator install cert-manager -n operators --channel stable --create-operator-group --version 1.4.2
operatorgroup "operators" created
subscription "cert-manager" created
operator "cert-manager" installed; installed csv is "cert-manager.v1.4.2"

$ kubectl operator list
PACKAGE       SUBSCRIPTION  INSTALLED CSV        CURRENT CSV          STATUS          AGE
cert-manager  cert-manager  cert-manager.v1.4.2  cert-manager.v1.4.3  UpgradePending  45s

$ kubectl operator upgrade cert-manager -n operators
operator "cert-manager" upgraded; installed csv is "cert-manager.v1.4.3"

Perhaps I'm misunderstanding how the upgrade mechanism works.

I expected that if I installed 1.4.0 and the latest version was 1.4.3, a manual upgrade would automatically step through each of the newer versions.

joelanford commented 3 years ago

My re-creation looks a little different. Note the list status is AtLatestKnown, not UpgradeAvailable.

$ kubectl operator install cert-manager -n operators --channel stable --create-operator-group -v 1.4.0
subscription "cert-manager" created
operator "cert-manager" installed; installed csv is "cert-manager.v1.4.0"

$ kubectl operator list
PACKAGE       SUBSCRIPTION  INSTALLED CSV        CURRENT CSV          STATUS         AGE
cert-manager  cert-manager  cert-manager.v1.4.0  cert-manager.v1.4.0  AtLatestKnown  51s

Looking at the contents of the catalog, I see this:

$ opm alpha list bundles quay.io/operatorhubio/catalog:latest cert-manager
PACKAGE       CHANNEL  BUNDLE               REPLACES  SKIPS                SKIP RANGE  IMAGE
cert-manager  stable   cert-manager.v1.4.0                                             quay.io/operatorhubio/cert-manager:v1.4.0
cert-manager  stable   cert-manager.v1.4.1            cert-manager.v1.4.0              quay.io/operatorhubio/cert-manager:v1.4.1
cert-manager  stable   cert-manager.v1.4.2            cert-manager.v1.4.1              quay.io/operatorhubio/cert-manager:v1.4.2
cert-manager  stable   cert-manager.v1.4.3            cert-manager.v1.4.2              quay.io/operatorhubio/cert-manager:v1.4.3

Which is interesting to me. It looks like your operator is configured to use semver mode, so I would have expected the upgrade edges to use replaces, not skips. I tried re-creating the database with opm to see if opm is to blame for this, but it doesn't seem to be:

$ opm registry add -b quay.io/operatorhubio/cert-manager:v1.4.0,quay.io/operatorhubio/cert-manager:v1.4.1,quay.io/operatorhubio/cert-manager:v1.4.2,quay.io/operatorhubio/cert-manager:v1.4.3 --mode=semver
INFO[0000] adding to the registry                        bundles="[quay.io/operatorhubio/cert-manager:v1.4.0 quay.io/operatorhubio/cert-manager:v1.4.1 quay.io/operatorhubio/cert-manager:v1.4.2 quay.io/operatorhubio/cert-manager:v1.4.3]"
INFO[0006] Could not find optional dependencies file     file=bundle_tmp544896291/metadata load=annotations with=bundle_tmp544896291
INFO[0006] Could not find optional properties file       file=bundle_tmp544896291/metadata load=annotations with=bundle_tmp544896291
INFO[0006] Could not find optional dependencies file     file=bundle_tmp707515430/metadata load=annotations with=bundle_tmp707515430
INFO[0006] Could not find optional properties file       file=bundle_tmp707515430/metadata load=annotations with=bundle_tmp707515430
INFO[0006] Could not find optional dependencies file     file=bundle_tmp231929677/metadata load=annotations with=bundle_tmp231929677
INFO[0006] Could not find optional properties file       file=bundle_tmp231929677/metadata load=annotations with=bundle_tmp231929677
INFO[0006] Could not find optional dependencies file     file=bundle_tmp667146824/metadata load=annotations with=bundle_tmp667146824
INFO[0006] Could not find optional properties file       file=bundle_tmp667146824/metadata load=annotations with=bundle_tmp667146824

$ opm alpha list bundles bundles.db
PACKAGE       CHANNEL  BUNDLE               REPLACES             SKIPS  SKIP RANGE  IMAGE
cert-manager  stable   cert-manager.v1.4.0                                          quay.io/operatorhubio/cert-manager:v1.4.0
cert-manager  stable   cert-manager.v1.4.1  cert-manager.v1.4.0                     quay.io/operatorhubio/cert-manager:v1.4.1
cert-manager  stable   cert-manager.v1.4.2  cert-manager.v1.4.1                     quay.io/operatorhubio/cert-manager:v1.4.2
cert-manager  stable   cert-manager.v1.4.3  cert-manager.v1.4.2                     quay.io/operatorhubio/cert-manager:v1.4.3

From OLM's perspective, the root cause here is that skips are not transitive. The only bundle that has an upgrade edge from 1.4.0 is 1.4.1, but since 1.4.1 is skipped by 1.4.2, the OLM resolver eliminates 1.4.1 as a candidate, leaving no viable candidates.

This could be solved by using replaces instead of skips, or by having all the previous versions skipped by the newer versions.
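To make the non-transitivity concrete, here is a rough Python sketch of the candidate elimination described above. It is a toy model, not OLM's actual resolver: the Bundle class and the next_candidates function are hypothetical names invented for illustration, and the only rule modeled is "a bundle with an upgrade edge from the installed CSV is dropped if any other bundle in the channel skips it".

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Bundle:
    """Hypothetical, simplified view of one catalog entry."""
    name: str
    replaces: Optional[str] = None
    skips: Set[str] = field(default_factory=set)


def next_candidates(installed: str, channel: List[Bundle]) -> List[str]:
    """Return bundles that could follow `installed` under the toy rule.

    Assumption: a bundle has an upgrade edge from `installed` if it
    replaces or skips it, and it is eliminated if some bundle in the
    channel lists it in `skips`.
    """
    skipped = set().union(*(b.skips for b in channel))
    return sorted(
        b.name
        for b in channel
        if (b.replaces == installed or installed in b.skips)
        and b.name not in skipped
    )


v = "cert-manager.v1.4."

# The catalog as published: each bundle only skips its predecessor.
published = [
    Bundle(v + "0"),
    Bundle(v + "1", skips={v + "0"}),
    Bundle(v + "2", skips={v + "1"}),
    Bundle(v + "3", skips={v + "2"}),
]

# First proposed fix: use replaces instead of skips.
with_replaces = [
    Bundle(v + "0"),
    Bundle(v + "1", replaces=v + "0"),
    Bundle(v + "2", replaces=v + "1"),
    Bundle(v + "3", replaces=v + "2"),
]

# Second proposed fix: every bundle skips all previous versions.
full_skips = [
    Bundle(v + "0"),
    Bundle(v + "1", skips={v + "0"}),
    Bundle(v + "2", skips={v + "0", v + "1"}),
    Bundle(v + "3", skips={v + "0", v + "1", v + "2"}),
]

print(next_candidates(v + "0", published))      # []
print(next_candidates(v + "2", published))      # ['cert-manager.v1.4.3']
print(next_candidates(v + "0", with_replaces))  # ['cert-manager.v1.4.1']
print(next_candidates(v + "0", full_skips))     # ['cert-manager.v1.4.3']
```

Under this model, 1.4.0 has no viable successor in the published catalog, while 1.4.2 can still move to 1.4.3, which matches the behaviour reported above. With replaces, the upgrade from 1.4.0 proceeds one version at a time; with full skips lists, it jumps straight to 1.4.3.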

Can you file an issue in github.com/k8s-operatorhub/community-operators asking why skips are being used for semver-mode instead of replaces?

joelanford commented 2 years ago

Pretty sure this was resolved in the pipeline. Closing.