Open joaopauloksn opened 2 years ago
A few more details, trying to be as generic as possible with the description. The problem with the replaces field happens when we try to create a subscription that results in picking a release from a single release channel in our catalog. There is nothing special about the subscription. For example:
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: ibm-truststore-mgr-stable-ibm-truststore-mgr-operators-openshift-marketplace
  namespace: ibm-sls
spec:
  channel: stable
  name: ibm-truststore-mgr
  source: ibm-truststore-mgr-operators
  sourceNamespace: openshift-marketplace
  installPlanApproval: Automatic
I then end up with this:
% oc get csv
NAME                                   DISPLAY                  VERSION            REPLACES                               PHASE
ibm-truststore-mgr.v1.2.3-pre.stable   IBM Truststore Manager   1.2.3-pre.stable   ibm-truststore-mgr.v1.2.3-pre.stable   Pending
The CSV ends up with a replaces attribute:
provider:
  name: IBM
  url: https://ibm.com
replaces: ibm-truststore-mgr.v1.2.3-pre.stable
version: 1.2.3-pre.stable
The CSV named ibm-truststore-mgr.v1.2.3-pre.stable is in the state Pending. The only status condition is:
oc get csv ibm-truststore-mgr.v1.2.3-pre.stable -o yaml
status:
  cleanup: {}
  conditions:
  - lastTransitionTime: "2022-04-28T17:28:46Z"
    lastUpdateTime: "2022-04-28T17:28:46Z"
    message: requirements not yet checked
    phase: Pending
    reason: RequirementsUnknown
  lastTransitionTime: "2022-04-28T17:28:46Z"
  lastUpdateTime: "2022-04-28T17:28:46Z"
  message: requirements not yet checked
  phase: Pending
  reason: RequirementsUnknown
Looking at the logs for the olm-operator in the namespace openshift-operator-lifecycle-manager, the following error messages are observed:
{"level":"error","ts":1651166928.3758118,"logger":"controllers.operatorcondition","msg":"Error ensuring OperatorCondition Deployment EnvVars","request":"ibm-sls/ibm-truststore-mgr.v1.2.3-pre.stable","error":"Deployment.apps \"ibm-truststore-mgr-controller-manager\" not found","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.2\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:216\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.UntilWithContext\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:99"}
{"level":"error","ts":1651166928.3759263,"logger":"controller-runtime.manager.controller.operatorcondition","msg":"Reconciler error","reconciler group":"operators.coreos.com","reconciler kind":"OperatorCondition","name":"ibm-truststore-mgr.v1.2.3-pre.stable","namespace":"ibm-sls","error":"Deployment.apps \"ibm-truststore-mgr-controller-manager\" not found","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.2\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:216\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.UntilWithContext\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:99"}
time="2022-04-28T17:28:48Z" level=warning msg="Unable to replace previous CSV" csv=ibm-truststore-mgr.v1.2.3-pre.stable error="CSV being replaced is in phase Pending instead of Replacing" id=CKA7e namespace=ibm-sls phase=Pending
time="2022-04-28T17:28:49Z" level=warning msg="Unable to replace previous CSV" csv=ibm-truststore-mgr.v1.2.3-pre.stable error="CSV being replaced is in phase Pending instead of Replacing" id=oUMul namespace=ibm-sls phase=Pending
Hello @terenceq, thanks for submitting this issue and for using OLM.
For some reason, installplan status field is showing a replaces field pointing to the same CSV name, causing it to be added to my CSV. It is causing a loop as the CSV cannot replace itself.... What did add replaces field in my CSV?
This is expected behavior.
When OLM is determining whether an upgrade is available for an operator, it looks at the existing CSV and determines whether:
1. The existing CSV is named in the replaces field of a newer CSV.
2. The existing CSV is covered by the skips or skipRange field of a newer CSV.
If the existing CSV has an upgrade due to the second option, the newer CSV will have its replaces field set to the existing CSV version. This allows OLM to use a single process for upgrading CSVs on cluster.
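The two checks described above can be sketched roughly as follows. This is only an illustrative model, not OLM's actual code: the `CSV` dataclass, the `upgrade_reason` and `resolve_upgrade` helpers, and the idea of passing skipRange as a predicate are all assumptions made for the example.

```python
from dataclasses import dataclass, field, replace
from typing import Callable, List, Optional


@dataclass
class CSV:
    """Hypothetical, heavily simplified stand-in for a ClusterServiceVersion."""
    name: str
    version: str
    replaces: Optional[str] = None
    skips: List[str] = field(default_factory=list)
    # Assumption: skipRange is modelled as a predicate over version strings.
    skip_range: Optional[Callable[[str], bool]] = None


def upgrade_reason(existing: CSV, newer: CSV) -> Optional[str]:
    """Return which field makes `newer` an upgrade of `existing`, if any."""
    if newer.replaces == existing.name:
        return "replaces"
    if existing.name in newer.skips:
        return "skips"
    if newer.skip_range is not None and newer.skip_range(existing.version):
        return "skipRange"
    return None


def resolve_upgrade(existing: CSV, newer: CSV) -> Optional[CSV]:
    """Return the CSV to install, with replaces filled in, or None if no upgrade."""
    reason = upgrade_reason(existing, newer)
    if reason is None:
        return None
    if reason != "replaces":
        # The upgrade came from skips/skipRange, so replaces is filled in
        # to point at the CSV that is currently installed.
        newer = replace(newer, replaces=existing.name)
    return newer


old = CSV(name="example.v1.0.0", version="1.0.0")
new = CSV(name="example.v1.1.0", version="1.1.0",
          skip_range=lambda v: v == "1.0.0")
print(resolve_upgrade(old, new).replaces)  # example.v1.0.0
```

In this model, a CSV that never declared replaces itself can still end up with one, which is the behavior described in the reply.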
It is causing a loop as the CSV cannot replace itself.
This is happening because the skipRange you're using is >=1.0.0 <=99.0.0, whose upper bound is greater than the version of the CSV (v1.3.0-xxx), so the range includes the CSV's own version. You need to set the upper bound of the skipRange to less than the version of the CSV. In this case, it seems like you should set the skipRange to >=1.0.0 <SEMVER.
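To see why the range matters, here is a minimal sketch of a range check under plain semver precedence rules. It is not the library OLM uses; the `sort_key` and `in_range` helpers are hypothetical, support only space-separated comparators, and ignore build metadata.

```python
import operator
import re

_VERSION = re.compile(r"v?(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?$")
_COMPARATOR = re.compile(r"(>=|<=|>|<|=)(.+)$")
_OPS = {">=": operator.ge, "<=": operator.le,
        ">": operator.gt, "<": operator.lt, "=": operator.eq}


def sort_key(version):
    """Turn 'MAJOR.MINOR.PATCH[-prerelease]' into a tuple that sorts by semver precedence."""
    m = _VERSION.match(version)
    if m is None:
        raise ValueError(f"unsupported version: {version!r}")
    core = tuple(int(part) for part in m.group(1, 2, 3))
    pre = m.group(4)
    if pre is None:
        # A release sorts after every prerelease of the same core version.
        return core + (1, ())
    # Numeric identifiers sort before alphanumeric ones.
    ids = tuple((0, int(p), "") if p.isdigit() else (1, 0, p)
                for p in pre.split("."))
    return core + (0, ids)


def in_range(version, range_expr):
    """True if version satisfies every comparator in range_expr."""
    key = sort_key(version)
    for term in range_expr.split():
        m = _COMPARATOR.match(term)
        if m is None:
            raise ValueError(f"unsupported comparator: {term!r}")
        op, bound = m.groups()
        if not _OPS[op](key, sort_key(bound)):
            return False
    return True


own = "1.2.3-pre.stable"
print(in_range(own, ">=1.0.0 <=99.0.0"))           # True: the CSV falls in its own range
print(in_range(own, ">=1.0.0 <1.2.3-pre.stable"))  # False: its own version is excluded
```

With the original range, the CSV's own version satisfies the skipRange, which is consistent with the self-referencing replaces seen above. Note that under this ordering a prerelease such as 1.2.3-pre.stable still sorts below 1.2.3, so the upper bound should be the CSV's exact version.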
This issue is intermittent though. Sometimes it just works and I don't see replaces line in the CSV spec, which is very weird. How can the same catalog/channel show different deployment behaviors?
The replaces field is only set during upgrades. I suspect you've seen a blank replaces field when installing the operator from scratch rather than upgrading from an existing version. If this is happening at other times, please share the steps to reproduce.
Note: Edited for clarity.
Bug Report
What did you do? I created a catalog image without any replaces field in the CSV.
tnoppc is the default channel and has no replaces in the CSV.
What did you expect to see? Subscription, CSV and installplan should all work as usual, installing the operator deployment.
What did you see instead? Under which circumstances? For some reason, installplan status field is showing a replaces field pointing to the same CSV name, causing it to be added to my CSV. It is causing a loop as the CSV cannot replace itself. This issue is intermittent though. Sometimes it just works and I don't see replaces line in the CSV spec, which is very weird. How can the same catalog/channel show different deployment behaviors? What did add replaces field in my CSV? It is clearly not there when I look at the installplan config map before approving it.
Environment
b3aabf273e0ac0bd6e84d257332e2eac08f5e6cf
Openshift 4.8: Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.6+b82a451", GitCommit:"cefce093e4e5bc9a1916eb5a489ed37c7d467f6f", GitTreeState:"clean", BuildDate:"2022-02-05T06:58:30Z", GoVersion:"go1.16.6", Compiler:"gc", Platform:"linux/amd64"}
Possible Solution Identify why some OLM component is adding replaces field to my CSV.
Additional context My install plan shows replaces field even though I don't have it in the CSV.