Closed larsks closed 1 month ago
/CC @computate @schwesig
It looks like the ODF operator is stuck installing:
$ k get csv odf-operator.v4.15.5-rhodf
NAME                         DISPLAY                     VERSION        REPLACES                     PHASE
odf-operator.v4.15.5-rhodf   OpenShift Data Foundation   4.15.5-rhodf   odf-operator.v4.15.4-rhodf   Installing
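To check the phase without reading the table, the CSV's `status.phase` field can be queried directly. This is a minimal sketch, assuming `kubectl` access and that the CSV lives in the `openshift-storage` namespace (the namespace of the ConfigMap discussed in this issue). `csv_phase` is a hypothetical helper name, not something from the cluster tooling.

```shell
# csv_phase NAME: print the phase of a ClusterServiceVersion.
# Assumption: the CSV lives in the openshift-storage namespace.
csv_phase() {
  kubectl -n openshift-storage get csv "$1" -o jsonpath='{.status.phase}'
}
```

Calling `csv_phase odf-operator.v4.15.5-rhodf` should print the same phase that appears in the PHASE column above.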
My theory is that we can grab the missing ConfigMap from the production cluster, where it has this data:
apiVersion: v1
data:
  CSIADDONS_SUBSCRIPTION_CATALOGSOURCE: redhat-operators
  CSIADDONS_SUBSCRIPTION_CATALOGSOURCE_NAMESPACE: openshift-marketplace
  CSIADDONS_SUBSCRIPTION_CHANNEL: stable-4.15
  CSIADDONS_SUBSCRIPTION_NAME: odf-csi-addons-operator
  CSIADDONS_SUBSCRIPTION_PACKAGE: odf-csi-addons-operator
  CSIADDONS_SUBSCRIPTION_STARTINGCSV: odf-csi-addons-operator.v4.15.5-rhodf
  IBM_SUBSCRIPTION_CATALOGSOURCE: certified-operators
  IBM_SUBSCRIPTION_CATALOGSOURCE_NAMESPACE: openshift-marketplace
  IBM_SUBSCRIPTION_CHANNEL: stable-v1.4
  IBM_SUBSCRIPTION_NAME: ibm-storage-odf-operator
  IBM_SUBSCRIPTION_PACKAGE: ibm-storage-odf-operator
  IBM_SUBSCRIPTION_STARTINGCSV: ibm-storage-odf-operator.v1.4.1
  NOOBAA_SUBSCRIPTION_CATALOGSOURCE: redhat-operators
  NOOBAA_SUBSCRIPTION_CATALOGSOURCE_NAMESPACE: openshift-marketplace
  NOOBAA_SUBSCRIPTION_CHANNEL: stable-4.15
  NOOBAA_SUBSCRIPTION_NAME: mcg-operator
  NOOBAA_SUBSCRIPTION_PACKAGE: mcg-operator
  NOOBAA_SUBSCRIPTION_STARTINGCSV: mcg-operator.v4.15.5-rhodf
  OCS_SUBSCRIPTION_CATALOGSOURCE: redhat-operators
  OCS_SUBSCRIPTION_CATALOGSOURCE_NAMESPACE: openshift-marketplace
  OCS_SUBSCRIPTION_CHANNEL: stable-4.15
  OCS_SUBSCRIPTION_NAME: ocs-operator
  OCS_SUBSCRIPTION_PACKAGE: ocs-operator
  OCS_SUBSCRIPTION_STARTINGCSV: ocs-operator.v4.15.5-rhodf
  controller_manager_config.yaml: |
    apiVersion: controller-runtime.sigs.k8s.io/v1alpha1
    kind: ControllerManagerConfig
    health:
      healthProbeBindAddress: :8081
    metrics:
      bindAddress: 127.0.0.1:8080
    leaderElection:
      leaderElect: true
      resourceName: 4fd470de.openshift.io
kind: ConfigMap
metadata:
  labels:
    olm.managed: "true"
    operators.coreos.com/odf-operator.openshift-storage: ""
  name: odf-operator-manager-config
  namespace: openshift-storage
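Copying the ConfigMap across clusters could look roughly like the sketch below. It is untested against these clusters and rests on several assumptions: the context names `prod` and `affected` are hypothetical placeholders for the real kubeconfig contexts, and the helper names are made up for illustration. The `sed` filter only removes single-line, server-populated metadata fields. If the exported object also carries an `ownerReferences` block, that would need to be removed by hand, since it would point at objects in the production cluster.

```shell
# Hypothetical kubeconfig context names; substitute the real ones.
PROD_CONTEXT="${PROD_CONTEXT:-prod}"
TARGET_CONTEXT="${TARGET_CONTEXT:-affected}"

# Drop server-populated metadata fields (two-space indent, i.e. directly
# under "metadata:") so the object can be created fresh on another cluster.
strip_server_fields() {
  sed -E '/^  (resourceVersion|uid|creationTimestamp):/d'
}

# Export the ConfigMap from production and apply it to the target cluster.
copy_configmap() {
  kubectl --context "$PROD_CONTEXT" -n openshift-storage \
    get configmap odf-operator-manager-config -o yaml \
    | strip_server_fields \
    | kubectl --context "$TARGET_CONTEXT" -n openshift-storage apply -f -
}
```

Running `copy_configmap` performs the copy. Saving the filtered YAML to a file and reviewing it before applying would be a more cautious variant.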
It looks like we will also need the 4fd470de.openshift.io ConfigMap.
@schwesig is going to open a customer support case and ask (a) if they can help figure out how things got into this state in the first place, and (b) if the suggestion in my previous comment seems reasonable.
The problem from earlier (https://access.redhat.com/support/cases/#/case/03861871) was more or less the trigger to dig deeper into this one: https://access.redhat.com/support/cases/#/case/03908442
The ODF operator update seems to be successful now (after the maintenance restart on Sept 5th). The NooBaa acm-metrics backing store is still causing issues.
Update on the RH support ticket: the problem (ODF update failing) does not seem to be widely known.
Icebox until this is solved: https://github.com/nerc-project/operations/issues/745
The degradation part is solved. We are now focused on the NooBaa and scaling-down problem. RH support also wants us to open a new ticket because this problem is solved, so I am closing this. If we come back to this problem after solving the others, we can reopen it or open a new one.
RH support case: https://access.redhat.com/support/cases/#/case/03908442
Thorsten and Chris were experiencing issues with the acm-metrics-backing-store. They deleted the pods associated with this backing store, and the pods failed to come back. Upon investigation, the odf-operator-controller-manager pod is in a failed state:
Inspecting the container statuses, we see:
And indeed, the odf-operator-manager-config ConfigMap does not exist.
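For anyone repeating this diagnosis, the two checks described above can be sketched as follows. This is an illustration rather than the exact commands that were run: the helper names are hypothetical, the full pod name (including its generated suffix) has to be supplied by the caller, and the namespace is assumed to be `openshift-storage`.

```shell
NAMESPACE="openshift-storage"

# container_waiting_reasons POD: print each container's name and, if the
# container is in a waiting state, the reason (empty otherwise).
container_waiting_reasons() {
  kubectl -n "$NAMESPACE" get pod "$1" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.state.waiting.reason}{"\n"}{end}'
}

# configmap_status NAME: print "present" or "missing" for a ConfigMap.
configmap_status() {
  if kubectl -n "$NAMESPACE" get configmap "$1" >/dev/null 2>&1; then
    echo "present"
  else
    echo "missing"
  fi
}
```

With the state described in this issue, `configmap_status odf-operator-manager-config` would be expected to print `missing`.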