dell / csm

Dell Container Storage Modules (CSM)
Apache License 2.0

[BUG]: Operator v1.8.0 declares Unity configVersion v2.3.0 as "unknown driver config version" #417

Closed jmyoung closed 1 year ago

jmyoung commented 2 years ago

Bug Description

Running OpenShift 4.9 with the Dell-CSI-Operator installed. I verified that the v1.8.0 operator is installed (upgrade policy is set to Automatic).

Following the directions at https://dell.github.io/csm-docs/docs/csidriver/installation/operator/#update-csi-drivers, I changed configVersion from v2.2.0 to v2.3.0 for the CSIUnity driver and updated the sidecar and image versions to match. On doing this, the operator reports:

2022-08-11T04:36:42.494Z        ERROR   controllers.CSIUnity    Validation error        {"csiunity": "dell-csi-config/unity-prd", "Namespace": "dell-csi-config", "Name": "unity-prd", "Attempt": 467, "error": "unknown driver config version"}

The driver is not upgraded. I looked inside the operator's container and found that its copy of https://github.com/dell/dell-csi-operator/blob/v1.8.0/driverconfig/config.yaml matches the v1.8.0 operator and includes configuration for a v2.3.0 CSIUnity configVersion, yet the operator still rejects that version.
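One way to repeat this check without opening a shell in the container is to stream the mounted file out of the pod and search it for the version string. This is a minimal sketch, not the operator's own validation: `has_version` is a hypothetical helper, the Deployment name is an assumption inferred from the operator pod name that appears later in this thread, and the file path is the one the operator logs when it reads its defaults.

```shell
#!/bin/sh
# has_version VERSION: read YAML text on stdin and succeed if VERSION
# appears as a whole word anywhere in it (hypothetical helper).
has_version() {
    grep -qwF -- "$1"
}

# Assumed usage against a live cluster (Deployment name is an assumption):
#   oc exec -n dell-csi-operator deploy/dell-csi-operator-controller-manager -- \
#       cat /etc/config/dell-csi-operator/config.yaml | has_version v2.3.0 \
#       && echo "v2.3.0 is listed" || echo "v2.3.0 is missing"
```

A match only shows that the string is present somewhere in the file. It does not show that the entry belongs to the Unity driver or that the operator accepts it for the detected platform version.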

Logs

2022-08-11T06:01:25.766Z INFO controllers.CSIUnity ################End Reconcile############## {"csiunity": "dell-csi-config-test/unity-test", "Namespace": "dell-csi-config-test", "Name": "unity-test", "Attempt": 6}
2022-08-11T06:02:47.653Z INFO controllers.CSIUnity Reconciling unity {"csiunity": "dell-csi-config-test/unity-test", "Namespace": "dell-csi-config-test", "Name": "unity-test", "Attempt": 7, "request": "dell-csi-config-test/unity-test"}
2022-08-11T06:02:47.653Z INFO controllers.CSIUnity ################Starting Reconcile############## {"csiunity": "dell-csi-config-test/unity-test", "Namespace": "dell-csi-config-test", "Name": "unity-test", "Attempt": 7}
2022-08-11T06:02:47.658Z INFO controller_config Reading file for default image tags {"Filename": "/etc/config/dell-csi-operator/config.yaml"}
2022-08-11T06:02:47.660Z ERROR controllers.CSIUnity Failed to initialize driver config {"csiunity": "dell-csi-config-test/unity-test", "error": "unknown driver config version"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:227
2022-08-11T06:02:47.660Z ERROR controllers.CSIUnity Validation error {"csiunity": "dell-csi-config-test/unity-test", "Namespace": "dell-csi-config-test", "Name": "unity-test", "Attempt": 7, "error": "unknown driver config version"}
github.com/dell/dell-csi-operator/controllers.(*CSIUnityReconciler).Reconcile
        /workspace/controllers/csiunity_controller.go:84
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:227
2022-08-11T06:02:47.660Z INFO controllers.CSIUnity Marking the driver status as InvalidConfig {"csiunity": "dell-csi-config-test/unity-test", "Namespace": "dell-csi-config-test", "Name": "unity-test", "Attempt": 7}
2022-08-11T06:02:47.660Z INFO controllers.CSIUnity No change to status. No updates will be applied to CR status {"csiunity": "dell-csi-config-test/unity-test", "Namespace": "dell-csi-config-test", "Name": "unity-test", "Attempt": 7}
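Because the stack traces dominate this output, it can help to reduce the log to its ERROR entries. The sketch below assumes the log has one entry per line, as `oc logs` emits it, with whitespace-separated timestamp, level, and logger fields followed by a message and a JSON object. `error_summary` is a hypothetical helper name, not part of the operator or of `oc`.

```shell
#!/bin/sh
# error_summary: read operator log text on stdin and print one line per
# ERROR entry in the form "<timestamp> <logger> <error text>".
error_summary() {
    awk '$2 == "ERROR" {
        err = ""
        # Pull the value of the "error" key out of the trailing JSON, if present.
        if (match($0, /"error": "[^"]*"/)) {
            # Skip the 10-character prefix ("error": ") and the closing quote.
            err = substr($0, RSTART + 10, RLENGTH - 11)
        }
        print $1, $3, err
    }'
}
```

Stack-trace lines are skipped because their second field is never `ERROR`. Piping the result through `sort | uniq -c` would give a count per distinct error.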

Screenshots

No response

Additional Environment Information

No response

Steps to Reproduce

  1. Have existing CSIUnity configuration
  2. Change configVersion to "v2.3.0"

Expected Behavior

I expected the upgrade to take place

CSM Driver(s)

CSI Driver for Unity XT 2.3.0

Installation Type

Operator v1.8.0

Container Storage Modules Enabled

No response

Container Orchestrator

OpenShift 4.9

Operating System

RHCOS

rensyct commented 2 years ago

Hi @jmyoung, could you please share the Unity driver manifest file that was modified?

jmyoung commented 2 years ago

apiVersion: storage.dell.com/v1
kind: CSIUnity
metadata:
  name: unity-test
  namespace: dell-csi-config-test
spec:
  driver:
    configVersion: v2.3.0
    replicas: 2
    dnsPolicy: ClusterFirstWithHostNet
    common:
      image: "dellemc/csi-unity:v2.3.0"
      imagePullPolicy: IfNotPresent
    sideCars:
      - name: provisioner
        args: ["--volume-name-prefix=unityocpblue","--default-fstype=ext4"]
      - name: snapshotter
        args: ["--snapshot-name-prefix=unityocpbluesnap"]
    controller:
       nodeSelector:
         node-role.kubernetes.io/infra: ""
    node:
       envs:
          # Set to 'true' to enable full debug logging
          - name: X_CSI_DEBUG
            value: "false"

Note that this happens both when upgrading (i.e., editing the existing manifest) and when creating this manifest fresh.

rensyct commented 2 years ago

Thank you @jmyoung for sending the manifest. I have a few questions:

  1. You mentioned that you changed the sidecar versions, but I don't see those versions in the above manifest file.
  2. I believe the manifest file had a few more fields. Are some fields missing from the above manifest file?
  3. Do you generally use the OCP UI, or only the CLI?
  4. Please describe the steps you followed to create this manifest fresh (for example, uninstalling the driver and then installing it with the new manifest, or some other steps).
  5. Was the driver previously installed via the CLI or the UI?

Thanks in advance, Rensy

jmyoung commented 2 years ago

Thanks for the reply. Answers follow:

  1. I replicated the problem with a separate manifest in a new namespace, to ensure there wasn't any problem with my initial YAML. That's what I've given you here - the above manifest also demonstrates the problem.
  2. The above manifest was created by taking the sample unity_v230_ops_49.yaml manifest at https://github.com/dell/dell-csi-operator/blob/v1.8.0/samples/unity_v230_ops_49.yaml and substituting in relevant values.
  3. CLI only is used.
  4. Creating fresh is done by deleting the manifest, then deleting the entire namespace that the driver was deployed to (well, not deployed to in this case), then re-creating the namespace and deploying the manifest again. This ensures there are no residual components left behind in the driver's namespace that may interfere.
  5. Driver has always been deployed via CLI, never by UI.
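The clean re-create described in answer 4 can be sketched as a small function. This is an illustration of the sequence only, not the exact commands that were run: `recreate_driver` is a hypothetical helper, and the `OC` variable is an assumption added so the commands can be printed with `OC=echo` instead of executed.

```shell
#!/bin/sh
# OC can be overridden (for example OC=echo) to print the commands
# instead of running them against a cluster.
OC="${OC:-oc}"

# recreate_driver NAMESPACE MANIFEST: delete the CR, delete and re-create
# its namespace, then apply the manifest again (hypothetical helper).
recreate_driver() {
    ns=$1
    manifest=$2
    $OC delete -f "$manifest" --ignore-not-found &&
    $OC delete namespace "$ns" --ignore-not-found --wait=true &&
    $OC create namespace "$ns" &&
    $OC apply -f "$manifest"
}
```

A dry run such as `OC=echo recreate_driver dell-csi-config-test unity.yaml` prints the four commands in order; `unity.yaml` is a placeholder file name. Any secrets or ConfigMaps the driver needs would have to be re-created before the final apply, and that step is omitted here.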

The problem shows up with any manifest that has configVersion set to v2.3.0. Performing the equivalent process with the v2.2.0 manifest (as per https://github.com/dell/dell-csi-operator/blob/v1.8.0/samples/powerstore_v220_ops_49.yaml ) results in the operator deploying pods as one would expect.

rensyct commented 2 years ago

Thank you @jmyoung for responding to all the queries I had. Apologies for the delay in response from my side as we had a public holiday in India yesterday. Please help share the output of the below commands from your environment as I want to ensure that operator upgrade went through without issues.

  1. oc get pods -n (namespace where dell-csi-operator is installed)
  2. oc get csv -n (namespace where dell-csi-operator is installed)
  3. oc logs (dell-csi-operator-pod-name) -n (namespace where dell-csi-operator is installed)
  4. oc get pods -n (namespace where driver is installed)
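For convenience, the four commands above can be wrapped in one function that takes the placeholders as arguments. This is a sketch under stated assumptions: `collect_diag` is a hypothetical helper, and the `OC` override exists only so the commands can be previewed with `OC=echo`.

```shell
#!/bin/sh
# OC can be overridden (for example OC=echo) to preview the commands.
OC="${OC:-oc}"

# collect_diag OPERATOR_NS OPERATOR_POD DRIVER_NS: run the four requested
# commands in order (hypothetical helper).
collect_diag() {
    operator_ns=$1
    operator_pod=$2
    driver_ns=$3
    $OC get pods -n "$operator_ns"
    $OC get csv -n "$operator_ns"
    $OC logs "$operator_pod" -n "$operator_ns"
    $OC get pods -n "$driver_ns"
}
```

Redirecting the output to a file, for example `collect_diag <operator-ns> <operator-pod> <driver-ns> > diag.txt 2>&1`, keeps everything in one attachment.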

Thanks in advance Rensy

jmyoung commented 2 years ago

Results appear below. Note that I've trimmed the logs due to them being quite extensive, and left the latest set of repeating lines. dell-csi-config/unity-prd corresponds to the CSIUnity manifest I attempted to upgrade, and dell-csi-config-test/unity-test corresponds to the CSIUnity manifest I created in a new namespace to replicate the issue.

$ oc get pods -n dell-csi-operator
NAME                                                    READY   STATUS    RESTARTS   AGE
dell-csi-operator-controller-manager-76c68f5866-zd8dl   1/1     Running   0          5d17h

$ oc get csv -n dell-csi-operator
NAME                               DISPLAY                    VERSION   REPLACES                           PHASE
dell-csi-operator.v1.8.0           Dell CSI Operator          1.8.0     dell-csi-operator.v1.7.0           Succeeded
gitlab-runner-operator.v1.8.0      GitLab Runner              1.8.0                                        Succeeded
openshift-gitops-operator.v1.5.5   Red Hat OpenShift GitOps   1.5.5     openshift-gitops-operator.v1.5.4   Succeeded

$ oc logs dell-csi-operator-controller-manager-76c68f5866-zd8dl -n dell-csi-operator
2022-08-11T05:42:20.338Z        INFO    cmd     Operator Version        {"Version": "1.8.0", "Commit ID": "1e3ff54a5495bc7fc75d8ba01b4c7e9db6b11940", "Commit SHA": "Fri, 10 Jun 2022 11:28:42 UTC"}
2022-08-11T05:42:20.338Z        INFO    cmd     Go Version: go1.18.3
2022-08-11T05:42:20.338Z        INFO    cmd     Go OS/Arch: linux/amd64
2022-08-11T05:42:20.345Z        INFO    cmd     Kubernetes Version: v122
I0811 05:42:21.398709       1 request.go:665] Waited for 1.04585591s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/capabilities.3scale.net/v1beta1?timeout=32s
2022-08-11T05:42:23.899Z        INFO    cmd     Detected OpenShift API groups
2022-08-11T05:42:27.454Z        INFO    controller-runtime.metrics      metrics server is starting to listen    {"addr": ":9999"}
2022-08-11T05:42:27.454Z        INFO    setup   starting manager
I0811 05:42:27.454718       1 leaderelection.go:248] attempting to acquire leader lease dell-csi-operator/7e980ba4.dell.com...
2022-08-11T05:42:27.454Z        INFO    starting metrics server {"path": "/metrics"}
I0811 05:42:44.641355       1 leaderelection.go:258] successfully acquired lease dell-csi-operator/7e980ba4.dell.com
2022-08-11T05:42:44.641Z        INFO    controller.CSIPowerMaxRevProxy  Starting EventSource    {"source": "kind source: /, Kind="}
2022-08-11T05:42:44.641Z        INFO    controller.CSIVXFlexOS  Starting EventSource    {"source": "kind source: /, Kind="}
2022-08-11T05:42:44.641Z        INFO    controller.CSIVXFlexOS  Starting EventSource    {"source": "kind source: /, Kind="}
2022-08-11T05:42:44.641Z        DEBUG   events  Normal  {"object": {"kind":"ConfigMap","namespace":"dell-csi-operator","name":"7e980ba4.dell.com","uid":"d957daee-ffcf-4678-b414-fb1ed92b798d","apiVersion":"v1","resourceVersion":"248047994"}, "reason": "LeaderElection", "message": "dell-csi-operator-controller-manager-76c68f5866-zd8dl_ca2eeff5-209e-44c9-ab3d-74df2f1edecf became leader"}
2022-08-11T05:42:44.641Z        INFO    controller.CSIVXFlexOS  Starting EventSource    {"source": "kind source: /, Kind="}
2022-08-11T05:42:44.641Z        INFO    controller.CSIPowerStore        Starting EventSource    {"source": "kind source: /, Kind="}
2022-08-11T05:42:44.641Z        INFO    controller.CSIPowerStore        Starting EventSource    {"source": "kind source: /, Kind="}
2022-08-11T05:42:44.641Z        INFO    controller.CSIPowerStore        Starting EventSource    {"source": "kind source: /, Kind="}
2022-08-11T05:42:44.641Z        INFO    controller.CSIPowerMaxRevProxy  Starting EventSource    {"source": "kind source: /, Kind="}
2022-08-11T05:42:44.641Z        INFO    controller.CSIPowerMaxRevProxy  Starting EventSource    {"source": "kind source: /, Kind="}
2022-08-11T05:42:44.641Z        INFO    controller.CSIPowerMaxRevProxy  Starting Controller
2022-08-11T05:42:44.641Z        INFO    controller.CSIPowerStore        Starting Controller
2022-08-11T05:42:44.641Z        INFO    controller.CSIPowerMax  Starting EventSource    {"source": "kind source: /, Kind="}
2022-08-11T05:42:44.642Z        INFO    controller.CSIPowerMax  Starting EventSource    {"source": "kind source: /, Kind="}
2022-08-11T05:42:44.642Z        INFO    controller.CSIUnity     Starting EventSource    {"source": "kind source: /, Kind="}
2022-08-11T05:42:44.642Z        INFO    controller.CSIUnity     Starting EventSource    {"source": "kind source: /, Kind="}
2022-08-11T05:42:44.642Z        INFO    controller.CSIUnity     Starting EventSource    {"source": "kind source: /, Kind="}
2022-08-11T05:42:44.642Z        INFO    controller.CSIUnity     Starting Controller
2022-08-11T05:42:44.642Z        DEBUG   events  Normal  {"object": {"kind":"Lease","namespace":"dell-csi-operator","name":"7e980ba4.dell.com","uid":"d51c81b6-6a4b-4d90-9c7d-117b120f73f4","apiVersion":"coordination.k8s.io/v1","resourceVersion":"248047996"}, "reason": "LeaderElection", "message": "dell-csi-operator-controller-manager-76c68f5866-zd8dl_ca2eeff5-209e-44c9-ab3d-74df2f1edecf became leader"}
2022-08-11T05:42:44.642Z        INFO    controller.CSIIsilon    Starting EventSource    {"source": "kind source: /, Kind="}
2022-08-11T05:42:44.642Z        INFO    controller.CSIIsilon    Starting EventSource    {"source": "kind source: /, Kind="}
2022-08-11T05:42:44.642Z        INFO    controller.CSIIsilon    Starting EventSource    {"source": "kind source: /, Kind="}
2022-08-11T05:42:44.642Z        INFO    controller.CSIIsilon    Starting Controller
2022-08-11T05:42:44.642Z        INFO    controller.CSIVXFlexOS  Starting Controller
2022-08-11T05:42:44.642Z        INFO    controller.CSIPowerMax  Starting EventSource    {"source": "kind source: /, Kind="}
2022-08-11T05:42:44.642Z        INFO    controller.CSIPowerMax  Starting Controller
2022-08-11T05:42:44.743Z        INFO    controller.CSIUnity     Starting workers        {"worker count": 1}
...
2022-08-16T20:16:14.524Z        INFO    controllers.CSIUnity    Reconciling unity       {"csiunity": "dell-csi-config-test/unity-test", "Namespace": "dell-csi-config-test", "Name": "unity-test", "Attempt": 60, "request": "dell-csi-config-test/unity-test"}
2022-08-16T20:16:14.524Z        INFO    controllers.CSIUnity    ################Starting Reconcile##############       {"csiunity": "dell-csi-config-test/unity-test", "Namespace": "dell-csi-config-test", "Name": "unity-test", "Attempt": 60}
2022-08-16T20:16:14.532Z        INFO    controller_config       Reading file for default image tags     {"Filename": "/etc/config/dell-csi-operator/config.yaml"}
2022-08-16T20:16:14.533Z        ERROR   controllers.CSIUnity    Failed to initialize driver config      {"csiunity": "dell-csi-config-test/unity-test", "error": "unknown driver config version"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:227
2022-08-16T20:16:14.533Z        ERROR   controllers.CSIUnity    Validation error        {"csiunity": "dell-csi-config-test/unity-test", "Namespace": "dell-csi-config-test", "Name": "unity-test", "Attempt": 60, "error": "unknown driver config version"}
github.com/dell/dell-csi-operator/controllers.(*CSIUnityReconciler).Reconcile
        /workspace/controllers/csiunity_controller.go:84
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:227
2022-08-16T20:16:14.533Z        INFO    controllers.CSIUnity    Marking the driver status as InvalidConfig      {"csiunity": "dell-csi-config-test/unity-test", "Namespace": "dell-csi-config-test", "Name": "unity-test", "Attempt": 60}
2022-08-16T20:16:14.533Z        INFO    controllers.CSIUnity    No change to status. No updates will be applied to CR status    {"csiunity": "dell-csi-config-test/unity-test", "Namespace": "dell-csi-config-test", "Name": "unity-test", "Attempt": 60}
2022-08-16T20:16:14.533Z        ERROR   controllers.CSIUnity    *************Create/Update unity failed ********       {"csiunity": "dell-csi-config-test/unity-test", "Namespace": "dell-csi-config-test", "Name": "unity-test", "Attempt": 60, "error": "unknown driver config version"}
github.com/dell/dell-csi-operator/controllers.(*CSIUnityReconciler).Reconcile
        /workspace/controllers/csiunity_controller.go:84
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.10.0/pkg/internal/controller/controller.go:227
2022-08-16T20:16:14.534Z        INFO    controllers.CSIUnity    ################End Reconcile##############     {"csiunity": "dell-csi-config-test/unity-test", "Namespace": "dell-csi-config-test", "Name": "unity-test", "Attempt": 60}
2022-08-16T20:16:14.534Z        INFO    controllers.CSIUnity    Reconciling unity       {"csiunity": "dell-csi-config/unity-prd", "Namespace": "dell-csi-config", "Name": "unity-prd", "Attempt": 61, "request": "dell-csi-config/unity-prd"}
2022-08-16T20:16:14.534Z        INFO    controllers.CSIUnity    ################Starting Reconcile##############       {"csiunity": "dell-csi-config/unity-prd", "Namespace": "dell-csi-config", "Name": "unity-prd", "Attempt": 61}
2022-08-16T20:16:14.540Z        INFO    controller_config       Reading file for default image tags     {"Filename": "/etc/config/dell-csi-operator/config.yaml"}
2022-08-16T20:16:14.541Z        INFO    controllers.CSIUnity    Driver was previously in (InvalidConfig) state  {"csiunity": "dell-csi-config/unity-prd", "Namespace": "dell-csi-config", "Name": "unity-prd", "Attempt": 61}
2022-08-16T20:16:14.541Z        INFO    controllers.CSIUnity    No changes detected in the driver spec  {"csiunity": "dell-csi-config/unity-prd", "Namespace": "dell-csi-config", "Name": "unity-prd", "Attempt": 61}
2022-08-16T20:16:14.541Z        INFO    controllers.CSIUnity    CR is in (InvalidConfig) state. Reconcile request won't be requeued     {"csiunity": "dell-csi-config/unity-prd", "Namespace": "dell-csi-config", "Name": "unity-prd", "Attempt": 61}
2022-08-16T20:16:14.541Z        INFO    controllers.CSIUnity    ################End Reconcile##############     {"csiunity": "dell-csi-config/unity-prd", "Namespace": "dell-csi-config", "Name": "unity-prd", "Attempt": 61}

$ oc get pods -n unity-test
No resources found in unity-test namespace.

rensyct commented 2 years ago

Thank you @jmyoung for sharing the logs and the output of the requested commands. The operator installation looks good, so the issue is with the way the driver was installed or upgraded. The logs show Unity driver entries for both the dell-csi-config and dell-csi-config-test namespaces. Could you please attach the list of pods running in these namespaces? Please also share the output of the below command.

You mentioned that you also tried installing Unity driver v2.3.0 in another namespace, which I assume is dell-csi-config-test. Please confirm whether the prerequisites (secrets) were created in that namespace, and share the output of the below command too.

Thanks in advance Rensy

jmyoung commented 2 years ago

The dell-csi-config namespace holds all the pods for the v2.2.0 driver, as would be expected. The list has been trimmed for clarity due to the number of pods:

NAME                                READY   STATUS    RESTARTS        AGE
unity-controller-55ccfd94fb-bxwtb   5/5     Running   0               7d18h
unity-controller-55ccfd94fb-n5vmz   5/5     Running   0               7d18h
unity-node-29k7v                    2/2     Running   2 (7d18h ago)   7d18h
...
unity-node-zhjmf                    2/2     Running   2 (7d18h ago)   7d18h

(i.e., the new config is not being parsed because of the configVersion parameter, so the existing installation is not changed.)

There is no secret in the dell-csi-config-test namespace, but the operator doesn't even get that far (i.e., it refuses to parse the config because of the configVersion parameter). Since the presence of that configuration seems to be confusing the issue, I've deleted the namespace.

The existing unity-prd csiunity config appears below. Note that the deployed pods are from the v2.2.0 version of the config, not the v2.3.0 config shown below (I've trimmed the repeating list of nodes):

apiVersion: storage.dell.com/v1
kind: CSIUnity
metadata:
  annotations:
    storage.dell.com/CSIDriverConfigVersion: v2.2.0
    storage.dell.com/attacher.Image: k8s.gcr.io/sig-storage/csi-attacher:v3.4.0
    storage.dell.com/attacher.Image.IsDefault: "true"
    storage.dell.com/provisioner.Image: k8s.gcr.io/sig-storage/csi-provisioner:v3.1.0
    storage.dell.com/provisioner.Image.IsDefault: "true"
    storage.dell.com/registrar.Image: k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.5.0
    storage.dell.com/registrar.Image.IsDefault: "true"
    storage.dell.com/resizer.Image: k8s.gcr.io/sig-storage/csi-resizer:v1.4.0
    storage.dell.com/resizer.Image.IsDefault: "true"
    storage.dell.com/snapshotter.Image: k8s.gcr.io/sig-storage/csi-snapshotter:v5.0.1
    storage.dell.com/snapshotter.Image.IsDefault: "true"
  creationTimestamp: "2022-08-11T05:06:33Z"
  finalizers:
  - finalizer.dell.emc.com
  generation: 4
  labels:
    app.kubernetes.io/instance: dell-csi-config
  name: unity-prd
  namespace: dell-csi-config
  resourceVersion: "260328709"
  uid: 9b56bfd1-f9a9-4057-9c84-3bafc6cf65b9
spec:
  driver:
    common:
      image: dellemc/csi-unity:v2.3.0
      imagePullPolicy: IfNotPresent
    configVersion: v2.3.0
    controller:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
    dnsPolicy: ClusterFirstWithHostNet
    node:
      envs:
      - name: X_CSI_DEBUG
        value: "false"
    replicas: 2
    sideCars:
    - args:
      - --volume-name-prefix=unityocpblue
      - --default-fstype=ext4
      image: k8s.gcr.io/sig-storage/csi-provisioner:v3.1.0
      imagePullPolicy: IfNotPresent
      name: provisioner
    - args:
      - --snapshot-name-prefix=unityocpbluesnap
      image: k8s.gcr.io/sig-storage/csi-snapshotter:v5.0.1
      imagePullPolicy: IfNotPresent
      name: snapshotter
    - image: k8s.gcr.io/sig-storage/csi-attacher:v3.4.0
      imagePullPolicy: IfNotPresent
      name: attacher
    - image: k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.5.1
      imagePullPolicy: IfNotPresent
      name: registrar
    - image: k8s.gcr.io/sig-storage/csi-resizer:v1.4.0
      imagePullPolicy: IfNotPresent
      name: resizer
status:
  controllerStatus:
    available:
    - unity-controller-55ccfd94fb-n5vmz
    - unity-controller-55ccfd94fb-bxwtb
  driverHash: 1031264069
  lastUpdate:
    condition: InvalidConfig
    errorMessage: unknown driver config version
    time: "2022-08-11T05:34:49Z"
  nodeStatus:
    available:
    - unity-node-hj2m8
    ...
    - unity-node-9vlwx
  state: InvalidConfig

rensyct commented 2 years ago

Thank you @jmyoung for sharing the manifests and the output of the requested commands. I see that the config version in the annotations is not as expected. Please run the command below and make the following changes.

Command: kubectl edit csiunity/unity-test -n (namespace where driver pods are created)

  1. Modify the 5th line in the above manifest (under annotations) from storage.dell.com/CSIDriverConfigVersion: v2.2.0 to storage.dell.com/CSIDriverConfigVersion: v2.3.0
  2. Modify the 10th line in the above manifest (under annotations) from storage.dell.com/registrar.Image: k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.5.0 to storage.dell.com/registrar.Image: k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.5.1
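The same two annotation changes could also be applied non-interactively with `kubectl annotate --overwrite` rather than in an editor. This is an alternative sketch, not what was run in this thread: `set_version_annotations` is a hypothetical helper, and the `KUBECTL` override is an assumption added so the command can be previewed with `KUBECTL=echo`.

```shell
#!/bin/sh
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the command.
KUBECTL="${KUBECTL:-kubectl}"

# set_version_annotations NAME NAMESPACE: overwrite the two annotations
# named in the steps above on a CSIUnity resource (hypothetical helper).
set_version_annotations() {
    $KUBECTL annotate csiunity "$1" -n "$2" --overwrite \
        storage.dell.com/CSIDriverConfigVersion=v2.3.0 \
        storage.dell.com/registrar.Image=k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.5.1
}
```

For example, `set_version_annotations unity-prd dell-csi-config` would target the resource whose manifest is shown above.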

Please also ensure that the secrets are present in the dell-csi-config namespace.

Thanks in advance Rensy

rensyct commented 2 years ago

Hi @jmyoung, any updates?

jmyoung commented 2 years ago

Made changes as suggested to csiunity/unity-prd in namespace dell-csi-config. No effect. Timestamp under status.lastUpdate has not changed, and condition is still InvalidConfig.
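The fields mentioned here can be read directly with a JSONPath query instead of dumping the whole resource. A minimal sketch, assuming the status layout shown in the manifest earlier in this thread (`status.lastUpdate.condition` and `status.lastUpdate.time`); `last_update` is a hypothetical helper and the `KUBECTL` override is an assumption for previewing the command.

```shell
#!/bin/sh
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the command.
KUBECTL="${KUBECTL:-kubectl}"

# last_update NAME NAMESPACE: print "<condition> <time>" taken from
# status.lastUpdate of a CSIUnity resource (hypothetical helper).
last_update() {
    $KUBECTL get csiunity "$1" -n "$2" \
        -o jsonpath='{.status.lastUpdate.condition}{" "}{.status.lastUpdate.time}{"\n"}'
}
```

Running `last_update unity-prd dell-csi-config` before and after an edit shows whether the operator has reprocessed the resource: an unchanged time means no new reconcile result was recorded.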

rensyct commented 2 years ago

Could we get on a call so that I can check the environment? I work in the IST time zone.

rensyct commented 2 years ago

Thank you @jmyoung for joining the call late your time today. This helped resolve the issue faster. As discussed during the call, I tried an upgrade of the driver via operator using CLI. Please find the steps below

It looks like something went wrong during the upgrade process in your environment, which is why the driver pods were not restarting.

Now that the issue is resolved, please confirm whether this issue can be closed. Thanks in advance, Rensy