NetApp / trident

Storage orchestrator for containers
Apache License 2.0

driver-registrar container fails in 24.02.0 #893

Closed brecht82 closed 5 months ago

brecht82 commented 6 months ago

Describe the bug

We are currently running version 23.10.0, installed through the operator with the Helm chart. When updating to version 24.02.0, the trident-node-linux pods go into the CrashLoopBackOff state because the driver-registrar container fails with:

"flag provided but not defined: -kubelet-registration-path"

When I describe trident-node-linux pod I can see:

  driver-registrar:
    Args:
      --v=2
      --csi-address=$(ADDRESS)
      --kubelet-registration-path=$(REGISTRATION_PATH)
    State:          Waiting
      Reason:       CrashLoopBackOff
    . . .
    Environment:
      ADDRESS:            /plugin/csi.sock
      REGISTRATION_PATH:  /var/lib/kubelet/plugins/csi.trident.netapp.io/csi.sock
      KUBE_NODE_NAME:      (v1:spec.nodeName)

So it is defined... After rolling back to version 23.10.0, it works as expected.

Environment Provide accurate information about the environment to help us reproduce the issue.

brecht82 commented 6 months ago

Just noticed that this Trident version is trying to use csi-node-driver-registrar:v2.10.0, which is listed for Kubernetes 1.29.0+, even though we run 1.27. According to this page, it should use 2.9.0: https://docs.netapp.com/us-en/trident/trident-get-started/requirements.html#container-images-and-corresponding-kubernetes-versions
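
For anyone who wants to double-check which registrar image a node pod actually runs, here is a minimal Python sketch. It assumes you have saved the pod object as JSON (for example from `kubectl get pod <pod-name> -n <namespace> -o json`). The helper name `container_images` and the trimmed sample pod are hypothetical and only for illustration; the container name `trident-main` in the sample is an assumption, not taken from the output above.

```python
import json


def container_images(pod: dict) -> dict:
    """Map container name -> image for every container in a pod object."""
    containers = pod.get("spec", {}).get("containers", [])
    return {c["name"]: c["image"] for c in containers}


# Hypothetical, heavily trimmed pod object; a real one has many more fields.
sample_pod = json.loads("""
{"spec": {"containers": [
  {"name": "trident-main",
   "image": "netapp/trident:24.02.0"},
  {"name": "driver-registrar",
   "image": "registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.10.0"}
]}}
""")

images = container_images(sample_pod)
# Print the image the driver-registrar container is configured with.
print(images["driver-registrar"])
```

Comparing the printed tag with the one you expect for your Kubernetes version shows whether the deployed spec matches what you intended to roll out.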

temirg commented 6 months ago

@brecht82: Years ago we received a recommendation from NetApp support to determine the required images as follows:

  1. download and extract trident installer
  2. export KUBECONFIG
  3. cd trident-installer
  4. ./tridentctl images

For RKE2 v1.27.x, which we are using now, the output is:

+--------------------+---------------------------------------------------------------+
| v1.27.0            | netapp/trident:24.02.0                                        |
|                    | docker.io/netapp/trident-autosupport:24.02                    |
|                    | registry.k8s.io/sig-storage/csi-provisioner:v4.0.0            |
|                    | registry.k8s.io/sig-storage/csi-attacher:v4.5.0               |
|                    | registry.k8s.io/sig-storage/csi-resizer:v1.9.3                |
|                    | registry.k8s.io/sig-storage/csi-snapshotter:v6.3.3            |
|                    | registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.10.0 |
|                    | netapp/trident-operator:24.02.0 (optional)                    |
+--------------------+---------------------------------------------------------------+

It has always worked reliably so far.
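
If you want to compare that list against a cluster automatically, the table above can be parsed with a few lines of Python. This is only a rough sketch: it assumes the two-column ASCII layout shown above, and the function name `parse_images_table` is hypothetical, not part of tridentctl.

```python
def parse_images_table(text: str) -> dict:
    """Parse a two-column ASCII table into {kubernetes_version: [images]}."""
    result = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip "+----+" borders and blank lines
        cells = [c.strip() for c in line.strip("|").split("|")]
        if len(cells) != 2:
            continue
        version, image = cells
        if version:
            # Only rows like "v1.27.0" start a new group; other text
            # (for example a header row) resets the group and is skipped.
            current = version if version.startswith("v") else None
        if current is None or not image:
            continue
        # Keep the image reference only, dropping notes such as "(optional)".
        result.setdefault(current, []).append(image.split()[0])
    return result


# Table text copied from the output shown in this thread.
sample = """
+--------------------+---------------------------------------------------------------+
| v1.27.0            | netapp/trident:24.02.0                                        |
|                    | docker.io/netapp/trident-autosupport:24.02                    |
|                    | registry.k8s.io/sig-storage/csi-provisioner:v4.0.0            |
|                    | registry.k8s.io/sig-storage/csi-attacher:v4.5.0               |
|                    | registry.k8s.io/sig-storage/csi-resizer:v1.9.3                |
|                    | registry.k8s.io/sig-storage/csi-snapshotter:v6.3.3            |
|                    | registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.10.0 |
|                    | netapp/trident-operator:24.02.0 (optional)                    |
+--------------------+---------------------------------------------------------------+
"""

table = parse_images_table(sample)
registrar = [i for i in table["v1.27.0"] if "csi-node-driver-registrar" in i]
print(registrar)
```

The resulting list per Kubernetes version can then be checked against the images in your private registry or in the running pods.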

Regards, temirg.

brecht82 commented 6 months ago

Yes, we have all the latest images ready to be deployed. The trident-operator requires specific versions and has them hardcoded somewhere, so we cannot change that.

vasum0406 commented 6 months ago

@brecht82 Trying to summarize the updates: does the issue occur only on MKE 3.7.5 and not on RKE2?

brecht82 commented 5 months ago

I'm closing the issue. It was probably caused by ArgoCD; a different sync option made it work.