IBM / ibm-spectrum-scale-csi

The IBM Spectrum Scale Container Storage Interface (CSI) project enables container orchestrators, such as Kubernetes and OpenShift, to manage the life-cycle of persistent storage.
Apache License 2.0

Liveness probe failed: Get "http://172.30.254.5:8080/healthz/leader-election": context deadline exceeded (Client.Timeout exceeded while awaiting headers) #1209

Closed: corrtia closed this issue 2 months ago

corrtia commented 2 months ago

Describe the bug

The liveness probes of the attacher, snapshotter, provisioner, and resizer pods are failing.
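
For context, the state and restart counts of the pods in the driver namespace can be listed with a standard kubectl command (namespace as used throughout this report):

kubectl get pods -n ibm-spectrum-scale-csi-driver -o wide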

~# kubectl describe pod -n ibm-spectrum-scale-csi-driver ibm-spectrum-scale-csi-attacher-85c444fc7b-6xmrd 
Name:                 ibm-spectrum-scale-csi-attacher-85c444fc7b-6xmrd
Namespace:            ibm-spectrum-scale-csi-driver
Priority:             2000001000
Priority Class Name:  system-node-critical
Service Account:      ibm-spectrum-scale-csi-attacher
Node:                 nm129-gpu-258/10.200.100.94
Start Time:           Thu, 05 Sep 2024 13:43:30 +0800
Labels:               app=ibm-spectrum-scale-csi-attacher
                      app.kubernetes.io/instance=ibm-spectrum-scale-csi-operator
                      app.kubernetes.io/managed-by=ibm-spectrum-scale-csi-operator
                      app.kubernetes.io/name=ibm-spectrum-scale-csi-operator
                      pod-template-hash=85c444fc7b
                      product=ibm-spectrum-scale-csi
                      release=ibm-spectrum-scale-csi-operator
Annotations:          cni.projectcalico.org/containerID: 421758009bcbc4ffef300c9551a66fc223e4e986f622f0a9e3230d8929f9c015
                      cni.projectcalico.org/podIP: 172.30.112.6/32
                      cni.projectcalico.org/podIPs: 172.30.112.6/32
                      productID: ibm-spectrum-scale-csi-operator
                      productName: IBM Spectrum Scale CSI Operator
                      productVersion: 2.11.0
Status:               Running
IP:                   172.30.112.6
IPs:
  IP:           172.30.112.6
Controlled By:  ReplicaSet/ibm-spectrum-scale-csi-attacher-85c444fc7b
Containers:
  ibm-spectrum-scale-csi-attacher:
    Container ID:  containerd://3a23afdc9c9b808bf1f209c38d9bae08987d6432c9f06e133a114d197c9b43fd
    Image:         registry.k8s.io/sig-storage/csi-attacher@sha256:d69cc72025f7c40dae112ff989e920a3331583497c8dfb1600c5ae0e37184a29
    Image ID:      registry.k8s.io/sig-storage/csi-attacher@sha256:d69cc72025f7c40dae112ff989e920a3331583497c8dfb1600c5ae0e37184a29
    Port:          8080/TCP
    Host Port:     0/TCP
    Args:
      --v=5
      --csi-address=$(ADDRESS)
      --resync=10m
      --timeout=2m
      --default-fstype=gpfs
      --leader-election=true
      --leader-election-lease-duration=$(LEADER_ELECTION_LEASE_DURATION)
      --leader-election-renew-deadline=$(LEADER_ELECTION_RENEW_DEADLINE)
      --leader-election-retry-period=$(LEADER_ELECTION_RETRY_PERIOD)
      --http-endpoint=:8080
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    2
      Started:      Thu, 05 Sep 2024 14:17:21 +0800
      Finished:     Thu, 05 Sep 2024 14:18:00 +0800
    Ready:          False
    Restart Count:  15
    Limits:
      cpu:                300m
      ephemeral-storage:  5Gi
      memory:             800Mi
    Requests:
      cpu:                20m
      ephemeral-storage:  1Gi
      memory:             20Mi
    Liveness:             http-get http://:http-endpoint/healthz/leader-election delay=10s timeout=10s period=20s #success=1 #failure=1
    Environment:
      ADDRESS:                         /var/lib/kubelet/plugins/spectrumscale.csi.ibm.com/csi.sock
      LEADER_ELECTION_LEASE_DURATION:  137s
      LEADER_ELECTION_RENEW_DEADLINE:  107s
      LEADER_ELECTION_RETRY_PERIOD:    26s
    Mounts:
      /var/lib/kubelet/plugins/spectrumscale.csi.ibm.com from socket-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-xwv8d (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  socket-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet/plugins/spectrumscale.csi.ibm.com
    HostPathType:  DirectoryOrCreate
  kube-api-access-xwv8d:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              scale=true
Tolerations:                 node-role.kubernetes.io/control-plane:NoSchedule op=Exists
                             node-role.kubernetes.io/infra:NoSchedule op=Exists
                             node-role.kubernetes.io/master:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                           Age                   From               Message
  ----     ------                           ----                  ----               -------
  Normal   Scheduled                        36m                   default-scheduler  Successfully assigned ibm-spectrum-scale-csi-driver/ibm-spectrum-scale-csi-attacher-85c444fc7b-6xmrd to nm129-gpu-258
  Normal   Started                          35m (x3 over 36m)     kubelet            Started container ibm-spectrum-scale-csi-attacher
  Normal   Pulled                           35m (x4 over 36m)     kubelet            Container image "registry.k8s.io/sig-storage/csi-attacher@sha256:d69cc72025f7c40dae112ff989e920a3331583497c8dfb1600c5ae0e37184a29" already present on machine
  Normal   Created                          35m (x4 over 36m)     kubelet            Created container ibm-spectrum-scale-csi-attacher
  Normal   Killing                          35m (x3 over 36m)     kubelet            Container ibm-spectrum-scale-csi-attacher failed liveness probe, will be restarted
  Warning  Unhealthy                        21m (x10 over 36m)    kubelet            Liveness probe failed: Get "http://172.30.112.6:8080/healthz/leader-election": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
  Warning  FailedToRetrieveImagePullSecret  109s (x158 over 37m)  kubelet            Unable to retrieve some image pull secrets (ibm-spectrum-scale-csi-registrykey, ibm-entitlement-key); attempting to pull the image may not succeed.
kubectl logs -f  -n ibm-spectrum-scale-csi-driver ibm-spectrum-scale-csi-attacher-85c444fc7b-6xmrd 
I0905 06:17:21.349535       1 main.go:97] Version: v4.5.0
I0905 06:17:21.440016       1 connection.go:215] Connecting to unix:///var/lib/kubelet/plugins/spectrumscale.csi.ibm.com/csi.sock
I0905 06:17:21.441726       1 common.go:138] Probing CSI driver for readiness
I0905 06:17:21.441761       1 connection.go:244] GRPC call: /csi.v1.Identity/Probe
I0905 06:17:21.441772       1 connection.go:245] GRPC request: {}
I0905 06:17:21.469509       1 connection.go:251] GRPC response: {"ready":{"value":true}}
I0905 06:17:21.531898       1 connection.go:252] GRPC error: <nil>
I0905 06:17:21.531969       1 connection.go:244] GRPC call: /csi.v1.Identity/GetPluginInfo
I0905 06:17:21.531982       1 connection.go:245] GRPC request: {}
I0905 06:17:21.532818       1 connection.go:251] GRPC response: {"name":"spectrumscale.csi.ibm.com","vendor_version":"2.11.0"}
I0905 06:17:21.532848       1 connection.go:252] GRPC error: <nil>
I0905 06:17:21.532872       1 main.go:154] CSI driver name: "spectrumscale.csi.ibm.com"
I0905 06:17:21.532917       1 connection.go:244] GRPC call: /csi.v1.Identity/GetPluginCapabilities
I0905 06:17:21.532935       1 connection.go:245] GRPC request: {}
I0905 06:17:21.533032       1 main.go:180] ServeMux listening at ":8080"
I0905 06:17:21.534347       1 connection.go:251] GRPC response: {"capabilities":[{"Type":{"Service":{"type":1}}}]}
I0905 06:17:21.534408       1 connection.go:252] GRPC error: <nil>
I0905 06:17:21.534429       1 connection.go:244] GRPC call: /csi.v1.Controller/ControllerGetCapabilities
I0905 06:17:21.534445       1 connection.go:245] GRPC request: {}
I0905 06:17:21.535447       1 connection.go:251] GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":5}}},{"Type":{"Rpc":{"type":9}}},{"Type":{"Rpc":{"type":7}}}]}
I0905 06:17:21.535475       1 connection.go:252] GRPC error: <nil>
I0905 06:17:21.535570       1 main.go:230] CSI driver supports ControllerPublishUnpublish, using real CSI handler
I0905 06:17:21.536337       1 leaderelection.go:250] attempting to acquire leader lease ibm-spectrum-scale-csi-driver/external-attacher-leader-spectrumscale-csi-ibm-com...
I0905 06:17:21.550446       1 leaderelection.go:354] lock is held by ibm-spectrum-scale-csi-attacher-744497cfff-zn5kx and has not yet expired
I0905 06:17:21.550490       1 leaderelection.go:255] failed to acquire lease ibm-spectrum-scale-csi-driver/external-attacher-leader-spectrumscale-csi-ibm-com
I0905 06:17:21.550534       1 leader_election.go:184] new leader detected, current leader: ibm-spectrum-scale-csi-attacher-744497cfff-zn5kx
I0905 06:17:52.986211       1 leaderelection.go:354] lock is held by ibm-spectrum-scale-csi-attacher-744497cfff-zn5kx and has not yet expired
I0905 06:17:52.986248       1 leaderelection.go:255] failed to acquire lease ibm-spectrum-scale-csi-driver/external-attacher-leader-spectrumscale-csi-ibm-com


Data Collection and Debugging

Environmental output

Tool to collect the CSI snap:

./tools/storage-scale-driver-snap.sh -n <csi driver namespace>


hemalathagajendran commented 2 months ago

@corrtia This is the default leader-election behaviour: acquiring the lease fails while another instance holds it and that lease has not yet expired. Note also that those are INFO-level log entries, not even warnings. Are you still facing any other issues because of this?
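
For anyone debugging similar messages, the current holder of the leader-election lock can be inspected directly. As a minimal sketch, assuming the attacher sidecar uses a coordination.k8s.io Lease object with the name shown in the log above:

kubectl get lease external-attacher-leader-spectrumscale-csi-ibm-com -n ibm-spectrum-scale-csi-driver -o yaml

The spec.holderIdentity field should match the pod named in the "lock is held by ..." log lines.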

corrtia commented 2 months ago

This issue was caused by a cluster CNI failure that kept the liveness probe failing: the CNI plugin I'm using can't access the IP address of the node itself. Sorry!
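
A quick way to confirm this kind of CNI failure is to run the probe request manually from the node hosting the pod, since the kubelet issues HTTP liveness probes from the node (pod IP, port, and path taken from the describe output above); if the CNI path is broken, the request times out just like the kubelet probe:

curl -m 10 http://172.30.112.6:8080/healthz/leader-election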