ofek / csi-gcs

Kubernetes CSI driver for Google Cloud Storage
https://ofek.dev/csi-gcs/
Apache License 2.0

Do we need a liveness probe for the "csi-node-driver-registrar" container? #176

Open fradeve opened 1 year ago

fradeve commented 1 year ago

Some of my pods are suddenly losing connection with a "socket not connected" error.

I started to look into the behaviour of the csi-gcs DaemonSet and I noticed this in the logs of csi-node-driver-registrar:

Lost connection to unix:///csi/csi.sock.

which seems to be related to the "socket" error that I mentioned above.

I went on to read the docs for node-driver-registrar and noticed the following in their README: https://github.com/kubernetes-csi/node-driver-registrar#health-check-with-an-exec-probe

  containers:
    - name: csi-driver-registrar
      image: k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.5.0
      args:
        - "--v=5"
        - "--csi-address=/csi/csi.sock"
        - "--kubelet-registration-path=/var/lib/kubelet/plugins/<drivername.example.com>/csi.sock"
      livenessProbe:
        exec:
          command:
          - /csi-node-driver-registrar
          - --kubelet-registration-path=/var/lib/kubelet/plugins/<drivername.example.com>/csi.sock
          - --mode=kubelet-registration-probe
        initialDelaySeconds: 30
        timeoutSeconds: 15

However, the csi-node-driver-registrar container in https://github.com/ofek/csi-gcs/blob/master/deploy/base/daemonset.yaml#L22 is defined without a liveness probe.
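For illustration, adapting the README example to the csi-gcs DaemonSet might look roughly like the sketch below. The driver name `gcs.csi.ofek.dev` in the registration path and the image tag are assumptions rather than values taken from the current manifest; the tag only needs to be a release that supports `--mode=kubelet-registration-probe`, as the README example's v2.5.0 does.

```yaml
# Sketch only, not the project's actual manifest.
# Assumptions: driver name "gcs.csi.ofek.dev" in the registration path,
# and an image tag that supports the kubelet-registration-probe mode.
containers:
  - name: csi-node-driver-registrar
    image: k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.5.0
    args:
      - "--v=5"
      - "--csi-address=/csi/csi.sock"
      - "--kubelet-registration-path=/var/lib/kubelet/plugins/gcs.csi.ofek.dev/csi.sock"
    livenessProbe:
      exec:
        command:
          # Runs the registrar binary in probe mode; a failing probe makes
          # the kubelet restart this container.
          - /csi-node-driver-registrar
          - --kubelet-registration-path=/var/lib/kubelet/plugins/gcs.csi.ofek.dev/csi.sock
          - --mode=kubelet-registration-probe
      initialDelaySeconds: 30
      timeoutSeconds: 15
```

The registration path passed to the probe command has to match the one in `args`, since the probe checks registration against that same socket path.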

So my questions boil down to:

  1. Is it worth adding a liveness probe to the csi-node-driver-registrar container?
  2. Is there a specific reason for using node-driver-registrar 1.2.0 when 2.8.0 is available?

As always, thanks for bringing csi-gcs to life! :pray:

fradeve commented 1 year ago

If a liveness probe is something the project would be interested in, I am happy to open a PR! :handshake: