openebs / mayastor

Dynamically provision Stateful Persistent Replicated Cluster-wide Fabric Volumes & Filesystems for Kubernetes that is provisioned from an optimized NVME SPDK backend data storage stack.
Apache License 2.0

ERROR csi_node::filesystem_vol: Failed to publish volume .. volume is staged as "ro" but publish requires "rw" at control-plane/csi-driver/src/bin/node/filesystem_vol.rs:281 #1657

Closed todeb closed 5 months ago

todeb commented 6 months ago

Describe the bug: I got an error from the CSI node plugin. I'm not sure whether this is a bug or why it happened. Restarting the CSI pod resolved the issue.

To Reproduce: unknown; I don't know what triggered it.

OS info (please complete the following information): openebs.io/version: 2.6.1

last logs from csi node:

  2024-05-10T12:42:05.530483Z  INFO csi_node::filesystem_vol: Volume 6898e456-087c-481a-9b47-f4424dba7499 unpublished from /var/lib/kubelet/pods/295c9876-fbf8-4b4b-8929-bfc28199d331/volumes/kubernetes.io~csi/pvc-6898e456-087c-481a-9b47-f4424dba7499/mount
    at control-plane/csi-driver/src/bin/node/filesystem_vol.rs:424

  2024-05-10T12:42:11.235052Z  INFO csi_node::node: Volume 6898e456-087c-481a-9b47-f4424dba7499 unstaged
    at control-plane/csi-driver/src/bin/node/node.rs:839

  2024-05-10T12:42:19.995951Z  INFO csi_node::filesystem_vol: Volume 6898e456-087c-481a-9b47-f4424dba7499 staged to /var/lib/kubelet/plugins/kubernetes.io/csi/io.openebs.csi-mayastor/04a21e97abfcb7c361ea631ac9aa36aea6ee00873f0ab1abdead9bc52fb8a9ae/globalmount
    at control-plane/csi-driver/src/bin/node/filesystem_vol.rs:176

  2024-05-10T12:42:20.083260Z  INFO csi_node::filesystem_vol: Volume 6898e456-087c-481a-9b47-f4424dba7499 published to /var/lib/kubelet/pods/4a8940aa-4608-4ee4-8d65-1c3b9cc50f8b/volumes/kubernetes.io~csi/pvc-6898e456-087c-481a-9b47-f4424dba7499/mount
    at control-plane/csi-driver/src/bin/node/filesystem_vol.rs:365

  2024-05-13T16:54:50.999500Z ERROR csi_node::registration: Failed to register app node: ServerCommunication("error in request: error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution")
    at control-plane/csi-driver/src/bin/node/registration.rs:46

  2024-05-13T18:16:26.277051Z  INFO csi_node::registration: Successfully re-registered the app node
    at control-plane/csi-driver/src/bin/node/registration.rs:39

  2024-05-14T14:08:00.489859Z ERROR csi_node::filesystem_vol: Failed to publish volume 5e54965b-2fd6-4f7f-bea1-ecdd927225c9: volume is staged as "ro" but publish requires "rw"
    at control-plane/csi-driver/src/bin/node/filesystem_vol.rs:281

  2024-05-14T14:08:00.767388Z  INFO csi_node::filesystem_vol: Volume 5e54965b-2fd6-4f7f-bea1-ecdd927225c9 unpublished from /var/lib/kubelet/pods/ae5e2ff4-e323-4102-80d4-c49022bc7b8e/volumes/kubernetes.io~csi/pvc-5e54965b-2fd6-4f7f-bea1-ecdd927225c9/mount
    at control-plane/csi-driver/src/bin/node/filesystem_vol.rs:424

  2024-05-14T14:08:00.899865Z ERROR csi_node::filesystem_vol: Failed to publish volume 5e54965b-2fd6-4f7f-bea1-ecdd927225c9: volume is staged as "ro" but publish requires "rw"
    at control-plane/csi-driver/src/bin/node/filesystem_vol.rs:281

  2024-05-14T14:08:01.507817Z ERROR csi_node::filesystem_vol: Failed to publish volume 5e54965b-2fd6-4f7f-bea1-ecdd927225c9: volume is staged as "ro" but publish requires "rw"
    at control-plane/csi-driver/src/bin/node/filesystem_vol.rs:281

  2024-05-14T14:08:02.623529Z ERROR csi_node::filesystem_vol: Failed to publish volume 5e54965b-2fd6-4f7f-bea1-ecdd927225c9: volume is staged as "ro" but publish requires "rw"
    at control-plane/csi-driver/src/bin/node/filesystem_vol.rs:281

  2024-05-14T14:08:04.690063Z ERROR csi_node::filesystem_vol: Failed to publish volume 5e54965b-2fd6-4f7f-bea1-ecdd927225c9: volume is staged as "ro" but publish requires "rw"
    at control-plane/csi-driver/src/bin/node/filesystem_vol.rs:281

  2024-05-14T14:08:08.727179Z ERROR csi_node::filesystem_vol: Failed to publish volume 5e54965b-2fd6-4f7f-bea1-ecdd927225c9: volume is staged as "ro" but publish requires "rw"
    at control-plane/csi-driver/src/bin/node/filesystem_vol.rs:281

  2024-05-14T14:08:16.847831Z ERROR csi_node::filesystem_vol: Failed to publish volume 5e54965b-2fd6-4f7f-bea1-ecdd927225c9: volume is staged as "ro" but publish requires "rw"
    at control-plane/csi-driver/src/bin/node/filesystem_vol.rs:281

  2024-05-14T14:08:32.958730Z ERROR csi_node::filesystem_vol: Failed to publish volume 5e54965b-2fd6-4f7f-bea1-ecdd927225c9: volume is staged as "ro" but publish requires "rw"
    at control-plane/csi-driver/src/bin/node/filesystem_vol.rs:281
tiagolobocastro commented 5 months ago

I wonder if something happened to the volume and the filesystem decided to remount itself as ro for safety. In that case you'd have to remount the volume by recreating the application, IIRC.
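One way to check that theory on the affected node is to look at the mount options recorded in /proc/mounts. The following is a minimal sketch, not part of Mayastor: the function name, the sample device names, and the shortened staging paths are made up for illustration, and it only assumes the standard /proc/mounts layout (device, mount point, filesystem type, comma-separated options).

```python
def read_only_mounts(mounts_text, needle):
    """Return mount points containing `needle` whose options include 'ro'.

    `mounts_text` is expected to be in /proc/mounts format:
    device, mount point, fs type, comma-separated options, dump, pass.
    """
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point = fields[1]
        options = fields[3].split(",")
        if needle in mount_point and "ro" in options:
            hits.append(mount_point)
    return hits


# Hypothetical sample data (not taken from the reporter's node).
SAMPLE = (
    "/dev/nvme0n1 /var/lib/kubelet/plugins/kubernetes.io/csi/"
    "io.openebs.csi-mayastor/aaa/globalmount ext4 ro,relatime 0 0\n"
    "/dev/nvme1n1 /var/lib/kubelet/plugins/kubernetes.io/csi/"
    "io.openebs.csi-mayastor/bbb/globalmount ext4 rw,relatime 0 0\n"
)

for mount_point in read_only_mounts(SAMPLE, "io.openebs.csi-mayastor"):
    print(mount_point)
```

On a real node you would pass the contents of /proc/mounts instead of SAMPLE. A staging path that shows up here with ro would be consistent with the "staged as ro" error, though it does not by itself say why the remount happened.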

To help diagnose what went wrong, the output of dmesg -T from the node would be useful.
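If the dmesg output is long, a small filter can narrow it down to lines that hint at a read-only remount. This is only a sketch: the search patterns are assumptions about what the kernel typically logs (for example the ext4 "Remounting filesystem read-only" message), other filesystems word this differently, and the sample lines below are invented rather than taken from the affected node.

```python
import re

# Assumed patterns; extend them for the filesystem in use.
SUSPECT = re.compile(r"remounting filesystem read-only|i/o error", re.IGNORECASE)


def suspicious_lines(dmesg_text):
    """Return dmesg lines that match one of the assumed patterns."""
    return [line for line in dmesg_text.splitlines() if SUSPECT.search(line)]


# Hypothetical sample in `dmesg -T` style (not real output from the issue).
SAMPLE = (
    "[Tue May 14 14:07:55 2024] nvme nvme0: creating 4 I/O queues.\n"
    "[Tue May 14 14:07:58 2024] EXT4-fs (nvme0n1): Remounting filesystem read-only\n"
    "[Tue May 14 14:08:01 2024] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready\n"
)

for line in suspicious_lines(SAMPLE):
    print(line)
```

To use it on a node, save the output of dmesg -T to a file and pass the file contents to suspicious_lines.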

todeb commented 5 months ago

tbh it will be pretty hard to gather logs right now. I don't even remember which node this was. But if the issue reoccurs, I will post the result.

tiagolobocastro commented 5 months ago

Alright, let's close for now and reopen if there's a repro.