openebs / mayastor

Dynamically provision Stateful Persistent Replicated Cluster-wide Fabric Volumes & Filesystems for Kubernetes, provisioned from an optimized NVMe SPDK backend data storage stack.
Apache License 2.0

Not able to mount disk to the pod after attachment #1174

Closed: mshantanu closed this issue 7 months ago

mshantanu commented 2 years ago

Describe the bug
After installing Mayastor on k8s 1.24 (on dedicated servers in OVH cloud) we are able to create disk pools and a storage class. The PVC is created and attached to the pod, however it cannot be mounted; the container stays in ContainerCreating and throws the error below:

Warning FailedMount 15s kubelet Unable to attach or mount volumes: unmounted volumes=[ms-volume], unattached volumes=[ms-volume kube-api-access-qq6j5]: timed out waiting for the condition
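
For anyone triaging a pod stuck in ContainerCreating like this, a minimal set of checks, assuming a default install in the mayastor namespace (the pod and namespace names are placeholders to adjust):

# Control-plane view: is the volume bound and attached?
kubectl get pvc
kubectl get volumeattachments
kubectl describe pod <app-pod>                      # look at the FailedMount events

# Node-plugin view: is the CSI DaemonSet pod healthy on that node?
kubectl -n mayastor get pods -o wide | grep csi
kubectl -n mayastor logs <mayastor-csi-pod> --all-containers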

To Reproduce
Steps to reproduce the behavior:

Expected behavior
The pod should run and the volume should be mounted.

OS info (please complete the following information):

Additional context
I see this in the kubelet logs:

[core] grpc: addrConn.createTransport failed to connect to {/var/lib/kubelet/plugins/mayastor.openebs.io/csi.sock /var/lib/kubelet/plugins/mayastor.openebs.io/csi.sock 0 }: didn't receive server preface in time. Reconnecting...

The netstat command shows Mayastor is trying to connect to a socket here:

unix  2  [ ]  STREAM  CONNECTING  0  -  /csi/csi.sock
unix  2  [ ]  STREAM  CONNECTING  0  -  /csi/csi.sock
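
For anyone re-checking this on a node, a quick way to list the half-open CSI unix sockets together with the owning process (ss from iproute2, or netstat from net-tools):

ss -xp | grep csi.sock          # unix sockets plus owning PID/program
netstat -xp | grep csi.sock     # net-tools equivalent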

The pod logs of mayastor-csi say:

0811 10:18:19.239968 1 connection.go:172] Still connecting to unix:///csi/csi.sock
W0811 10:18:29.239956 1 connection.go:172] Still connecting to unix:///csi/csi.sock
E0811 10:18:30.240699 1 connection.go:131] Lost connection to unix:///csi/csi.sock.
I0811 10:18:30.243109 1 node_register.go:52] Starting Registration Server at: /registration/io.openebs.csi-mayastor-reg.sock
I0811 10:18:30.245184 1 node_register.go:61] Registration Server started at: /registration/io.openebs.csi-mayastor-reg.sock
I0811 10:18:30.245413 1 node_register.go:83] Skipping healthz server because HTTP endpoint is set to: ""
I0811 10:18:30.977713 1 main.go:80] Received GetInfo call: &InfoRequest{}
I0811 10:18:31.013769 1 main.go:90] Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:true,Error:,}
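
The registrar output above suggests the node plugin's gRPC socket disappeared and came back. A small sketch for checking both sockets on the affected node; the socket paths come from the logs above, while the host-side registry path and the container names are assumptions based on a typical Mayastor CSI DaemonSet:

# On the node: both sockets should exist and belong to running processes
ls -l /var/lib/kubelet/plugins/mayastor.openebs.io/csi.sock
ls -l /var/lib/kubelet/plugins_registry/io.openebs.csi-mayastor-reg.sock

# Compare registrar and node-plugin logs (container names are assumptions)
kubectl -n mayastor logs <mayastor-csi-pod> -c csi-driver-registrar
kubectl -n mayastor logs <mayastor-csi-pod> -c mayastor-csi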

evilezh commented 2 years ago

Hello, I have quite a similar issue, and I went a bit further. I also installed the latest develop branch in the hope that things would be different, but got the same result.

I was able to narrow down the issue. Once the volume (bdev) was created, I was able to mount it (and use it) with the latest nvme-cli (2.1.2; I did not test older versions). I mounted it on a server, formatted it, and put some files on it. Therefore I think we can exclude a serving-side issue. It also means the servers are, in general, capable of working with NVMe-oF over TCP.
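
For anyone repeating this check, a rough sketch of attaching a published Mayastor volume by hand with nvme-cli over TCP; the port (8420) and the NQN are assumptions, so take the real values from the volume's target information:

# Discover and connect from another host
nvme discover -t tcp -a <node-ip> -s 8420
nvme connect  -t tcp -a <node-ip> -s 8420 -n <subsystem-nqn>

# Format, mount and write a test file (destructive: scratch volumes only)
lsblk | grep nvme
mkfs.ext4 /dev/nvmeXn1
mount /dev/nvmeXn1 /mnt && touch /mnt/testfile

# Clean up
umount /mnt && nvme disconnect -n <subsystem-nqn>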

Now the client side. At the moment when the disk should be mounted on the machine running the container, we see this in dmesg:

[19233.677582] nvme nvme10: queue 0: timeout request 0x0 type 4
[19233.683856] nvme nvme10: Connect command failed, error wo/DNR bit: 881
[19233.690035] nvme nvme10: failed to connect queue: 0 ret=881

Also, I can see the char device /dev/nvme10 is created, but no block device /dev/nvme10n1.
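
A small sketch for inspecting and cleaning up such a half-connected controller, using the device name from the dmesg output above (flags per recent nvme-cli versions):

ls -l /dev/nvme10 /dev/nvme10n1     # char device present, block device missing
nvme list-subsys                    # subsystem/controller state as seen by the host
nvme disconnect -d /dev/nvme10      # drop the stuck controller so a retry starts clean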

Another observation: when I restart mayastor-csi, its pods do not terminate on nodes which were just involved in creating nexuses (attaching). I have to terminate them forcibly.

Also, something like this comes up in the logs (I will check later what triggers it):

[17038.402901] INFO: task tokio-runtime-w:66582 blocked for more than 120 seconds.
[17038.409087]       Not tainted 5.10.0-16-amd64 #1 Debian 5.10.127-2
[17038.415236] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[17038.421474] task:tokio-runtime-w state:D stack:    0 pid:66582 ppid: 66514 flags:0x00004000
[17038.427862] Call Trace:
[17038.434094]  schedule+0x282/0x870
[17038.440242]  schedule+0x46/0xb0
[17038.446254]  schedule_preempt_disabled+0xa/0x10
[17038.452243]  mutex_lock.constprop.0+0x133/0x460
[17038.458231]  ? _copy_from_user+0x28/0x60
[17038.464121]  nvmf_dev_write+0x6c/0xba7 [nvme_fabrics]
[17038.470011]  ? kernfs_fop_release+0x52/0xa0
[17038.475875]  ? apparmor_file_permission+0x69/0x150
[17038.481679]  vfs_write+0xc0/0x260
[17038.487384]  ksys_write+0x5f/0xe0
[17038.493055]  do_syscall_64+0x33/0x80
[17038.498622]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[17038.504195] RIP: 0033:0x7f04ba7a6867
[17038.509748] RSP: 002b:00007f04ba6a1310 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
[17038.515401] RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007f04ba7a6867
[17038.521058] RDX: 0000000000000117 RSI: 00007f04b4036cf0 RDI: 000000000000000c
[17038.526747] RBP: 00007f04b4036cf0 R08: 0000000000000000 R09: 0000000000000002
[17038.532337] R10: 00007f04ba6a1178 R11: 0000000000000293 R12: 0000000000000117
[17038.537832] R13: 0000000000000117 R14: 00007f04b4036cf0 R15: 00007f04ba6a13a4

Network: bonded 802.3ad + VLAN, Mellanox 5.

uname -a
Linux k8s-node-1 5.10.0-16-amd64 #1 SMP Debian 5.10.127-2 (2022-07-23) x86_64 GNU/Linux

lsmod | grep nvme
nvmet_tcp              28672  0
nvmet                 118784  1 nvmet_tcp
nvme_tcp               40960  1
nvme_fabrics           32768  2 nvme_tcp
configfs               57344  2 nvmet
nvme                   49152  9
nvme_core             126976  13 nvme_tcp,nvme,nvme_fabrics
t10_pi                 16384  3 nvmet,sd_mod,nvme_core

lsmod | grep mlx
mlx5_ib               380928  0
ib_uverbs             167936  1 mlx5_ib
ib_core               409600  2 ib_uverbs,mlx5_ib
mlx5_core            1167360  1 mlx5_ib
mlxfw                  32768  1 mlx5_core
ptp                    32768  1 mlx5_core
pci_hyperv_intf        16384  1 mlx5_core
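
The lsmod output shows the initiator modules are already loaded here; for nodes where they are not, a minimal sketch for loading nvme_tcp and keeping it across reboots (path assumed for a systemd-based distro such as this Debian node):

modprobe nvme_tcp                                  # load the NVMe-oF TCP initiator now
echo nvme_tcp > /etc/modules-load.d/nvme_tcp.conf  # load it automatically at boot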

evilezh commented 2 years ago

In my case, I replaced Cilium with Calico and everything works like a Swiss clock now.

So it is something related to Cilium; I can't say what.
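
In case it helps anyone narrow down the Cilium side, a hedged way to dump the datapath settings the agent is running with, assuming the default DaemonSet and ConfigMap names in kube-system:

# kube-proxy replacement, socket LB, host routing, masquerading, etc.
kubectl -n kube-system exec ds/cilium -- cilium status --verbose

# Full agent configuration as deployed
kubectl -n kube-system get configmap cilium-config -o yaml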

dsharma-dc commented 1 year ago

@kumar-shantanu I understand this was reported quite some time back. Did the above change resolve things for you as well?

dhess commented 8 months ago

I'm seeing something similar with Mayastor 2.5 on Talos 1.6.1. Container events look like this:

Warning  FailedMount  2m22s (x4 over 9m28s)  kubelet            MountVolume.MountDevice failed for volume "pvc-021117b4-adb2-49ac-8ae9-e2495f838049" : rpc error: code = Unavailable desc = connection error: desc = "error reading server preface: read unix @->/csi/csi.sock: use of closed network connection"

and kernel logs look like this:

10.0.8.22: kern:    info: [2024-01-05T03:32:37.544223409Z]: nvme nvme6: creating 2 I/O queues.
 SUBSYSTEM=nvme
 DEVICE=c246:6
10.0.8.22: kern:    info: [2024-01-05T03:32:38.058431409Z]: nvme nvme6: mapped 2/0/0 default/read/poll queues.
 SUBSYSTEM=nvme
 DEVICE=c246:6
10.0.8.22: kern:     err: [2024-01-05T03:32:38.744364409Z]: nvme nvme6: Connect Invalid Data Parameter, cntlid: 1
 SUBSYSTEM=nvme
 DEVICE=c246:6
10.0.8.22: kern:     err: [2024-01-05T03:32:39.543954409Z]: nvme nvme6: failed to connect queue: 1 ret=386
 SUBSYSTEM=nvme
 DEVICE=c246:6

I'm also running Cilium, for the record.
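
A few hedged checks for the Talos case, run against the node from the log above (10.0.8.22); the mayastor namespace and the csi-node naming are assumptions based on a default Mayastor 2.x Helm install:

# NVMe-oF TCP initiator and hugepages on the node
talosctl -n 10.0.8.22 read /proc/modules | grep nvme   # nvme_tcp may be built into the Talos kernel
talosctl -n 10.0.8.22 read /proc/sys/vm/nr_hugepages

# Kernel messages around the failed connect
talosctl -n 10.0.8.22 dmesg | grep nvme

# CSI node-plugin logs for that node
kubectl -n mayastor get pods -o wide | grep csi-node
kubectl -n mayastor logs <csi-node-pod> --all-containers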

tiagolobocastro commented 7 months ago

Were you able to make any progress @dhess?

dhess commented 7 months ago

@tiagolobocastro I haven’t had a chance to circle back to this but will try again soon.

tiagolobocastro commented 7 months ago

Thanks @dhess, let us know if you hit this again and we'll try to debug it. Meanwhile I'll close this one, as it seems the original issue was related to the CSI socket rather than the NVMe connection. Please feel free to reopen or create a new ticket. Thank you.