minio / directpv

Kubernetes CSI driver for Direct Attached Storage :minidisc:
https://directpv.io
GNU Affero General Public License v3.0

CrashLoopBackOff on init #919

Closed: han-so1omon closed this 1 month ago

han-so1omon commented 1 month ago

I’ve got a demo cluster I’m setting up to run MinIO. I’m using DirectPV, but it fails to come up on the drive nodes: two nodes cannot run the node-server pod, and specifically the node-controller container keeps crashing. All I can find in the logs is this CrashLoopBackOff status. Any ideas?

After: kubectl directpv init ~/.kube/k8s-homelab-directpv-drives.yaml --dangerous --timeout 10m0s

    Container ID:  containerd://78aee7fd06cfab0a23aa063e6c882f00ef7297aae0625447597f5a52b853f301
    Image:         quay.io/minio/directpv:v4.0.11
    Image ID:      quay.io/minio/directpv@sha256:81dcefa7b1f9a227e01fbf167df34657e7fdf72d11acf283912d44d6ea8f030c
    Port:          <none>
    Host Port:     <none>
    Args:
      node-controller
      -v=3
      --kube-node-name=$(KUBE_NODE_NAME)
    State:       Waiting
      Reason:    CrashLoopBackOff
    Last State:  Terminated
      Reason:    Error
      Message:   I0723 22:37:53.661892  255975 reflector.go:289] Starting reflector *v1beta1.DirectPVNode (5m0s) from k8s.io/client-go@v0.28.11/tools/cache/reflector.go:229
I0723 22:37:53.662130  255975 reflector.go:325] Listing and watching *v1beta1.DirectPVNode from k8s.io/client-go@v0.28.11/tools/cache/reflector.go:229
I0723 22:37:53.762104  255975 controller.go:141] node controller synced and ready
E0723 22:37:54.247348  255975 event.go:310] "unable to create initrequest event handler" err="unable to mount; invalid argument"
E0723 22:37:54.247531  255975 main.go:147] "unable to execute command" err="initrequest controller stopped"

      Exit Code:    1
      Started:      Tue, 23 Jul 2024 15:37:53 -0700
      Finished:     Tue, 23 Jul 2024 15:37:54 -0700

balamurugana commented 1 month ago

When the initrequest controller starts, it checks whether XFS on the system has reflink support by using a loopback device backed by a temporary 16 MiB file. Either you do not have XFS support, or XFS on your system does not support the standard arguments.

Please check that and share your OS details.
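
The check described above can be approximated by hand on an affected node. The following is a minimal sketch, not DirectPV's actual code: the helper name `xfs_probe_cmds`, the file paths, and the exact `mkfs.xfs` and `mount` options are assumptions chosen for illustration. The function only prints the commands so they can be reviewed first.

```shell
# Sketch of a manual loopback probe. Assumptions: the paths, the 16M size,
# and the mkfs/mount options are illustrative, not DirectPV's exact calls.
xfs_probe_cmds() {
    img="${1:-/tmp/xfs-probe.img}"   # backing file for the loop device
    mnt="${2:-/tmp/xfs-probe.mnt}"   # temporary mount point
    cat <<EOF
truncate -s 16M $img
mkfs.xfs -q -f -m reflink=1 $img
mkdir -p $mnt
mount -o loop,noatime,prjquota $img $mnt
umount $mnt
rm -f $img
rmdir $mnt
EOF
}

# Print the commands for review:   xfs_probe_cmds
# Run them as root on the node:    xfs_probe_cmds | sh -ex
```

If the `mount` step fails with "invalid argument", `dmesg | tail` on the node typically shows the kernel's stated reason. Some newer xfsprogs releases may reject an image this small; in that case a larger size can be substituted for the test.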

han-so1omon commented 1 month ago

Here are the system details:

Host OS: Ubuntu 22.04
VM Manager: LXC/LXD (using VMs, not containers)
Block devices: LVM partitions on host, mounted in VM as /sd{b,c,d,e}
Storage device filesystems: ZFS

By default, the LXD UI allows for ZFS and Btrfs filesystems. I guess I should attach the blocks unformatted (or maybe with XFS?)

balamurugana commented 1 month ago

@han-so1omon Yep

han-so1omon commented 1 month ago

@balamurugana Is there a way to get more info on the failure? I've checked that I can create XFS filesystems on the mounted block devices in the VMs:

root@ubuntu-k3s-homelab-4:~# uname -a
Linux ubuntu-k3s-homelab-4 5.15.0-1062-kvm #67-Ubuntu SMP Wed Jun 19 13:44:51 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
# Create test xfs filesystem, showing reflink supported I believe
root@ubuntu-k3s-homelab-4:~# mkfs.xfs /dev/sde
meta-data=/dev/sde               isize=512    agcount=4, agsize=2752512 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=0 inobtcount=0
data     =                       bsize=4096   blocks=11010048, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=5376, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
Discarding blocks...Done.
root@ubuntu-k3s-homelab-4:~# mount /dev/sde /mnt/test
root@ubuntu-k3s-homelab-4:~# df -hT /mnt/test/
Filesystem     Type  Size  Used Avail Use% Mounted on
/dev/sde       xfs    42G  332M   42G   1% /mnt/test

Still same results

➜  ~ kubectl directpv discover

 Discovered node 'ubuntu-k3s-homelab-3' ✔
 Discovered node 'ubuntu-k3s-homelab-4' ✔
 Discovered node 'ubuntu-k3s-homelab-5' ✔
 Discovered node 'ubuntu-k3s-homelab-kubernetes-2' ✔

┌─────────────────────┬──────────────────────┬───────┬─────────┬────────────┬────────────────────┬───────────┬─────────────┐
│ ID                  │ NODE                 │ DRIVE │ SIZE    │ FILESYSTEM │ MAKE               │ AVAILABLE │ DESCRIPTION │
├─────────────────────┼──────────────────────┼───────┼─────────┼────────────┼────────────────────┼───────────┼─────────────┤
│ 8:0$lEwxKEUO56Nv... │ ubuntu-k3s-homelab-4 │ sda   │ 42 GiB  │ -          │ QEMU QEMU_HARDDISK │ YES       │ -           │
│ 8:32$guHrIB+MilW... │ ubuntu-k3s-homelab-4 │ sdc   │ 42 GiB  │ -          │ QEMU QEMU_HARDDISK │ YES       │ -           │
│ 8:48$6v3flv4STNc... │ ubuntu-k3s-homelab-4 │ sdd   │ 42 GiB  │ -          │ QEMU QEMU_HARDDISK │ YES       │ -           │
│ 8:64$2M83bdXhXgW... │ ubuntu-k3s-homelab-4 │ sde   │ 42 GiB  │ -          │ QEMU QEMU_HARDDISK │ YES       │ -           │
│ 8:16$xIJClQ3ZRhX... │ ubuntu-k3s-homelab-5 │ sdb   │ 190 GiB │ -          │ QEMU QEMU_HARDDISK │ YES       │ -           │
│ 8:32$CYsmluXJnIB... │ ubuntu-k3s-homelab-5 │ sdc   │ 190 GiB │ -          │ QEMU QEMU_HARDDISK │ YES       │ -           │
│ 8:48$dmx9SRkh27g... │ ubuntu-k3s-homelab-5 │ sdd   │ 190 GiB │ -          │ QEMU QEMU_HARDDISK │ YES       │ -           │
│ 8:64$qbOcPZbdJTY... │ ubuntu-k3s-homelab-5 │ sde   │ 190 GiB │ -          │ QEMU QEMU_HARDDISK │ YES       │ -           │
└─────────────────────┴──────────────────────┴───────┴─────────┴────────────┴────────────────────┴───────────┴─────────────┘

➜  ~ kubectl directpv init drives.yaml --dangerous

 █████████████████████████████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░  50%

 Processing initialization request '8045f27a-e55b-4f64-8a64-b25b6dd8efa5' for node 'ubuntu-k3s-homelab-5' ∙∙∙
 Processing initialization request 'e91bb963-9d67-4d7f-b05e-6d16e8765076' for node 'ubuntu-k3s-homelab-4' ∙∙∙

 Error; unable to initialize devices; context deadline exceeded 

➜  ~ kubectl logs -f node-server-vv5f9 -n directpv -c node-controller
I0725 19:01:24.975211    7952 reflector.go:289] Starting reflector *v1beta1.DirectPVNode (5m0s) from k8s.io/client-go@v0.28.11/tools/cache/reflector.go:229
I0725 19:01:24.975447    7952 reflector.go:325] Listing and watching *v1beta1.DirectPVNode from k8s.io/client-go@v0.28.11/tools/cache/reflector.go:229
I0725 19:01:25.075027    7952 controller.go:141] node controller synced and ready
E0725 19:01:25.372691    7952 event.go:310] "unable to create initrequest event handler" err="unable to mount; invalid argument"
E0725 19:01:25.372940    7952 main.go:147] "unable to execute command" err="initrequest controller stopped"

balamurugana commented 1 month ago

The initrequest controller failed to start due to the mount error "unable to mount; invalid argument". You should check whether XFS on your system supports the noatime and prjquota mount flags.
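
One way to check this without mounting anything is to inspect the running kernel's build configuration. The sketch below assumes that prjquota depends on the kernel option `CONFIG_XFS_QUOTA` and that the config file lives at `/boot/config-$(uname -r)`, the usual Ubuntu location; `check_xfs_config` is a hypothetical helper name. Since noatime is a generic mount option rather than an XFS-specific one, prjquota is the more likely of the two to be missing.

```shell
# Sketch: report whether a kernel config enables XFS and XFS quota support.
# Assumption: prjquota needs CONFIG_XFS_QUOTA; the default config path is
# the usual Ubuntu location and may differ on other distributions.
check_xfs_config() {
    cfg="${1:-/boot/config-$(uname -r)}"
    [ -r "$cfg" ] || { echo "cannot read $cfg" >&2; return 2; }
    rc=0
    for opt in CONFIG_XFS_FS CONFIG_XFS_QUOTA; do
        # Built-in (=y) or module (=m) both count as enabled.
        if grep -Eq "^${opt}=(y|m)$" "$cfg"; then
            echo "$opt enabled"
        else
            echo "$opt missing"
            rc=1
        fi
    done
    return $rc
}
```

Running `check_xfs_config` on each node prints one line per option and returns non-zero if either is missing, so it can be used in a quick loop over the nodes before retrying `kubectl directpv init`.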

han-so1omon commented 1 month ago

Got it. I had to recompile the kernel to enable those features. The default KVM and cloud-optimized kernels for Ubuntu 22.04 under LXC did not have them enabled. Thank you!