Jean-Baptiste-Lasselle opened this issue 4 months ago
I changed cert-manager to enabled=true, and now I get a new error (topolvm complains it cannot find the lvm binary at /sbin/lvm, so maybe I will map that path into the kind cluster):
vagrant@debian12:~$ kubectl -n topolvm-system logs pod/topolvm-lvmd-0-nmd7q
{"level":"info","ts":"2024-06-23T22:28:04Z","msg":"configuration file loaded","device_classes":[{"name":"hdd","volume-group":"vg-decoderleco","default":false,"spare-gb":null,"stripe":null,"stripe-size":"","lvcreate-options":null,"type":"","thin-pool":null}],"socket_name":"/run/topolvm/lvmd.sock","file_name":"/etc/topolvm/lvmd.yaml"}
{"level":"info","ts":"2024-06-23T22:28:04Z","msg":"invoking command","args":["/usr/bin/nsenter","-m","-u","-i","-n","-p","-t","1","/sbin/lvm","fullreport","--reportformat","json","--units","b","--nosuffix","--configreport","vg","-o","vg_name,vg_uuid,vg_size,vg_free","--configreport","lv","-o","lv_uuid,lv_name,lv_full_name,lv_path,lv_size,lv_kernel_major,lv_kernel_minor,origin,origin_size,pool_lv,lv_tags,lv_attr,vg_name,data_percent,metadata_percent,pool_lv","--configreport","pv","-o,","--configreport","pvseg","-o,","--configreport","seg","-o,"]}
{"level":"error","ts":"2024-06-23T22:28:04Z","msg":"failed to run command","error":"exit status 127: nsenter: failed to execute /sbin/lvm: No such file or directory","stacktrace":"github.com/topolvm/topolvm/internal/lvmd/command.getLVMState.func1\n\t/workdir/internal/lvmd/command/lvm_state_json.go:53\ngithub.com/topolvm/topolvm/internal/lvmd/command.getLVMState\n\t/workdir/internal/lvmd/command/lvm_state_json.go:59\ngithub.com/topolvm/topolvm/internal/lvmd/command.ListVolumeGroups\n\t/workdir/internal/lvmd/command/lvm.go:111\ngithub.com/topolvm/topolvm/cmd/lvmd/app.subMain\n\t/workdir/cmd/lvmd/app/root.go:70\ngithub.com/topolvm/topolvm/cmd/lvmd/app.init.func2\n\t/workdir/cmd/lvmd/app/root.go:51\ngithub.com/spf13/cobra.(*Command).execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:983\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1115\ngithub.com/spf13/cobra.(*Command).Execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1039\ngithub.com/topolvm/topolvm/cmd/lvmd/app.Execute\n\t/workdir/cmd/lvmd/app/root.go:133\nmain.main\n\t/workdir/cmd/hypertopolvm/main.go:38\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:271"}
{"level":"error","ts":"2024-06-23T22:28:04Z","msg":"error while retrieving volume groups","error":"EOF","stacktrace":"github.com/topolvm/topolvm/cmd/lvmd/app.subMain\n\t/workdir/cmd/lvmd/app/root.go:72\ngithub.com/topolvm/topolvm/cmd/lvmd/app.init.func2\n\t/workdir/cmd/lvmd/app/root.go:51\ngithub.com/spf13/cobra.(*Command).execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:983\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1115\ngithub.com/spf13/cobra.(*Command).Execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1039\ngithub.com/topolvm/topolvm/cmd/lvmd/app.Execute\n\t/workdir/cmd/lvmd/app/root.go:133\nmain.main\n\t/workdir/cmd/hypertopolvm/main.go:38\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:271"}
Error: EOF
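
For reference, the idea floated above (mapping the lvm binary into the kind cluster) would mean adding an extra mount to the kind node definition, so that /sbin/lvm exists inside the node container that the nsenter call in the logs targets. A minimal sketch, assuming the host has lvm2 installed; the hostPath and the single-node layout are illustrative rather than taken from the repo, and the binary may also need its shared libraries, so installing lvm2 in a custom node image is another option:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    extraMounts:
      # illustrative: expose the host's LVM tooling inside the kind node container
      - hostPath: /sbin/lvm
        containerPath: /sbin/lvm
        readOnly: true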
OK, so two things: first, the kind cluster YAML config has some special volume mounts; and second, the kind example Makefile performs a specific host-side setup of lvmd as a systemd unit, which involves a unix socket that I found mentioned in some error logs, look:
vagrant@debian12:~$ kubectl -n topolvm-system logs pod/topolvm-node-tx52m -c topolvm-node
{"level":"info","ts":"2024-06-23T22:38:19Z","logger":"setup","msg":"starting manager"}
{"level":"info","ts":"2024-06-23T22:38:19Z","logger":"controller-runtime.metrics","msg":"Starting metrics server"}
{"level":"info","ts":"2024-06-23T22:38:19Z","logger":"controller-runtime.metrics","msg":"Serving metrics server","bindAddress":":8080","secure":false}
{"level":"info","ts":"2024-06-23T22:38:19Z","msg":"Starting EventSource","controller":"logicalvolume","controllerGroup":"topolvm.io","controllerKind":"LogicalVolume","source":"kind source: *v1.LogicalVolume"}
{"level":"info","ts":"2024-06-23T22:38:19Z","msg":"Stopping and waiting for non leader election runnables"}
{"level":"info","ts":"2024-06-23T22:38:19Z","msg":"Starting Controller","controller":"logicalvolume","controllerGroup":"topolvm.io","controllerKind":"LogicalVolume"}
{"level":"info","ts":"2024-06-23T22:38:19Z","msg":"Starting workers","controller":"logicalvolume","controllerGroup":"topolvm.io","controllerKind":"LogicalVolume","worker count":1}
{"level":"info","ts":"2024-06-23T22:38:19Z","msg":"Stopping and waiting for leader election runnables"}
{"level":"info","ts":"2024-06-23T22:38:19Z","msg":"Shutdown signal received, waiting for all workers to finish","controller":"logicalvolume","controllerGroup":"topolvm.io","controllerKind":"LogicalVolume"}
{"level":"info","ts":"2024-06-23T22:38:19Z","msg":"All workers finished","controller":"logicalvolume","controllerGroup":"topolvm.io","controllerKind":"LogicalVolume"}
{"level":"info","ts":"2024-06-23T22:38:19Z","msg":"Stopping and waiting for caches"}
W0623 22:38:19.347174 1 reflector.go:462] pkg/mod/k8s.io/client-go@v0.29.4/tools/cache/reflector.go:229: watch of *v1.LogicalVolume ended with: an error on the server ("unable to decode an event from the watch stream: context canceled") has prevented the request from succeeding
{"level":"info","ts":"2024-06-23T22:38:19Z","msg":"Stopping and waiting for webhooks"}
{"level":"info","ts":"2024-06-23T22:38:19Z","msg":"Stopping and waiting for HTTP servers"}
{"level":"info","ts":"2024-06-23T22:38:19Z","logger":"controller-runtime.metrics","msg":"Shutting down metrics server with timeout of 1 minute"}
{"level":"info","ts":"2024-06-23T22:38:19Z","msg":"Wait completed, proceeding to shutdown the manager"}
{"level":"error","ts":"2024-06-23T22:38:19Z","logger":"setup","msg":"problem running manager","error":"rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial unix /run/topolvm/lvmd.sock: connect: no such file or directory\"","stacktrace":"github.com/topolvm/topolvm/cmd/topolvm-node/app.subMain\n\t/workdir/cmd/topolvm-node/app/run.go:168\ngithub.com/topolvm/topolvm/cmd/topolvm-node/app.init.func1\n\t/workdir/cmd/topolvm-node/app/root.go:40\ngithub.com/spf13/cobra.(*Command).execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:983\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1115\ngithub.com/spf13/cobra.(*Command).Execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1039\ngithub.com/topolvm/topolvm/cmd/topolvm-node/app.Execute\n\t/workdir/cmd/topolvm-node/app/root.go:47\nmain.main\n\t/workdir/cmd/hypertopolvm/main.go:42\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:271"}
Error: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial unix /run/topolvm/lvmd.sock: connect: no such file or directory"
rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial unix /run/topolvm/lvmd.sock: connect: no such file or directory"
vagrant@debian12:~$
The unix socket is at /run/topolvm/lvmd.sock, and this path is mentioned in the kind cluster YAML config: https://github.com/topolvm/topolvm/blob/f6b7b2e45f4798497b0a3fb590aac8c2a026622c/example/kind/topolvm-cluster.yaml#L21C27-L21C34
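
In a kind cluster config, sharing that socket directory between the host (where, per the Makefile setup mentioned above, lvmd runs as a systemd unit) and the node container is done with an extraMounts entry. A fragment of roughly that shape; the containerPath /run/topolvm comes from the logs and the linked config, while the hostPath here is an illustrative placeholder rather than the path the example repo actually uses:

nodes:
  - role: control-plane
    extraMounts:
      # share the directory holding lvmd.sock between the host and the kind node
      - hostPath: /run/topolvm        # illustrative; the example uses its own path
        containerPath: /run/topolvm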
Okay, we're almost there; it's only the lvmd pods that are still complaining that the /sbin/lvm executable is not found (will it be solved if I set lvmd.managed to false in the Helm chart values? a values sketch follows the logs below):
vagrant@debian12:~$ kubectl -n topolvm-system get all
NAME READY STATUS RESTARTS AGE
pod/topolvm-cert-manager-657b7864b7-hdr55 1/1 Running 0 22m
pod/topolvm-cert-manager-cainjector-57fbb46b78-ctjbg 1/1 Running 0 22m
pod/topolvm-cert-manager-startupapicheck-mjz8g 0/1 Completed 0 22m
pod/topolvm-cert-manager-webhook-85bff86bcc-zwlmc 1/1 Running 0 22m
pod/topolvm-controller-5dd4b498d9-6rjsg 5/5 Running 0 22m
pod/topolvm-controller-5dd4b498d9-9cwrj 5/5 Running 0 22m
pod/topolvm-lvmd-0-2r5h4 0/1 CrashLoopBackOff 8 (2m27s ago) 22m
pod/topolvm-lvmd-0-9xmsr 0/1 CrashLoopBackOff 8 (2m46s ago) 22m
pod/topolvm-lvmd-0-pm2jc 0/1 CrashLoopBackOff 8 (2m43s ago) 22m
pod/topolvm-lvmd-0-rdplf 0/1 CrashLoopBackOff 8 (2m40s ago) 22m
pod/topolvm-lvmd-0-vrrbm 0/1 CrashLoopBackOff 8 (2m28s ago) 22m
pod/topolvm-lvmd-0-xhnv4 0/1 CrashLoopBackOff 8 (2m51s ago) 22m
pod/topolvm-lvmd-0-xznk7 0/1 CrashLoopBackOff 8 (2m37s ago) 22m
pod/topolvm-node-4jhm9 3/3 Running 2 (18m ago) 22m
pod/topolvm-node-6lbsd 3/3 Running 2 (18m ago) 22m
pod/topolvm-node-6w9kz 3/3 Running 2 (18m ago) 22m
pod/topolvm-node-cz6gt 3/3 Running 2 (18m ago) 22m
pod/topolvm-node-jm777 3/3 Running 2 (18m ago) 22m
pod/topolvm-node-kw8f9 3/3 Running 2 (18m ago) 22m
pod/topolvm-node-rnzzj 3/3 Running 2 (18m ago) 22m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/topolvm-cert-manager ClusterIP 10.96.251.97 <none> 9402/TCP 22m
service/topolvm-cert-manager-webhook ClusterIP 10.96.73.23 <none> 443/TCP 22m
service/topolvm-controller ClusterIP 10.96.161.35 <none> 443/TCP 22m
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/topolvm-lvmd-0 7 7 0 7 0 <none> 22m
daemonset.apps/topolvm-node 7 7 7 7 7 <none> 22m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/topolvm-cert-manager 1/1 1 1 22m
deployment.apps/topolvm-cert-manager-cainjector 1/1 1 1 22m
deployment.apps/topolvm-cert-manager-webhook 1/1 1 1 22m
deployment.apps/topolvm-controller 2/2 2 2 22m
NAME DESIRED CURRENT READY AGE
replicaset.apps/topolvm-cert-manager-657b7864b7 1 1 1 22m
replicaset.apps/topolvm-cert-manager-cainjector-57fbb46b78 1 1 1 22m
replicaset.apps/topolvm-cert-manager-webhook-85bff86bcc 1 1 1 22m
replicaset.apps/topolvm-controller-5dd4b498d9 2 2 2 22m
NAME STATUS COMPLETIONS DURATION AGE
job.batch/topolvm-cert-manager-startupapicheck Complete 1/1 17m 22m
vagrant@debian12:~$
vagrant@debian12:~$ kubectl -n topolvm-system logs pod/topolvm-node-kw8f9
Defaulted container "topolvm-node" out of: topolvm-node, csi-registrar, liveness-probe
{"level":"info","ts":"2024-06-26T18:04:11Z","logger":"setup","msg":"starting manager"}
{"level":"info","ts":"2024-06-26T18:04:11Z","logger":"controller-runtime.metrics","msg":"Starting metrics server"}
{"level":"info","ts":"2024-06-26T18:04:11Z","logger":"controller-runtime.metrics","msg":"Serving metrics server","bindAddress":":8080","secure":false}
{"level":"info","ts":"2024-06-26T18:04:11Z","msg":"Starting EventSource","controller":"logicalvolume","controllerGroup":"topolvm.io","controllerKind":"LogicalVolume","source":"kind source: *v1.LogicalVolume"}
{"level":"info","ts":"2024-06-26T18:04:11Z","msg":"Starting Controller","controller":"logicalvolume","controllerGroup":"topolvm.io","controllerKind":"LogicalVolume"}
{"level":"info","ts":"2024-06-26T18:04:11Z","msg":"Starting workers","controller":"logicalvolume","controllerGroup":"topolvm.io","controllerKind":"LogicalVolume","worker count":1}
vagrant@debian12:~$
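
For context, the change being asked about above would look roughly like this in the chart values; lvmd.managed is the key in question, and the socket path shown is the one topolvm-node dials in the earlier error logs (the socketName key and whether any additional values are needed are assumptions I have not verified):

lvmd:
  # do not deploy lvmd as a DaemonSet; topolvm-node talks to an lvmd managed outside the chart
  managed: false
  # assumption: key name and path, matching /run/topolvm/lvmd.sock seen in the logs
  socketName: /run/topolvm/lvmd.sock

With managed set to false, the device-class configuration (the hdd class on vg-decoderleco from the first logs) belongs to the externally run lvmd rather than to the chart.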
OK, in lvmd.managed=false mode, topolvm now works:
vagrant@debian12:~$ kubectl -n topolvm-system get all
NAME READY STATUS RESTARTS AGE
pod/topolvm-cert-manager-657b7864b7-5cvnp 1/1 Running 0 29m
pod/topolvm-cert-manager-cainjector-57fbb46b78-2tq8f 1/1 Running 0 29m
pod/topolvm-cert-manager-startupapicheck-vxwsz 0/1 Completed 0 29m
pod/topolvm-cert-manager-webhook-85bff86bcc-pd5mp 1/1 Running 0 29m
pod/topolvm-controller-5dd4b498d9-7zhg6 5/5 Running 0 29m
pod/topolvm-controller-5dd4b498d9-qrhp8 5/5 Running 0 29m
pod/topolvm-node-64lrg 3/3 Running 0 29m
pod/topolvm-node-kcrk9 3/3 Running 0 29m
pod/topolvm-node-ng5dw 3/3 Running 0 29m
pod/topolvm-node-snttw 3/3 Running 0 29m
pod/topolvm-node-vhsvd 3/3 Running 0 29m
pod/topolvm-node-w6sfs 3/3 Running 0 29m
pod/topolvm-node-wmnzw 3/3 Running 0 29m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/topolvm-cert-manager ClusterIP 10.96.54.93 <none> 9402/TCP 29m
service/topolvm-cert-manager-webhook ClusterIP 10.96.177.141 <none> 443/TCP 29m
service/topolvm-controller ClusterIP 10.96.136.171 <none> 443/TCP 29m
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/topolvm-node 7 7 7 7 7 <none> 29m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/topolvm-cert-manager 1/1 1 1 29m
deployment.apps/topolvm-cert-manager-cainjector 1/1 1 1 29m
deployment.apps/topolvm-cert-manager-webhook 1/1 1 1 29m
deployment.apps/topolvm-controller 2/2 2 2 29m
NAME DESIRED CURRENT READY AGE
replicaset.apps/topolvm-cert-manager-657b7864b7 1 1 1 29m
replicaset.apps/topolvm-cert-manager-cainjector-57fbb46b78 1 1 1 29m
replicaset.apps/topolvm-cert-manager-webhook-85bff86bcc 1 1 1 29m
replicaset.apps/topolvm-controller-5dd4b498d9 2 2 2 29m
NAME STATUS COMPLETIONS DURATION AGE
job.batch/topolvm-cert-manager-startupapicheck Complete 1/1 9m26s 29m
vagrant@debian12:~$
The question is: what is the purpose of managed mode?
About the managed mode, I could eventually try it, but probably not with kind:
I disabled cert-manager but obviously cert-manager is still required, or at least TLS certificates still have to be provided and mounted into the topolvm pods:
The question is: how does the kind example work (it cannot be working with cert-manager)?
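
If cert-manager stays disabled, one generic way to satisfy the TLS requirement is to generate the webhook certificates out of band (for example with openssl) and provide them to the cluster as a plain TLS Secret. The sketch below shows only the Secret shape; the name, and the question of how the chart or the kind example wires such a secret to the webhook, are unverified assumptions:

apiVersion: v1
kind: Secret
metadata:
  name: topolvm-mutatingwebhook     # assumption: illustrative name only
  namespace: topolvm-system
type: kubernetes.io/tls
data:
  tls.crt: <base64-encoded server certificate>
  tls.key: <base64-encoded private key>
  ca.crt: <base64-encoded CA certificate>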