@dhs-rec have you installed the zfs utils on the node? It seems like the zfs utils are not there.
Sure. As I wrote above: this is a single-host install using minikube. How should I have created the dataset if the command was not there? It's /sbin/zfs.
@dhs-rec did you install the zfs utils after the ZFS-LocalPV deployment? Can you restart all the daemonset pods and then try again?
No, I didn't. ZFS utils were already there since the system is running purely on ZFS. Will restart the pods anyway...
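For reference, one way to restart the node daemonset pods (assuming the default names and namespace from the operator yaml; adjust if your install differs):

```sh
# Restart the ZFS-LocalPV node daemonset pods and wait for them to come back up.
kubectl rollout restart daemonset/openebs-zfs-node -n kube-system
kubectl rollout status daemonset/openebs-zfs-node -n kube-system
```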
Also, can you exec into one of the node daemonset pods and run ls -l /host/sbin/zfs? See if it is present there or not.
% kubectl exec openebs-zfs-node-gj4d2 -n kube-system -c openebs-zfs-plugin -i -t -- ls -l /sbin/zfs
-r-xr-xr-x 1 root root 174 Jun 8 13:56 /sbin/zfs
% kubectl exec openebs-zfs-node-gj4d2 -n kube-system -c openebs-zfs-plugin -i -t -- ls -l /host/sbin/zfs
ls: cannot access '/host/sbin/zfs': No such file or directory
Just in case it might help:
% kubectl describe pods openebs-zfs-node-gj4d2 -n kube-system
Name: openebs-zfs-node-gj4d2
Namespace: kube-system
Priority: 900001000
Priority Class Name: openebs-zfs-csi-node-critical
Node: minikube/192.168.49.2
Start Time: Wed, 08 Jun 2022 15:56:39 +0200
Labels: app=openebs-zfs-node
controller-revision-hash=7c6d8f8bbd
openebs.io/component-name=openebs-zfs-node
openebs.io/version=2.1.0
pod-template-generation=1
role=openebs-zfs
Annotations: <none>
Status: Running
IP: 192.168.49.2
IPs:
IP: 192.168.49.2
Controlled By: DaemonSet/openebs-zfs-node
Containers:
csi-node-driver-registrar:
Container ID: docker://152fb1d8542d9dd33cb3ab6e410cc25f14b2da501efd95e933aa3d61e52b5daa
Image: k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.3.0
Image ID: docker-pullable://k8s.gcr.io/sig-storage/csi-node-driver-registrar@sha256:f9bcee63734b7b01555ee8fc8fb01ac2922478b2c8934bf8d468dd2916edc405
Port: <none>
Host Port: <none>
Args:
--v=5
--csi-address=$(ADDRESS)
--kubelet-registration-path=$(DRIVER_REG_SOCK_PATH)
State: Running
Started: Fri, 10 Jun 2022 08:26:25 +0200
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Wed, 08 Jun 2022 15:56:50 +0200
Finished: Thu, 09 Jun 2022 08:31:10 +0200
Ready: True
Restart Count: 1
Environment:
ADDRESS: /plugin/csi.sock
DRIVER_REG_SOCK_PATH: /var/lib/kubelet/plugins/zfs-localpv/csi.sock
KUBE_NODE_NAME: (v1:spec.nodeName)
NODE_DRIVER: openebs-zfs
Mounts:
/plugin from plugin-dir (rw)
/registration from registration-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-lrzzt (ro)
openebs-zfs-plugin:
Container ID: docker://1822ac6bd4d896d34cc65daaed1545b5c01f197584a4d993b55e01cc1e4fbc67
Image: openebs/zfs-driver:2.1.0
Image ID: docker-pullable://openebs/zfs-driver@sha256:3ac8c36998d099472aa5d67c208c3f50134a20ef3cbeb1165647bf06a451c14b
Port: <none>
Host Port: <none>
Args:
--nodename=$(OPENEBS_NODE_NAME)
--endpoint=$(OPENEBS_CSI_ENDPOINT)
--plugin=$(OPENEBS_NODE_DRIVER)
State: Running
Started: Fri, 10 Jun 2022 08:26:33 +0200
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Wed, 08 Jun 2022 15:57:09 +0200
Finished: Thu, 09 Jun 2022 08:31:21 +0200
Ready: True
Restart Count: 1
Environment:
OPENEBS_NODE_NAME: (v1:spec.nodeName)
OPENEBS_CSI_ENDPOINT: unix:///plugin/csi.sock
OPENEBS_NODE_DRIVER: agent
OPENEBS_NAMESPACE: openebs
ALLOWED_TOPOLOGIES: All
Mounts:
/dev from device-dir (rw)
/home/keys from encr-keys (rw)
/host from host-root (ro)
/plugin from plugin-dir (rw)
/sbin/zfs from chroot-zfs (rw,path="zfs")
/var/lib/kubelet/ from pods-mount-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-lrzzt (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
device-dir:
Type: HostPath (bare host directory volume)
Path: /dev
HostPathType: Directory
encr-keys:
Type: HostPath (bare host directory volume)
Path: /home/keys
HostPathType: DirectoryOrCreate
chroot-zfs:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: openebs-zfspv-bin
Optional: false
host-root:
Type: HostPath (bare host directory volume)
Path: /
HostPathType: Directory
registration-dir:
Type: HostPath (bare host directory volume)
Path: /var/lib/kubelet/plugins_registry/
HostPathType: DirectoryOrCreate
plugin-dir:
Type: HostPath (bare host directory volume)
Path: /var/lib/kubelet/plugins/zfs-localpv/
HostPathType: DirectoryOrCreate
pods-mount-dir:
Type: HostPath (bare host directory volume)
Path: /var/lib/kubelet/
HostPathType: Directory
kube-api-access-lrzzt:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/disk-pressure:NoSchedule op=Exists
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/network-unavailable:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists
node.kubernetes.io/pid-pressure:NoSchedule op=Exists
node.kubernetes.io/unreachable:NoExecute op=Exists
node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SandboxChanged 39m kubelet Pod sandbox changed, it will be killed and re-created.
Normal Pulled 39m kubelet Container image "k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.3.0" already present on machine
Normal Created 39m kubelet Created container csi-node-driver-registrar
Normal Started 39m kubelet Started container csi-node-driver-registrar
Normal Pulled 39m kubelet Container image "openebs/zfs-driver:2.1.0" already present on machine
Normal Created 39m kubelet Created container openebs-zfs-plugin
Normal Started 38m kubelet Started container openebs-zfs-plugin
@dhs-rec how did you install ZFS-LocalPV (operator yaml or helm)? Did you modify anything there?
Can you also run the command which zfs on the node and paste the output here.
By using the command from the project page:
kubectl apply -f https://openebs.github.io/charts/zfs-operator.yaml
I didn't modify anything.
% kubectl exec openebs-zfs-node-5wvj7 -n kube-system -c openebs-zfs-plugin -i -t -- which zfs
/sbin/zfs
(I restarted from scratch, hence the different node name. But the problem still persists.)
Can you run this on the node:
which zfs
Also, can you exec inside the node daemonset pod and run ls /host.
I want to see why the binary is not available inside the daemonset pod.
% which zfs
/sbin/zfs
% kubectl exec openebs-zfs-node-5wvj7 -n kube-system -c openebs-zfs-plugin -i -t -- ls /host
Release.key data etc kind lib64 mnt root srv usr
bin dev home lib libx32 opt run sys var
boot docker.key kic.txt lib32 media proc sbin tmp
@dhs-rec we mount the root (/) directory from the node as /host inside the container. Can you also check these:
kubectl exec openebs-zfs-node-5wvj7 -n kube-system -c openebs-zfs-plugin -i -t -- ls /host/sbin
and
kubectl exec openebs-zfs-node-5wvj7 -n kube-system -c openebs-zfs-plugin -i -t -- ls /host/sbin/zfs
We need to check why, if the root mount is there, the zfs binaries are not.
% kubectl exec openebs-zfs-node-5wvj7 -n kube-system -c openebs-zfs-plugin -i -t -- ls /host/sbin
add-shell getcap pam-auth-update
addgnupghome getpcaps pam_extrausers_chkpwd
addgroup getty pam_extrausers_update
adduser groupadd pam_getenv
agetty groupdel pam_tally
applygnupgdefaults groupmems pam_tally2
arpd groupmod pam_timestamp_check
arptables grpck pivot_root
arptables-nft grpconv policy-rc.d
arptables-nft-restore grpunconv poweroff
arptables-nft-save halt pwck
arptables-restore hwclock pwconv
arptables-save iconvconfig pwunconv
badblocks init raw
blkdiscard initctl readprofile
blkid insmod reboot
blkmapd installkernel remove-shell
blkzone invoke-rc.d request-key
blockdev ip resize2fs
bridge ip6tables rmmod
capsh ip6tables-apply rmt
cfdisk ip6tables-legacy rmt-tar
chcpu ip6tables-legacy-restore rpc.gssd
chgpasswd ip6tables-legacy-save rpc.idmapd
chmem ip6tables-nft rpc.statd
chpasswd ip6tables-nft-restore rpc.svcgssd
chroot ip6tables-nft-save rpcbind
conntrack ip6tables-restore rpcdebug
cpgr ip6tables-restore-translate rpcinfo
cppw ip6tables-save rtacct
criu ip6tables-translate rtcwake
criu-ns iptables rtmon
ctrlaltdel iptables-apply runlevel
debugfs iptables-legacy runuser
delgroup iptables-legacy-restore service
deluser iptables-legacy-save setcap
depmod iptables-nft sfdisk
devlink iptables-nft-restore shadowconfig
dnsmasq iptables-nft-save showmount
dpkg-preconfigure iptables-restore shutdown
dpkg-reconfigure iptables-restore-translate sm-notify
dumpe2fs iptables-save sshd
e2freefrag iptables-translate start-statd
e2fsck isosize start-stop-daemon
e2image key.dns_resolver sulogin
e2label killall5 swaplabel
e2mmpstatus ldattach swapoff
e2scrub ldconfig swapon
e2scrub_all ldconfig.real switch_root
e2undo logsave sysctl
e4crypt losetup tarcat
e4defrag lsmod tc
ebtables minikube-automount telinit
ebtables-legacy mke2fs tipc
ebtables-legacy-restore mkfs tune2fs
ebtables-legacy-save mkfs.bfs tzconfig
ebtables-nft mkfs.cramfs umount.nfs
ebtables-nft-restore mkfs.ext2 umount.nfs4
ebtables-nft-save mkfs.ext3 unix_chkpwd
ebtables-restore mkfs.ext4 unix_update
ebtables-save mkfs.minix update-ca-certificates
ebtablesd mkhomedir_helper update-mime
ebtablesu mklost+found update-passwd
ethtool mkswap update-rc.d
fdformat modinfo useradd
fdisk modprobe userdel
filefrag mount.fuse usermod
findfs mount.fuse3 vigr
fsck mount.nfs vipw
fsck.cramfs mount.nfs4 visudo
fsck.ext2 mountstats wipefs
fsck.ext3 newusers xtables-legacy-multi
fsck.ext4 nfnl_osf xtables-monitor
fsck.minix nfsidmap xtables-nft-multi
fsfreeze nfsiostat zic
fstab-decode nfsstat zramctl
fstrim nologin
genl osd_login
% kubectl exec openebs-zfs-node-5wvj7 -n kube-system -c openebs-zfs-plugin -i -t -- ls /host/sbin/zfs
ls: cannot access '/host/sbin/zfs': No such file or directory
This is weird. On the node, the zfs binary is present at /sbin/zfs. The node daemonset mounts / at /host inside the container, so the binary should be visible at /host/sbin/zfs, but it is not present there. Strange!
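For context, the /sbin/zfs inside the plugin container (the 174-byte file seen above) is not the real binary but a small wrapper shipped via the openebs-zfspv-bin ConfigMap that chroots into /host and runs the node's zfs there. The exact script may differ; a minimal sketch of such a wrapper, which is why /host/sbin/zfs has to exist:

```sh
#!/bin/sh
# Sketch of a chroot wrapper mounted at /sbin/zfs in the plugin container
# (the real ConfigMap content may differ). It depends on the node's root
# filesystem being mounted read-only at /host.
exec chroot /host /sbin/zfs "$@"
```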
Can you do ls -l /sbin/zfs on the node to check the permissions?
Sure
% ls -l =zfs
-rwxr-xr-x 1 root root 139336 Oct 12 2021 /sbin/zfs
Hmmm, there are many more commands missing in /host/sbin as compared to the host's /sbin: xfs-repair, zpool, zvol-wait... (that's just from the end of the list).
OK, guess I found the reason:
% kubectl exec openebs-zfs-node-5wvj7 -n kube-system -c openebs-zfs-plugin -i -t -- ls -l /host/sbin
lrwxrwxrwx 1 root root 8 Apr 1 2021 /host/sbin -> usr/sbin
But there's no such symlink on the host.
OK, what is mounted into the node container is the root of the minikube container:
% docker exec -it minikube ls -l /sbin
lrwxrwxrwx 1 root root 8 Apr 1 2021 /sbin -> usr/sbin
Yep, indeed. After installing zfsutils-linux into that container, the PVC becomes ready.
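For anyone hitting the same thing, a minimal sketch of that fix, assuming the docker driver and the default container name minikube (the minikube base image is Debian/Ubuntu based, so apt works):

```sh
# Install the ZFS userland tools inside the minikube container so that
# /host/sbin/zfs exists for the ZFS-LocalPV node plugin.
docker exec -it minikube sh -c "apt-get update && apt-get install -y zfsutils-linux"
```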
OK, so I was erroneously thinking that the cluster set up by minikube was running on the host, but it is actually running completely inside that container.
Will ask them to add zfsutils-linux to it...
Thanks a lot for your patience.
What steps did you take and what happened: After setting up a Kubernetes playground with minikube on an Ubuntu system that runs on top of ZFS, I installed this storage manager following the instructions on the project page, created a ZFS dataset data/kubernetes and a storage class, and then tried to create a PVC, which is created but stays in Pending state.
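The original storage class and PVC manifests are not preserved in this report; for ZFS-LocalPV backed by a data/kubernetes dataset they would typically look roughly like the following (all names, sizes and parameters here are illustrative, not the reporter's actual files):

```sh
# Illustrative only: create the dataset on the host, then a StorageClass
# pointing at it and a PVC that uses that StorageClass.
sudo zfs create data/kubernetes

kubectl apply -f - <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-zfspv
provisioner: zfs.csi.openebs.io
parameters:
  poolname: "data/kubernetes"
  fstype: "zfs"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: csi-zfspv
spec:
  storageClassName: openebs-zfspv
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 4Gi
EOF
```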
What did you expect to happen: Have a PVC created and ready to be used
The output of the following commands will help us better understand what's going on:
kubectl logs -f openebs-zfs-controller-0 -n kube-system -c openebs-zfs-plugin
kubectl logs -f openebs-zfs-node-[xxxx] -n kube-system -c openebs-zfs-plugin
So it seems it can't find the zfs command.
kubectl get pods -n kube-system
kubectl get zv -A -o yaml
Environment:
ZFS-LocalPV version: current (as of yesterday)
Kubernetes version (use kubectl version): 1.23.3
Kubernetes installer & version: N/A
Cloud provider or hardware configuration: local minikube install
OS (e.g. from /etc/os-release): Ubuntu 20.04