Closed: klacus closed this issue 1 year ago
Thanks for reporting this. There's a LOG_LEVEL environment variable on the CSI node driver that needs to be set to "trace".
kubectl edit ds/hpe-csi-node -n hpe-storage
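For reference, a minimal sketch of what the edited section of the DaemonSet could look like (the container name hpe-csi-driver is an assumption; verify it against your own manifest):

# Sketch only: excerpt of ds/hpe-csi-node after the edit, nothing else changes
spec:
  template:
    spec:
      containers:
        - name: hpe-csi-driver   # assumed container name, check with kubectl describe
          env:
            - name: LOG_LEVEL
              value: "trace"

With the default RollingUpdate strategy, saving the edit restarts the node pods automatically.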
Restart the experiment and post the log file.
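One way to grab the logs, sketched here with assumed workload/container names (they will differ per cluster):

kubectl logs -n hpe-storage ds/hpe-csi-node -c hpe-csi-driver > hpe-csi-node.log
kubectl logs -n hpe-storage deploy/hpe-csi-controller -c hpe-csi-driver > hpe-csi-controller.log
kubectl logs -n hpe-storage deploy/truenas-csp > truenas-csp.log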
Could you also double-check that your "Global Target Configuration" base name is set to iqn.2011-08.org.truenas.ctl
as prescribed here: https://github.com/hpe-storage/truenas-csp/blob/master/INSTALL.md#configure-truenasfreenas
Hi Michael,
Thanks for looking into this!
Yes, the "Base Name" under the "Target Global Configuration" on TreueNAS Scale is set to "iqn.2011-08.org.truenas.ctl".
Attached are the log files captured after setting LOG_LEVEL to "trace" and restarting the entire K8s cluster. The "hpe-storage/truenas-csp-6f9bb9b94f-m99sq" pod does not have any log entries.
hpe-storage-hpe-csi-node-4qlx9-1673485064248579168 (1).log hpe-storage-hpe-csi-controller-fbdf874d7-lblpf-1673484953953757288.log
Thanks for the update. I find this very strange. Can you post what your /etc/multipath.conf file looks like on the node? You can also delete your /etc/multipath.conf file to have one rewritten by the node driver (requires node driver restart).
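A sketch of one way to do that (assumes root access on the worker node and kubectl access to the cluster):

# On the worker node: keep a backup, then remove the file
mv /etc/multipath.conf /etc/multipath.conf.bak

# From a kubectl host: restart the node driver so it regenerates /etc/multipath.conf
kubectl rollout restart ds/hpe-csi-node -n hpe-storage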
After letting the node driver recreate the /etc/multipath.conf, the PV mounting started to work as expected.
Thanks for pointing me in the right direction to fix my lab configuration!
It was indeed a misconfiguration in my lab system.
This was the /etc/multipath.conf on the worker node originally (assuming the Ubuntu default):
root@k8sw1:/etc# cat multipath.conf
defaults {
    find_multipaths no
    user_friendly_names yes
}
devices {
    device {
        hardware_handler "1 alua"
        dev_loss_tmo infinity
        path_checker tur
        product "Server"
        prio alua
        vendor "Nimble"
        path_grouping_policy group_by_prio
        fast_io_fail_tmo 5
        failback immediate
        path_selector "service-time 0"
        no_path_retry 30
    }
    device {
        path_grouping_policy group_by_prio
        path_checker tur
        failback immediate
        hardware_handler "1 alua"
        product "VV"
        vendor "3PARdata"
        getuid_callout "/lib/udev/scsi_id --whitelisted --device=/dev/%n"
        features "0"
        checker tur
        path_selector "round-robin 0"
        rr_min_io 100
        no_path_retry 18
        prio alua
    }
}
After deleting it, the node driver regenerated it as follows at the next reboot of the node (note the blacklist and blacklist_exceptions stanzas and the TrueNAS/FreeNAS device entries that the original file lacked):
root@k8sw1:/etc# cat ./multipath.conf
defaults {
    user_friendly_names yes
    find_multipaths no
    uxsock_timeout 10000
}
blacklist {
    devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
    devnode "^hd[a-z]"
    device {
        product ".*"
        vendor ".*"
    }
}
blacklist_exceptions {
    property "(ID_WWN|SCSI_IDENT_.*|ID_SERIAL)"
    device {
        vendor "Nimble"
        product "Server"
    }
    device {
        vendor "3PARdata"
        product "VV"
    }
    device {
        vendor "TrueNAS"
        product "iSCSI Disk"
    }
    device {
        product "iSCSI Disk"
        vendor "FreeNAS"
    }
}
devices {
    device {
        vendor "Nimble"
        path_checker tur
        no_path_retry 30
        prio alua
        product "Server"
        rr_weight uniform
        dev_loss_tmo infinity
        path_grouping_policy group_by_prio
        hardware_handler "1 alua"
        fast_io_fail_tmo 5
        rr_min_io_rq 1
        path_selector "service-time 0"
        failback immediate
    }
    device {
        getuid_callout "/lib/udev/scsi_id --whitelisted --device=/dev/%n"
        checker tur
        vendor "3PARdata"
        path_selector "round-robin 0"
        path_grouping_policy group_by_prio
        no_path_retry 18
        failback immediate
        rr_min_io 100
        path_checker tur
        features "0"
        product "VV"
        hardware_handler "1 alua"
        prio alua
    }
    device {
        rr_weight priorities
        uid_attribute ID_SERIAL
        vendor "TrueNAS"
        product "iSCSI Disk"
        path_grouping_policy group_by_prio
        path_selector "queue-length 0"
    }
    device {
        path_grouping_policy group_by_prio
        path_selector "queue-length 0"
        hardware_handler "1 alua"
        rr_weight priorities
        uid_attribute ID_SERIAL
        vendor "FreeNAS"
        product "iSCSI Disk"
    }
}
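To confirm the running multipathd picked up the regenerated file, checks along these lines can help (a sketch, not output from this cluster):

# Dump the configuration the daemon is actually using
multipathd show config | grep -A 5 TrueNAS
# List the multipath maps; the TrueNAS-backed LUN should show up here
multipath -ll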
Hi, I am posting the issue here with the log files as asked by datamattsson on Slack.
I am having issues using Persistent Volumes hosted on TrueNAS SCALE. I hope this is the right forum to ask this question and that someone can tell me what I am doing wrong. The volume gets provisioned on TrueNAS and the PV gets created on K8s too. However, when the pod starts there is an error message about the kubelet failing to mount the volume.
Node OS:
Linux k8sw2 5.15.0-52-generic #58-Ubuntu SMP Thu Oct 13 08:03:55 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Release:
TrueNAS SCALE (TrueNAS-SCALE-22.02.4)
K8s version:
HPE CSI version:
TrueNAS CSP: quay.io/datamattsson/truenas-csp:v2.2.0
The CSI driver is installed via the TrueNAS CSP Helm chart 1.1.2 from https://artifacthub.io/packages/helm/truenas-csp/truenas-csp. The error message from the output of describing the pod:
And I noticed an error on the console of the worker node:
I also noticed a warning about blkio when adding a new node (with kubeadm) to the existing cluster:
The StatefulSet definition:
While the container is being created, the TrueNAS side looks OK. The Targets, Extents and Associated Extents are there along with the Initiator. Portal configuration looks OK too. Communication with TrueNAS seems OK. Attached images for reference.
The StorageClass definition (API key cleared, although the device is not exposed to the Internet):
The PV and PVC that were created:
Multipath seems to be OK on the node too:
iSCSI discovery is OK too:
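(For reference, the kind of checks behind the two statements above; the portal address 192.168.1.10 is a placeholder:)

multipath -ll
iscsiadm -m discovery -t sendtargets -p 192.168.1.10
iscsiadm -m session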
I am out of ideas as to why the volume cannot be mounted. Any help is appreciated!
Attachments:
CSI driver log file from the node where the volume should be mounted: hpe-csi-node-99kgz.log
Log files from containers of the csi controller and csi node:
hpe-storage-hpe-csi-controller-65f4f97cb7-jrr5g-1672261246107108993.log
hpe-storage-hpe-csi-controller-65f4f97cb7-jrr5g-1672261253968234771.log
hpe-storage-hpe-csi-controller-65f4f97cb7-jrr5g-1672261262104117602.log
hpe-storage-hpe-csi-controller-65f4f97cb7-jrr5g-1672261269954362084.log
hpe-storage-hpe-csi-controller-65f4f97cb7-jrr5g-1672261277590043596.log
hpe-storage-hpe-csi-controller-65f4f97cb7-jrr5g-1672261288083010354.log
hpe-storage-hpe-csi-controller-65f4f97cb7-jrr5g-1672261296357269650.log
hpe-storage-hpe-csi-controller-65f4f97cb7-jrr5g-1672261301719055280.log
hpe-storage-hpe-csi-controller-65f4f97cb7-jrr5g-1672261313704105071.log
hpe-storage-hpe-csi-node-99kgz-1672260886767868056.log
hpe-storage-hpe-csi-node-99kgz-1672260898900143158.log
hpe-storage-hpe-csi-node-99kgz-1672260937175006234.log
hpe-storage-hpe-csi-node-99kgz-1672260945656099461.log