Describe the bug
Pods requesting NVMeoF-backed PVCs fail to get their storage attached on nodes where /proc/mounts contains a very long line. When this occurs, the nvme command writes a libhugetlbfs error to stderr. I believe Trident is erroneously capturing the stderr message and failing.
cn1001:~ # nvme list
libhugetlbfs: ERROR: Line too long when parsing mounts
Node                  Generic               SN                   Model                                    Namespace Usage                      Format           FW Rev
--------------------- --------------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1          /dev/ng0n1            81MrpNW3EewqAAAAAAAB NetApp ONTAP Controller                  1         53.69 GB / 53.69 GB        4 KiB + 0 B      FFFFFFFF
On the node itself, I can manually (i.e., without trident) attach to the storage over NVMeoF, format the block device, etc.
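To illustrate the failure mode I suspect, here is a minimal Go sketch. It is not Trident's code: `fakeNvme`, `parseCombined`, `parseSeparate`, and the JSON shape are hypothetical stand-ins, and it assumes the caller parses the command output as JSON. It only shows that merging stderr into stdout breaks parsing, while reading the two streams separately does not.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"os/exec"
)

// fakeNvme stands in for an nvme invocation (hypothetical): it writes the
// libhugetlbfs warning to stderr and a small JSON document to stdout.
func fakeNvme() *exec.Cmd {
	return exec.Command("sh", "-c",
		`echo 'libhugetlbfs: ERROR: Line too long when parsing mounts' >&2; echo '{"Devices":[]}'`)
}

// parseCombined merges stderr into stdout before parsing, so the warning
// line ends up in front of the JSON and the decode fails.
func parseCombined() error {
	out, err := fakeNvme().CombinedOutput()
	if err != nil {
		return err
	}
	var v map[string]interface{}
	return json.Unmarshal(out, &v)
}

// parseSeparate keeps the streams apart and parses only stdout.
// It returns whatever was written to stderr so it can still be logged.
func parseSeparate() (string, error) {
	var stdout, stderr bytes.Buffer
	cmd := fakeNvme()
	cmd.Stdout = &stdout
	cmd.Stderr = &stderr
	if err := cmd.Run(); err != nil {
		return stderr.String(), err
	}
	var v map[string]interface{}
	return stderr.String(), json.Unmarshal(stdout.Bytes(), &v)
}

func main() {
	fmt.Println("combined streams:", parseCombined())
	warning, err := parseSeparate()
	fmt.Println("separate streams:", err, "| stderr:", warning)
}
```

If Trident does something similar to `parseCombined`, that would match the behavior I am seeing: the command itself exits successfully, yet its output cannot be parsed.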
Environment
Provide accurate information about the environment to help us reproduce the issue.
Trident version: 24.02.0
Container runtime: containerd://1.7.11-k3s2
Kubernetes version: v1.27.10+rke2r1
Kubernetes orchestrator: Harvester v1.3.0
OS: SLE Micro 5.4
NetApp backend types: ONTAP A150
To Reproduce
Prepare a k8s cluster with connectivity to a NetApp with NVMeoF support
Install Trident on the cluster and configure an ontap-san backend with sanType nvme
Create a mount entry such that it shows up in /proc/mounts with a line size greater than 2048, which is enough to cause the libhugetlbfs stderr (reference)
Create a storage class that will target the previously created backend
Create a PVC referencing the storage class
Create a pod referencing the PVC
Expected behavior
The PVC should be dynamically provisioned, the pod should be scheduled to a node, the storage should attach to the node, and the pod should run with access to the storage.
Additional context
Logs attached, gathered using tridentctl logs -n trident --node cn1003 --archive --sidecars
support-2024-03-28T10-31-01-CDT.zip