NetApp / trident

Storage orchestrator for containers
Apache License 2.0

ontap-san with nvme breaks when /proc/mounts is excessive #896

Open magicite opened 3 months ago

Describe the bug

Pods requesting NVMeoF-backed PVCs fail to have their storage attached on nodes where /proc/mounts contains a very long line. On such nodes, the nvme command prints a libhugetlbfs error to stderr, as shown below. I believe Trident is capturing that stderr message along with the command output and failing as a result.

cn1001:~ # nvme list
libhugetlbfs: ERROR: Line too long when parsing mounts
Node                  Generic               SN                   Model                                    Namespace Usage                      Format           FW Rev
--------------------- --------------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1          /dev/ng0n1            81MrpNW3EewqAAAAAAAB NetApp ONTAP Controller                  1          53.69  GB /  53.69  GB      4 KiB +  0 B   FFFFFFFF
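The suspected failure mode can be illustrated with a minimal Python sketch. It is not Trident's code: `FAKE_NVME` is a hypothetical stand-in for an nvme-cli call, the JSON shape is only illustrative, and it assumes the caller parses the command's output as JSON. It shows that merging stderr into stdout breaks parsing, while capturing the two streams separately does not.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for an nvme-cli call: it writes the libhugetlbfs
# message to stderr and an illustrative JSON document to stdout.
FAKE_NVME = [
    sys.executable,
    "-c",
    "import sys\n"
    "sys.stderr.write('libhugetlbfs: ERROR: Line too long when parsing mounts\\n')\n"
    "sys.stderr.flush()\n"
    "sys.stdout.write('{\"Devices\": [{\"DevicePath\": \"/dev/nvme0n1\"}]}')\n",
]


def run_combined(cmd):
    """Capture stdout and stderr in one stream (the suspected failure mode)."""
    proc = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return proc.stdout


def run_separate(cmd):
    """Capture stdout and stderr separately so only stdout gets parsed."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.stdout, proc.stderr


def parses_as_json(text):
    """Return True if text is a valid JSON document, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False
```

With the streams merged, `parses_as_json(run_combined(FAKE_NVME))` is false because the stderr line is mixed into the output; with `run_separate`, stdout parses cleanly and the libhugetlbfs message is still available in stderr for logging.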

On the node itself, I can manually (i.e., without Trident) attach to the storage over NVMeoF, format the block device, etc.

Environment

Provide accurate information about the environment to help us reproduce the issue.

To Reproduce

  1. Prepare a k8s cluster with connectivity to a NetApp storage system that supports NVMeoF
  2. Install Trident on the cluster and configure an ontap-san backend with sanType set to nvme
  3. Create a mount entry that appears in /proc/mounts as a line longer than 2048 characters, which is enough to trigger the libhugetlbfs error on stderr (reference)
  4. Create a storage class that will target the previously created backend
  5. Create a PVC referencing the storage class
  6. Create a pod referencing the PVC
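To check whether a node meets the condition in step 3, a small helper along these lines can list the over-long lines. This is only a sketch: `long_mount_lines` is a hypothetical name, and it assumes the 2048 limit is counted in bytes, excluding the trailing newline.

```python
def long_mount_lines(path="/proc/mounts", limit=2048):
    """Return (line_number, length_in_bytes) for each line longer than limit."""
    hits = []
    # Read in binary mode so lengths are counted in bytes, not characters.
    with open(path, "rb") as f:
        for lineno, line in enumerate(f, start=1):
            length = len(line.rstrip(b"\n"))
            if length > limit:
                hits.append((lineno, length))
    return hits
```

Calling `long_mount_lines()` on an affected node should return at least one entry; an empty list suggests the node would not trigger the libhugetlbfs message.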

Expected behavior

The PVC should be dynamically provisioned, the pod should be scheduled to a node, the storage should attach to the node, and the pod should run with access to the storage.

Additional context

Logs are attached (support-2024-03-28T10-31-01-CDT.zip). They were gathered with:

tridentctl logs -n trident --node cn1003 --archive --sidecars