NetApp / trident

Storage orchestrator for containers
Apache License 2.0

Volume size problem (metadata write failed, insufficient space on volume) #903

Open cfillot opened 2 months ago

cfillot commented 2 months ago

Describe the bug

When creating (and using) PVCs based on NVMe/TCP backends, I regularly get errors or alerts like these from the NetApp storage arrays:

Message: wafl.vol.full: Insufficient space on volume trident_pvc_d393164d_b0f6_4f4c_acf1_c0bf425fa537@vserver:fed5ad71-27a2-11ed-a1c6-d039ea4eed91 to perform operation. 4.00KB was requested but only 1.00KB was available.
Message: fp.est.scan.catalog.failed: Volume footprint estimator scan catalog update fails on "trident_pvc_cfee9e11_16ef_4bd4_8431_96f5e180b613" - Write to metafile failed.

I can manually resize volumes directly on the storage arrays to avoid these errors, but of course that does not scale. As far as I understand it, 10% of additional space is added to the volume size for metadata, but this does not seem to be enough.

NVMe backends are configured as follows:

---
apiVersion: trident.netapp.io/v1
kind: TridentBackendConfig
metadata:
  name: storage1-c-nvme
spec:
  version: 1
  backendName: storage1-c-nvme
  storageDriverName: ontap-san
  managementLIF: XXX.XXX.XXX.XXX
  useREST: true
  svm: Virt_c
  sanType: nvme
  storagePrefix: trident
  defaults:
    spaceReserve: volume
    snapshotReserve: '0'
  credentials:
    name: netapp-credentials
  supportedTopologies:
    - topology.kubernetes.io/zone: ZONE-A

Storage Class:

---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: netapp-nvme-c
provisioner: csi.trident.netapp.io
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
parameters:
  fsType: "ext4"
allowedTopologies:
  - matchLabelExpressions:
    - key: topology.kubernetes.io/zone
      values:
        - ZONE-A

PVC:

---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-san
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 300Mi
  storageClassName: netapp-nvme-c

BTW, if I don't specify "spaceReserve: volume" in the backends, filling the ext4 filesystem in a Pod makes the volume read-only (whereas this should only produce a "No space left on device" error).

Environment

wonderland commented 2 months ago

Just for my understanding, are any snapshots taken (either in K8s or directly in ONTAP)?

In any case, the snapshotReserve parameter of the backend can be used to make the hosting volume a bit larger than the PVC itself, to accommodate snapshots (hence the name) but also any other metadata. You currently have it set to 0%; try setting it to 10 or 20 (if you make heavy use of snapshots/clones along with high data change rates, you might need more). Note that this is a percentage calculated with the ONTAP volume size as the base.
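To make the arithmetic concrete: if the reserve is a percentage of the ONTAP volume size, a 300Mi PVC with a 20% reserve would need a hosting volume of roughly 300 / (1 - 0.20) = 375Mi, of which 75Mi is reserve. This follows from the note above and is not a statement of Trident's exact sizing code. A minimal sketch of the suggested change, reusing the backend from the report and touching only snapshotReserve:

```yaml
# Fragment of the TridentBackendConfig shown in the report.
# Only snapshotReserve differs from the original; '10' is one of the
# values suggested above, not a tested recommendation.
spec:
  defaults:
    spaceReserve: volume
    snapshotReserve: '10'   # was '0'; percentage of the ONTAP volume size
```

Presumably the new default only affects volumes provisioned after the backend is updated, so existing volumes would still need to be resized by hand.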

cfillot commented 2 months ago

Thanks a lot for your suggestion! No, I don't use snapshots at all on these PVCs/volumes, so I chose to set it to 0%. I didn't know the snapshot reserve space could also be used for metadata. I'll try that and let you know.

holtakj commented 2 months ago

We have the same setup, except that we use Ubuntu 22.04, and I can confirm we have the same problem. For us, the sure way to trigger it is to fill the drive, delete the data, and then try to fill it again.

We have a reproducer: an fio random write of a 16G file to a 20G volume.
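A reproducer along these lines could be sketched as a Pod like the one below. This is not the actual job used here: the image name, the PVC name pvc-fio-20g (assumed to be a 20Gi claim on the affected storage class), and the block size are all placeholders chosen for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: fio-overwrite          # hypothetical name
spec:
  restartPolicy: Never
  containers:
    - name: fio
      image: example.com/tools/fio:latest   # placeholder; any image that ships fio
      command: ["/bin/sh", "-c"]
      args:
        - |
          set -e
          # Pass 1: random-write a 16G file onto the 20G volume.
          fio --name=pass1 --filename=/data/fio.dat --rw=randwrite \
              --bs=4k --size=16g --ioengine=libaio --direct=1
          # Delete the data, then write again to exercise the overwrite path.
          rm -f /data/fio.dat
          fio --name=pass2 --filename=/data/fio.dat --rw=randwrite \
              --bs=4k --size=16g --ioengine=libaio --direct=1
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: pvc-fio-20g   # hypothetical 20Gi PVC
```

If the issue reproduces, the second pass is where the write errors or the read-only remount would be expected to show up.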

The problem happens on overwrite. We see the same behaviour with iSCSI when the volume is not mounted with the ext4 discard option. With discard active, overwriting works fine for iSCSI. However, discard does not work for us at all with NVMe/TCP. After reaching out to NetApp, we were told that it is not yet supported, but that we should be able to use thick-provisioned volumes. We always had spaceAllocation: "true" on the Trident backend but did not have the ext4 mount option.
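For the iSCSI case, one way to get the discard mount option is the standard mountOptions field of a StorageClass. The sketch below is an illustration only: the class name is made up, and it assumes an iSCSI backend that already has spaceAllocation enabled as described above.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: netapp-iscsi-discard   # hypothetical name
provisioner: csi.trident.netapp.io
allowVolumeExpansion: true
mountOptions:
  - discard                    # ext4 issues discards when blocks are freed
parameters:
  fsType: "ext4"
```

Per the report above, this does not help with NVMe/TCP, where discard was said to be unsupported at the time.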

NetApp told us to use thick provisioning, but setting spaceReserve: volume is not enough: it enables thick provisioning only at the volume level, and our internal NetApp storage experts told us that reaching 100% overwrite capability also requires thick provisioning for the LUN. From a peek into the Trident source, the LUN is always thin-provisioned, so we are changing the parameter manually.

This is currently a blocker for broader production use. We use Trident and NetApp only for less important things, as we never understood this issue completely.

Also, I should add, that this is all completely without snapshots.

alloydsa commented 3 days ago

Hi @cfillot, could you please provide us with the following information so we can root-cause this issue?

  1. ONTAP array logs, and where exactly you are getting this alert
  2. State of the PVC (bound / pending)
  3. More information on the ONTAP volume that is causing this issue (available storage, metadata, settings)