Describe the bug
We ran into an issue today when attempting to resize one of our volumes. Kubernetes accepted the PVC edit, but the volume stayed at its original size. Describing the PVC showed the following events:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning ExternalExpanding 3m8s volume_expand Ignoring the PVC: didn't find a plugin capable of expanding the volume; waiting for an external controller to process this PVC.
Warning ResizeFailed 39s (x2 over 3m7s) netapp.io/trident failed in resizing the volume or PV: unable to resize the volume: volume trident_rd_prod_default_jenkins_d7cb2 does not exist
Warning ExternalExpanding 39s (x2 over 2m39s) volume_expand Ignoring the PVC: didn't find a plugin capable of expanding the volume; waiting for an external controller to process this PVC.
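For context, the resize was requested by bumping the PVC's storage request, roughly as below (the claim name and target size are taken from the events and logs above; the namespace is assumed to be default):

```shell
# Request the new size on the PVC; Trident (netapp.io/trident) is
# expected to pick this up and expand the backing volume.
# new_size=107374182400 bytes in the Trident log corresponds to 100Gi.
kubectl -n default patch pvc jenkins \
  --patch '{"spec":{"resources":{"requests":{"storage":"100Gi"}}}}'
```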
Inspecting the Trident logs, we saw the same error:
time="2020-02-18T18:21:25Z" level=info msg="GetBackend information." backend="&{0xc4201b81e0 trident-rd-prod true online map[aggr1_N1:0xc4206e6bc0] map[default-jenkins-d7cb2:0xc420328f00 ...]}" ... backendExternal.Name=trident-rd-prod backendExternal.State=online
time="2020-02-18T18:21:25Z" level=info msg="GetBackend information." backend="&{0xc42059d1e0 trident-rd-prod-bootstrap true online map[aggr5_N1:0xc42046ec00] map[]}" ... backendExternal.Name=trident-rd-prod-bootstrap backendExternal.State=online
time="2020-02-18T18:21:55Z" level=error msg="Unable to resize the volume." backend=trident-rd-prod current_size=53687091200 error="volume trident_rd_prod_default_jenkins_d7cb2 does not exist" new_size=107374182400 volume=default-jenkins-d7cb2 volume_internal=trident_rd_prod_default_jenkins_d7cb2
time="2020-02-18T18:21:55Z" level=warning msg="Unable to clean up artifacts of volume resize: unable to resize the volume: volume trident_rd_prod_default_jenkins_d7cb2 does not exist. Repeat resizing the volume or restart trident."
time="2020-02-18T18:21:55Z" level=error msg="Kubernetes frontend failed in resizing the volume or PV: unable to resize the volume: volume trident_rd_prod_default_jenkins_d7cb2 does not exist" PVC=jenkins
tridentctl get volume was able to find the volume, and it was not orphaned:
We turned on debug logging for Trident and found this interesting log line:
time="2020-02-18T19:30:03Z" level=debug msg="Attempting to acquire shared lock (prune)." lock=e51b38cd-9a63-11e8-80d6-00a0988d169a-trident_rd_prod
time="2020-02-18T19:30:03Z" level=debug msg="Logged EMS message." driver=ontap-nas-economy
time="2020-02-18T19:30:03Z" level=debug msg="Started quota resize." flexvol=trident_qtree_pool_trident_rd_prod_CBUJTDOJOR
time="2020-02-18T19:30:03Z" level=debug msg="Started quota resize." flexvol=trident_qtree_pool_trident_rd_prod_SXJDJXBCFC
time="2020-02-18T19:30:04Z" level=debug msg="Error resizing quotas." error="API status: failed, Reason: No valid quota rules found in quota policy default for volume trident_qtree_pool_trident_rd_prod_SXJDJXBCFC_clone_10022020_163446_87 in Vserver cnas02-trident. , Code: 14958" flexvol=trident_qtree_pool_trident_rd_prod_SXJDJXBCFC_clone_10022020_163446_87
After taking the clone offline and restarting Trident, we were able to resize the volume successfully. I am not sure whether the conflict is due to the way the clone was named, or whether creating a quota rule on the clone would have corrected the issue. If this error were logged by default (instead of being put behind the -debug switch), it would have saved us quite a bit of troubleshooting.
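For anyone hitting the same error, the workaround looked roughly like this (the vserver and clone names are taken from the log line above; the Trident pod label is an assumption and will depend on your install):

```shell
# On the ONTAP cluster shell: take the offending clone offline so the
# quota resize no longer fails against it.
volume offline -vserver cnas02-trident \
  -volume trident_qtree_pool_trident_rd_prod_SXJDJXBCFC_clone_10022020_163446_87

# Then restart Trident so it retries the resize cleanly
# (pod selector assumed; adjust to match your deployment).
kubectl -n trident delete pod -l app=trident
```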
Environment
Trident version: 19.04.1
Trident installation flags used: -n trident
Container runtime: Docker 18.9.6
Kubernetes version: 1.13.5
Kubernetes orchestrator: Rancher 2.2.4
Kubernetes enabled feature gates: defaults
OS: Ubuntu 18.04
NetApp backend types: ontap-nas-economy
Other:
To Reproduce
Steps to reproduce the behavior:
1. Install Trident and configure an ontap-nas-economy backend
2. Create a volume on the ontap-nas-economy backend
3. In NetApp, create a snapshot of the flexvol that houses the volume
4. Try to resize the volume in Kubernetes
Expected behavior
The volume is resized successfully
Additional context
The volume driver we're using doesn't appear to support creating clones directly from Trident, so we created the clone directly on the NetApp.
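For completeness, creating the clone on the filer looked something like the following sketch (vserver and flexvol names are from the logs above; the snapshot name is illustrative, not the one we actually used):

```shell
# On the ONTAP cluster shell: snapshot the qtree pool flexvol,
# then create a FlexClone from that snapshot.
volume snapshot create -vserver cnas02-trident \
  -volume trident_qtree_pool_trident_rd_prod_SXJDJXBCFC \
  -snapshot pre_change
volume clone create -vserver cnas02-trident \
  -flexclone trident_qtree_pool_trident_rd_prod_SXJDJXBCFC_clone_10022020_163446_87 \
  -parent-volume trident_qtree_pool_trident_rd_prod_SXJDJXBCFC \
  -parent-snapshot pre_change
```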