stackabletech / secret-operator

Undeletable orphan volumes are left behind #169

Open nightkr opened 2 years ago

nightkr commented 2 years ago

Affected version

459a608

Current and expected behavior

Sometimes deleted pods get stuck in Terminating. The kubelet's log then fills up with lines like these:

E0815 10:11:34.087954       7 reconciler.go:189] "operationExecutor.UnmountVolume failed (controllerAttachDetachEnabled true) for volume \"tls\" (UniqueName: \"kubernetes.io/csi/secrets.stackable.tech^072cd466-c642-49c9-8c72-ca6dd77b5c45\") pod \"517c6ac3-633e-4bd5-a678-0e23742ae13b\" (UID: \"517c6ac3-633e-4bd5-a678-0e23742ae13b\") : UnmountVolume.NewUnmounter failed for volume \"tls\" (UniqueName: \"kubernetes.io/csi/secrets.stackable.tech^072cd466-c642-49c9-8c72-ca6dd77b5c45\") pod \"517c6ac3-633e-4bd5-a678-0e23742ae13b\" (UID: \"517c6ac3-633e-4bd5-a678-0e23742ae13b\") : kubernetes.io/csi: unmounter failed to load volume data file [/var/lib/kubelet/pods/517c6ac3-633e-4bd5-a678-0e23742ae13b/volumes/kubernetes.io~csi/pvc-56b2fa07-4d3d-4cff-b8fc-28bc77903851/mount]: kubernetes.io/csi: failed to open volume data file [/var/lib/kubelet/pods/517c6ac3-633e-4bd5-a678-0e23742ae13b/volumes/kubernetes.io~csi/pvc-56b2fa07-4d3d-4cff-b8fc-28bc77903851/vol_data.json]: open /var/lib/kubelet/pods/517c6ac3-633e-4bd5-a678-0e23742ae13b/volumes/kubernetes.io~csi/pvc-56b2fa07-4d3d-4cff-b8fc-28bc77903851/vol_data.json: no such file or directory" err="UnmountVolume.NewUnmounter failed for volume \"tls\" (UniqueName: \"kubernetes.io/csi/secrets.stackable.tech^072cd466-c642-49c9-8c72-ca6dd77b5c45\") pod \"517c6ac3-633e-4bd5-a678-0e23742ae13b\" (UID: \"517c6ac3-633e-4bd5-a678-0e23742ae13b\") : kubernetes.io/csi: unmounter failed to load volume data file [/var/lib/kubelet/pods/517c6ac3-633e-4bd5-a678-0e23742ae13b/volumes/kubernetes.io~csi/pvc-56b2fa07-4d3d-4cff-b8fc-28bc77903851/mount]: kubernetes.io/csi: failed to open volume data file [/var/lib/kubelet/pods/517c6ac3-633e-4bd5-a678-0e23742ae13b/volumes/kubernetes.io~csi/pvc-56b2fa07-4d3d-4cff-b8fc-28bc77903851/vol_data.json]: open /var/lib/kubelet/pods/517c6ac3-633e-4bd5-a678-0e23742ae13b/volumes/kubernetes.io~csi/pvc-56b2fa07-4d3d-4cff-b8fc-28bc77903851/vol_data.json: no such file or directory"
E0815 10:11:34.188831       7 reconciler.go:189] "operationExecutor.UnmountVolume failed (controllerAttachDetachEnabled true) for volume \"tls\" (UniqueName: \"kubernetes.io/csi/secrets.stackable.tech^072cd466-c642-49c9-8c72-ca6dd77b5c45\") pod \"517c6ac3-633e-4bd5-a678-0e23742ae13b\" (UID: \"517c6ac3-633e-4bd5-a678-0e23742ae13b\") : UnmountVolume.NewUnmounter failed for volume \"tls\" (UniqueName: \"kubernetes.io/csi/secrets.stackable.tech^072cd466-c642-49c9-8c72-ca6dd77b5c45\") pod \"517c6ac3-633e-4bd5-a678-0e23742ae13b\" (UID: \"517c6ac3-633e-4bd5-a678-0e23742ae13b\") : kubernetes.io/csi: unmounter failed to load volume data file [/var/lib/kubelet/pods/517c6ac3-633e-4bd5-a678-0e23742ae13b/volumes/kubernetes.io~csi/pvc-56b2fa07-4d3d-4cff-b8fc-28bc77903851/mount]: kubernetes.io/csi: failed to open volume data file [/var/lib/kubelet/pods/517c6ac3-633e-4bd5-a678-0e23742ae13b/volumes/kubernetes.io~csi/pvc-56b2fa07-4d3d-4cff-b8fc-28bc77903851/vol_data.json]: open /var/lib/kubelet/pods/517c6ac3-633e-4bd5-a678-0e23742ae13b/volumes/kubernetes.io~csi/pvc-56b2fa07-4d3d-4cff-b8fc-28bc77903851/vol_data.json: no such file or directory" err="UnmountVolume.NewUnmounter failed for volume \"tls\" (UniqueName: \"kubernetes.io/csi/secrets.stackable.tech^072cd466-c642-49c9-8c72-ca6dd77b5c45\") pod \"517c6ac3-633e-4bd5-a678-0e23742ae13b\" (UID: \"517c6ac3-633e-4bd5-a678-0e23742ae13b\") : kubernetes.io/csi: unmounter failed to load volume data file [/var/lib/kubelet/pods/517c6ac3-633e-4bd5-a678-0e23742ae13b/volumes/kubernetes.io~csi/pvc-56b2fa07-4d3d-4cff-b8fc-28bc77903851/mount]: kubernetes.io/csi: failed to open volume data file [/var/lib/kubelet/pods/517c6ac3-633e-4bd5-a678-0e23742ae13b/volumes/kubernetes.io~csi/pvc-56b2fa07-4d3d-4cff-b8fc-28bc77903851/vol_data.json]: open /var/lib/kubelet/pods/517c6ac3-633e-4bd5-a678-0e23742ae13b/volumes/kubernetes.io~csi/pvc-56b2fa07-4d3d-4cff-b8fc-28bc77903851/vol_data.json: no such file or directory"
E0815 10:11:34.289521       7 reconciler.go:189] "operationExecutor.UnmountVolume failed (controllerAttachDetachEnabled true) for volume \"tls\" (UniqueName: \"kubernetes.io/csi/secrets.stackable.tech^072cd466-c642-49c9-8c72-ca6dd77b5c45\") pod \"517c6ac3-633e-4bd5-a678-0e23742ae13b\" (UID: \"517c6ac3-633e-4bd5-a678-0e23742ae13b\") : UnmountVolume.NewUnmounter failed for volume \"tls\" (UniqueName: \"kubernetes.io/csi/secrets.stackable.tech^072cd466-c642-49c9-8c72-ca6dd77b5c45\") pod \"517c6ac3-633e-4bd5-a678-0e23742ae13b\" (UID: \"517c6ac3-633e-4bd5-a678-0e23742ae13b\") : kubernetes.io/csi: unmounter failed to load volume data file [/var/lib/kubelet/pods/517c6ac3-633e-4bd5-a678-0e23742ae13b/volumes/kubernetes.io~csi/pvc-56b2fa07-4d3d-4cff-b8fc-28bc77903851/mount]: kubernetes.io/csi: failed to open volume data file [/var/lib/kubelet/pods/517c6ac3-633e-4bd5-a678-0e23742ae13b/volumes/kubernetes.io~csi/pvc-56b2fa07-4d3d-4cff-b8fc-28bc77903851/vol_data.json]: open /var/lib/kubelet/pods/517c6ac3-633e-4bd5-a678-0e23742ae13b/volumes/kubernetes.io~csi/pvc-56b2fa07-4d3d-4cff-b8fc-28bc77903851/vol_data.json: no such file or directory" err="UnmountVolume.NewUnmounter failed for volume \"tls\" (UniqueName: \"kubernetes.io/csi/secrets.stackable.tech^072cd466-c642-49c9-8c72-ca6dd77b5c45\") pod \"517c6ac3-633e-4bd5-a678-0e23742ae13b\" (UID: \"517c6ac3-633e-4bd5-a678-0e23742ae13b\") : kubernetes.io/csi: unmounter failed to load volume data file [/var/lib/kubelet/pods/517c6ac3-633e-4bd5-a678-0e23742ae13b/volumes/kubernetes.io~csi/pvc-56b2fa07-4d3d-4cff-b8fc-28bc77903851/mount]: kubernetes.io/csi: failed to open volume data file [/var/lib/kubelet/pods/517c6ac3-633e-4bd5-a678-0e23742ae13b/volumes/kubernetes.io~csi/pvc-56b2fa07-4d3d-4cff-b8fc-28bc77903851/vol_data.json]: open /var/lib/kubelet/pods/517c6ac3-633e-4bd5-a678-0e23742ae13b/volumes/kubernetes.io~csi/pvc-56b2fa07-4d3d-4cff-b8fc-28bc77903851/vol_data.json: no such file or directory"
E0815 10:11:34.390546       7 reconciler.go:189] "operationExecutor.UnmountVolume failed (controllerAttachDetachEnabled true) for volume \"tls\" (UniqueName: \"kubernetes.io/csi/secrets.stackable.tech^072cd466-c642-49c9-8c72-ca6dd77b5c45\") pod \"517c6ac3-633e-4bd5-a678-0e23742ae13b\" (UID: \"517c6ac3-633e-4bd5-a678-0e23742ae13b\") : UnmountVolume.NewUnmounter failed for volume \"tls\" (UniqueName: \"kubernetes.io/csi/secrets.stackable.tech^072cd466-c642-49c9-8c72-ca6dd77b5c45\") pod \"517c6ac3-633e-4bd5-a678-0e23742ae13b\" (UID: \"517c6ac3-633e-4bd5-a678-0e23742ae13b\") : kubernetes.io/csi: unmounter failed to load volume data file [/var/lib/kubelet/pods/517c6ac3-633e-4bd5-a678-0e23742ae13b/volumes/kubernetes.io~csi/pvc-56b2fa07-4d3d-4cff-b8fc-28bc77903851/mount]: kubernetes.io/csi: failed to open volume data file [/var/lib/kubelet/pods/517c6ac3-633e-4bd5-a678-0e23742ae13b/volumes/kubernetes.io~csi/pvc-56b2fa07-4d3d-4cff-b8fc-28bc77903851/vol_data.json]: open /var/lib/kubelet/pods/517c6ac3-633e-4bd5-a678-0e23742ae13b/volumes/kubernetes.io~csi/pvc-56b2fa07-4d3d-4cff-b8fc-28bc77903851/vol_data.json: no such file or directory" err="UnmountVolume.NewUnmounter failed for volume \"tls\" (UniqueName: \"kubernetes.io/csi/secrets.stackable.tech^072cd466-c642-49c9-8c72-ca6dd77b5c45\") pod \"517c6ac3-633e-4bd5-a678-0e23742ae13b\" (UID: \"517c6ac3-633e-4bd5-a678-0e23742ae13b\") : kubernetes.io/csi: unmounter failed to load volume data file [/var/lib/kubelet/pods/517c6ac3-633e-4bd5-a678-0e23742ae13b/volumes/kubernetes.io~csi/pvc-56b2fa07-4d3d-4cff-b8fc-28bc77903851/mount]: kubernetes.io/csi: failed to open volume data file [/var/lib/kubelet/pods/517c6ac3-633e-4bd5-a678-0e23742ae13b/volumes/kubernetes.io~csi/pvc-56b2fa07-4d3d-4cff-b8fc-28bc77903851/vol_data.json]: open /var/lib/kubelet/pods/517c6ac3-633e-4bd5-a678-0e23742ae13b/volumes/kubernetes.io~csi/pvc-56b2fa07-4d3d-4cff-b8fc-28bc77903851/vol_data.json: no such file or directory"

The volume folder (in this case: /var/lib/kubelet/pods/517c6ac3-633e-4bd5-a678-0e23742ae13b/volumes/kubernetes.io~csi/pvc-56b2fa07-4d3d-4cff-b8fc-28bc77903851) seems to still exist and contains the injected secret data, but no vol_data.json (as reported by the error message).
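To check whether a node has more volumes in this state, a small read-only scan along these lines might help. This is only a sketch: it assumes the kubelet root is `/var/lib/kubelet` and that the directory layout matches the paths in the error message above. `find_orphaned_csi_volumes` is a hypothetical helper name, not something provided by secret-operator or the kubelet.

```python
from pathlib import Path


def find_orphaned_csi_volumes(kubelet_root="/var/lib/kubelet"):
    """Return CSI volume directories that exist but contain no vol_data.json.

    Assumed layout (taken from the error message in this issue):
    <kubelet_root>/pods/<pod-uid>/volumes/kubernetes.io~csi/<volume>/vol_data.json
    """
    pods_dir = Path(kubelet_root) / "pods"
    orphans = []
    for volume_dir in sorted(pods_dir.glob("*/volumes/kubernetes.io~csi/*")):
        # A volume directory without vol_data.json is what the unmounter trips over.
        if volume_dir.is_dir() and not (volume_dir / "vol_data.json").is_file():
            orphans.append(volume_dir)
    return orphans


if __name__ == "__main__":
    # Only lists candidates; it deliberately does not delete or unmount anything.
    for path in find_orphaned_csi_volumes():
        print(path)
```

Running it on the affected node (as a user that can read the kubelet directory) should list the `pvc-...` folder mentioned above, if the assumptions hold.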

Possible solution

Investigate whether we're somehow deleting vol_data.json by accident, or whether we're supposed to provide it ourselves. https://github.com/kubernetes/kubernetes/issues/101378 implies that this was a bug in K8s 1.22, but it still happens with 1.23.

Additional context

As a workaround, restarting the kubelet makes it give up on proper cleanup and terminate the pods. Obviously, this isn't a long-term solution.

Environment

Client Version: v1.23.4
Server Version: v1.23.5+k3s1

Would you like to work on fixing this bug?

yes

kbmanseau commented 2 years ago

@teozkr Have you found a temporary workaround to get past the issue? Our systems are seeing very large dumps in the logs regarding this issue.