kubernetes-retired / external-storage

[EOL] External storage plugins, provisioners, and helper libraries
Apache License 2.0

[cephfs] Deletion pvc issues with ceph 14 #1300

Closed xriser closed 4 years ago

xriser commented 4 years ago

With Ceph v14 the provisioner doesn't delete the volume from Ceph, and it doesn't delete the user secret either. The error is like the following:

I0321 18:04:38.854980 1 controller.go:1158] delete "pvc-1a61b1c9-9dcf-41a7-b8fc-183799545396": started
E0321 18:04:39.936090 1 cephfs-provisioner.go:268] failed to delete share "tst-pvc" for "k8s.default.tst-pvc", err: exit status 1, output: Traceback (most recent call last):
  File "/usr/local/bin/cephfs_provisioner", line 364, in <module>
    main()
  File "/usr/local/bin/cephfs_provisioner", line 360, in main
    cephfs.delete_share(share, user)
  File "/usr/local/bin/cephfs_provisioner", line 319, in delete_share
    self._deauthorize(volume_path, user_id)
  File "/usr/local/bin/cephfs_provisioner", line 260, in _deauthorize
    pool_name = self.volume_client._get_ancestor_xattr(path, "ceph.dir.layout.pool")
  File "/lib/python2.7/site-packages/ceph_volume_client.py", line 756, in _get_ancestor_xattr
    result = self.fs.getxattr(path, attr)
  File "cephfs.pyx", line 954, in cephfs.LibCephFS.getxattr (/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.1/rpm/el7/BUILD/ceph-13.2.1/build/src/pybind/cephfs/pyrex/cephfs.c:10083)
cephfs.ObjectNotFound: [Errno 2] error in getxattr
E0321 18:04:39.936184 1 controller.go:1181] delete "pvc-1a61b1c9-9dcf-41a7-b8fc-183799545396": volume deletion failed: exit status 1
W0321 18:04:39.936357 1 controller.go:787] Retrying syncing volume "pvc-1a61b1c9-9dcf-41a7-b8fc-183799545396" because failures 0 < threshold 15
E0321 18:04:39.936437 1 controller.go:802] error syncing volume "pvc-1a61b1c9-9dcf-41a7-b8fc-183799545396": exit status 1
I0321 18:04:39.936499 1 event.go:221] Event(v1.ObjectReference{Kind:"PersistentVolume", Namespace:"", Name:"pvc-1a61b1c9-9dcf-41a7-b8fc-183799545396", UID:"13a518c8-4512-4bfe-a643-2fac09dd06b5", APIVersion:"v1", ResourceVersion:"492116", FieldPath:""}): type: 'Warning' reason: 'VolumeFailedDelete' exit status 1
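For reference, the failing lookup can be reproduced outside the provisioner with the cephfs Python bindings. This is only a diagnostic sketch, not provisioner code; the config file path and the share path under /volumes/kubernetes/ are assumptions to adjust for your cluster:

```python
# Diagnostic sketch: repeat the xattr lookup the deleter performs, to check
# whether the share path actually exists on the CephFS under the expected
# volume group. Config file and share path below are assumptions.
import cephfs

fs = cephfs.LibCephFS(conffile="/etc/ceph/ceph.conf")  # assumes a usable keyring
fs.mount()
try:
    path = "/volumes/kubernetes/tst-pvc"  # hypothetical share path; use your own
    pool = fs.getxattr(path, "ceph.dir.layout.pool")
    print("share exists, layout pool: %s" % pool)
except cephfs.ObjectNotFound:
    # Same ENOENT the provisioner logs: the path it tries to clean up
    # does not exist at that location on the filesystem.
    print("share path not found: %s" % path)
finally:
    fs.shutdown()
```

If the path really is missing, one likely cause is that the deleter is resolving a different prefix (volume group) than the one the share was created under.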

xriser commented 4 years ago

Tried Mimic and Nautilus, rebuilt the provisioner Docker image from the latest code, tried different images, and checked the fattr as @zhoubofsy mentioned in https://github.com/kubernetes-incubator/external-storage/issues/860; nothing helped. I also checked the secrets and so on.

The issue is as follows: the PVC provisions fine and the client secret is created in the CephFS namespace as well. But when I delete the PVC, the PVC itself is deleted while the PV remains in Released status, the data from the PVC stays in the Ceph storage, the client secret is not deleted, and the error below appears in the provisioner pod. I can then delete the PV and the user secret by hand, which works, but the data still remains in the Ceph storage, so if I create a PVC with the same name it comes up with the previously stored data.

E0322 13:16:42.082854 1 cephfs-provisioner.go:272] failed to delete share "data-elasticsearch-elasticsearch-master-1" for "k8s.efk.data-elasticsearch-elasticsearch-master-1", err: exit status 1, output: Traceback (most recent call last):
  File "/usr/local/bin/cephfs_provisioner", line 364, in <module>
    main()
  File "/usr/local/bin/cephfs_provisioner", line 360, in main
    cephfs.delete_share(share, user)
  File "/usr/local/bin/cephfs_provisioner", line 319, in delete_share
    self._deauthorize(volume_path, user_id)
  File "/usr/local/bin/cephfs_provisioner", line 260, in _deauthorize
    pool_name = self.volume_client._get_ancestor_xattr(path, "ceph.dir.layout.pool")
  File "/lib/python2.7/site-packages/ceph_volume_client.py", line 800, in _get_ancestor_xattr
    result = self.fs.getxattr(path, attr).decode()
  File "cephfs.pyx", line 1099, in cephfs.LibCephFS.getxattr (/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.8/rpm/el7/BUILD/ceph-14.2.8/build/src/pybind/cephfs/pyrex/cephfs.c:11926)
cephfs.ObjectNotFound: error in getxattr: No such file or directory [Errno 2]
E0322 13:16:42.083877 1 controller.go:1120] delete "pvc-be69ffe3-6925-4408-a160-701ef72e44cc": volume deletion failed: exit status 1
W0322 13:16:42.083999 1 controller.go:726] Retrying syncing volume "pvc-be69ffe3-6925-4408-a160-701ef72e44cc" because failures 0 < threshold 15
E0322 13:16:42.084050 1 controller.go:741] error syncing volume "pvc-be69ffe3-6925-4408-a160-701ef72e44cc": exit status 1
I0322 13:16:42.084105 1 controller.go:1097] delete "pvc-89f664c4-82d5-4d7d-b189-e2cf1d084908": started
I0322 13:16:42.084316 1 event.go:221] Event(v1.ObjectReference{Kind:"PersistentVolume", Namespace:"", Name:"pvc-be69ffe3-6925-4408-a160-701ef72e44cc", UID:"f2aa4d67-4fbd-43ee-9632-608fb13c1f5d", APIVersion:"v1", ResourceVersion:"440534", FieldPath:""}): type: 'Warning' reason: 'VolumeFailedDelete' exit status 1
E0322 13:16:42.127103 1 cephfs-provisioner.go:272] failed to delete share "data-elasticsearch-elasticsearch-data-0" for "k8s.efk.data-elasticsearch-elasticsearch-data-0", err: exit status 1, output: Traceback (most recent call last):
  File "/usr/local/bin/cephfs_provisioner", line 364, in <module>
    main()
  File "/usr/local/bin/cephfs_provisioner", line 360, in main
    cephfs.delete_share(share, user)
  File "/usr/local/bin/cephfs_provisioner", line 319, in delete_share
    self._deauthorize(volume_path, user_id)
  File "/usr/local/bin/cephfs_provisioner", line 260, in _deauthorize
    pool_name = self.volume_client._get_ancestor_xattr(path, "ceph.dir.layout.pool")
  File "/lib/python2.7/site-packages/ceph_volume_client.py", line 800, in _get_ancestor_xattr
    result = self.fs.getxattr(path, attr).decode()
  File "cephfs.pyx", line 1099, in cephfs.LibCephFS.getxattr (/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.8/rpm/el7/BUILD/ceph-14.2.8/build/src/pybind/cephfs/pyrex/cephfs.c:11926)
cephfs.ObjectNotFound: error in getxattr: No such file or directory [Errno 2]
E0322 13:16:42.127316 1 controller.go:1120] delete "pvc-d0c9a302-68a0-42b9-a1f3-23b3ccc931d8": volume deletion failed: exit status 1
W0322 13:16:42.127666 1 controller.go:726] Retrying syncing volume "pvc-d0c9a302-68a0-42b9-a1f3-23b3ccc931d8" because failures 0 < threshold 15
E0322 13:16:42.128670 1 controller.go:741] error syncing volume "pvc-d0c9a302-68a0-42b9-a1f3-23b3ccc931d8": exit status 1
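The by-hand cleanup described above can also be scripted. Here is a hedged sketch using the official kubernetes Python client; the PV name comes from the logs above, while the secret name below is a hypothetical placeholder for whatever the provisioner created in your namespace. Note this only removes the stuck Kubernetes objects, not the leftover data on CephFS:

```python
# Sketch of the manual cleanup: remove the Released PV and the leftover user
# secret. The secret name is a hypothetical example; substitute the real one.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod
v1 = client.CoreV1Api()

# Delete the PersistentVolume that stayed behind in Released state.
v1.delete_persistent_volume(name="pvc-be69ffe3-6925-4408-a160-701ef72e44cc")

# Delete the client secret the provisioner created but failed to clean up.
SECRET_NAME = "example-cephfs-user-secret"  # hypothetical; use the actual secret name
v1.delete_namespaced_secret(name=SECRET_NAME, namespace="efk")
```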

AlawnWong commented 4 years ago

I encountered the same problem

xriser commented 4 years ago

@AlawnWong this project is outdated and no longer supported. I have switched to ceph-csi (https://github.com/ceph/ceph-csi), and it works perfectly.

Ranler commented 4 years ago

@xriser which Ceph version did you use? This error seems to happen on Ceph Luminous, but not on Ceph Nautilus.

xriser commented 4 years ago

@Ranler, as I said, I have tried both Mimic and Nautilus.

fejta-bot commented 4 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

lowang-bh commented 4 years ago

I have found the solution: the CEPH_VOLUME_GROUP environment variable is not set when deleting the PV. The fix is here: https://github.com/kubernetes-incubator/external-storage/commit/fc016bc700030bb61ee3ff929a1ae4de28fe5cd0
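To illustrate why the missing CEPH_VOLUME_GROUP matters, here is a minimal sketch of the failure mode, not the code from that commit; the /volumes/<group>/<share> layout and the "kubernetes" default are assumptions for illustration:

```python
import os

# Sketch: the share path used at delete time must be built with the same
# volume group as at provision time. Layout and default below are assumptions.
DEFAULT_VOLUME_GROUP = "kubernetes"

def share_path(share_name, group=None):
    group = group or os.environ.get("CEPH_VOLUME_GROUP", DEFAULT_VOLUME_GROUP)
    return "/volumes/%s/%s" % (group, share_name)

# Provisioned with CEPH_VOLUME_GROUP=my-group, but deleted in an environment
# where the variable is not set:
created_at = share_path("tst-pvc", group="my-group")  # /volumes/my-group/tst-pvc
deleted_at = share_path("tst-pvc")                    # /volumes/kubernetes/tst-pvc
# deleted_at does not exist, so reading "ceph.dir.layout.pool" there raises
# cephfs.ObjectNotFound, exactly like the tracebacks above.
```

The linked commit is the actual fix; the sketch only shows the path mismatch that produces the ObjectNotFound.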

nikhita commented 4 years ago

Thanks for reporting the issue!

This repo is no longer being maintained and we are in the process of archiving this repo. Please see https://github.com/kubernetes/org/issues/1563 for more details.

If your issue relates to nfs provisioners, please create a new issue in https://github.com/kubernetes-sigs/nfs-ganesha-server-and-external-provisioner or https://github.com/kubernetes-sigs/nfs-subdir-external-provisioner.

Going to close this issue in order to archive this repo. Apologies for the churn and thanks for your patience! :pray: