rootfs / snapshot

Kubernetes Volume Snapshot Controller using Custom Resource Definition
Apache License 2.0

the snapshot with error status cannot be removed successfully from cinder backend #16

Open freesky-edward opened 7 years ago

freesky-edward commented 7 years ago

What happened: When volume snapshot creation failed (the snapshot went into error status on the backend for some reason), I attempted to delete that VolumeSnapshot. The object in Kubernetes was deleted successfully; however, the snapshot in Cinder was not deleted.

What you expected to happen: I expect the snapshot data in the backend to be deleted as well when deleting a snapshot with error status, so that no residual, useless data is left behind.

How to reproduce it (as minimally and precisely as possible):

1. Create the snapshot: `kc create -f ../snapshot.yaml` (a reconstruction of this manifest is sketched after this list).

2. Check the snapshot status:

   ```
   root@kube-karbor:/opt/kube/kubernetes# kc describe volumesnapshot
   Name:         snapshot-demo
   Namespace:    default
   Labels:
   Annotations:
   API Version:  volume-snapshot-data.external-storage.k8s.io/v1
   Kind:         VolumeSnapshot
   Metadata:
     Cluster Name:
     Creation Timestamp:  2017-08-23T07:38:10Z
     Generation:          0
     Resource Version:    16203
     Self Link:           /apis/volume-snapshot-data.external-storage.k8s.io/v1/namespaces/default/volumesnapshots/snapshot-demo
     UID:                 0322489e-87d6-11e7-a4d5-fa163e1a1ced
   Spec:
     Persistent Volume Claim Name:  cinder-claim1
     Snapshot Data Name:            k8s-volume-snapshot-03601c96-87d6-11e7-a4c8-fa163e1a1ced
   Status:
     Conditions:
       Last Transition Time:  2017-08-23T07:38:10Z
       Message:               Snapshot created succsessfully
       Reason:
       Status:                True
       Type:                  Ready
     Creation Timestamp:
   Events:
   ```

3. Check the snapshot in Cinder: `openstack volume snapshot list`

   ```
   +--------------------------------------+-------------------------------------------------------------+---------------------+--------+------+
   | ID                                   | Name                                                        | Description         | Status | Size |
   +--------------------------------------+-------------------------------------------------------------+---------------------+--------+------+
   | 904bb9d1-00ea-4004-9899-a1c111b7a970 | pvc-1f93e356-87b7-11e7-a4d5-fa163e1a1ced1503473890359873214 | kubernetes snapshot | error  | 1    |
   +--------------------------------------+-------------------------------------------------------------+---------------------+--------+------+
   ```

4. Delete the snapshot: `kc delete volumesnapshot snapshot-demo`

5. Check the snapshot in Kubernetes: `kc get volumesnapshot` reports `No resources found.`

6. Check the snapshot in Cinder again: `openstack volume snapshot list` still shows the error-status snapshot:

   ```
   +--------------------------------------+-------------------------------------------------------------+---------------------+--------+------+
   | ID                                   | Name                                                        | Description         | Status | Size |
   +--------------------------------------+-------------------------------------------------------------+---------------------+--------+------+
   | 904bb9d1-00ea-4004-9899-a1c111b7a970 | pvc-1f93e356-87b7-11e7-a4d5-fa163e1a1ced1503473890359873214 | kubernetes snapshot | error  | 1    |
   +--------------------------------------+-------------------------------------------------------------+---------------------+--------+------+
   ```
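For reference, the `snapshot.yaml` used in step 1 is not included in this report. A minimal reconstruction, inferred from the `kc describe` output in step 2 (the apiVersion, kind, name, namespace, and claim name are taken from that output; it is not the reporter's actual file), might look like:

```yaml
# Hypothetical reconstruction of ../snapshot.yaml, inferred from the
# `kc describe volumesnapshot` output above -- not the reporter's actual file.
apiVersion: volume-snapshot-data.external-storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: snapshot-demo
  namespace: default
spec:
  persistentVolumeClaimName: cinder-claim1
```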

Anything else we need to know?

Logs when creating the snapshot:

```
E0823 07:44:33.472653 15606 snapshotter.go:381] Failed to schedule the operation "createdefault/snapshot-democinder-claim1": Failed to create operation with name "createdefault/snapshot-democinder-claim1". An operation with that name failed at 2017-08-23 07:44:18.310211022 +0000 UTC m=+592.603224177. No retries permitted until 2017-08-23 07:46:18.310211022 +0000 UTC m=+712.603224177 (2m0s). Last error: "snapshot is not completed yet: current snapshot status is: error".
E0823 07:44:33.572948 15606 snapshotter.go:381] Failed to schedule the operation "createdefault/snapshot-democinder-claim1": Failed to create operation with name "createdefault/snapshot-democinder-claim1". An operation with that name failed at 2017-08-23 07:44:18.310211022 +0000 UTC m=+592.603224177. No retries permitted until 2017-08-23 07:46:18.310211022 +0000 UTC m=+712.603224177 (2m0s). Last error: "snapshot is not completed yet: current snapshot status is: error".
```

Logs when deleting the snapshot:

```
I0823 07:49:53.722128 15606 snapshot-controller.go:240] [CONTROLLER] OnDelete /apis/volume-snapshot-data.external-storage.k8s.io/v1/namespaces/default/volumesnapshots/snapshot-demo, snapshot name: default/snapshot-demo
I0823 07:49:53.722300 15606 desired_state_of_world.go:83] Deleting snapshot from desired state of world: default/snapshot-demo
```

Environment:

rootfs commented 7 years ago

This is a very interesting test case. It points to a case not addressed in the snapshot controller.

During create: the controller creates the snapshot in the backend and, once it completes, adds it to the actual state of world (asw).

During delete: the reconciler compares the desired state of world with the asw and calls the snapshotter to delete from backend storage any snapshot that is in the asw but no longer desired.

Now in this test case, the snapshot was not created successfully, hence it was not added to the asw. As a result, the reconciler didn't call the snapshotter to delete the snapshot in the backend storage.
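To make that failure mode concrete, here is a minimal sketch of the delete path just described. This is not the actual snapshot-controller code; the names (`reconcile`, `snapshotter.DeleteSnapshot`) and the plain string-set representation of the two states are simplified assumptions for illustration:

```go
// Minimal sketch of the reconciler's delete path -- not the real
// snapshot-controller code; names and types are simplified assumptions.
package main

import "fmt"

type snapshotter struct{}

// DeleteSnapshot stands in for the volume-plugin call that removes a
// snapshot from the backend (Cinder, in this issue).
func (s *snapshotter) DeleteSnapshot(name string) {
	fmt.Printf("deleting backend snapshot %s\n", name)
}

// reconcile deletes, via the snapshotter, every snapshot that is in the
// actual state of world (asw) but no longer in the desired state of world.
func reconcile(desired, actual map[string]bool, s *snapshotter) {
	for name := range actual {
		if !desired[name] {
			s.DeleteSnapshot(name)
			delete(actual, name)
		}
	}
}

func main() {
	snap := "default/snapshot-demo"
	s := &snapshotter{}

	// Happy path: creation succeeded, so the snapshot is in the asw;
	// removing it from the desired state triggers a backend delete.
	reconcile(map[string]bool{}, map[string]bool{snap: true}, s)

	// Bug path: creation failed, so the snapshot was never added to the
	// asw. Deleting the VolumeSnapshot object removes it from the desired
	// state, but reconcile never sees it, and the error-status snapshot
	// is left behind in Cinder.
	reconcile(map[string]bool{}, map[string]bool{}, s)
}
```

One way to close the gap would be to also record failed creations in the asw (or consult the backend on delete), so the reconciler still issues the cleanup.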

@xing-yang has made changes to address some error paths. We'll fix this.

@freesky-edward thanks for providing the info

xing-yang commented 7 years ago

Thanks @freesky-edward for reporting this bug. I'm working on improving the error handling. Will take this test case into account.

freesky-edward commented 7 years ago

@rootfs yeah, that's the cause: the snapshot data was not found in the asw, so no delete request came from the reconciler. Actually, Cinder can delete that data by command, e.g. `openstack volume snapshot delete 904bb9d1-00ea-4004-9899-a1c111b7a970`. @xing-yang thanks.