kubernetes-csi / external-snapshotter

Sidecar container that watches Kubernetes Snapshot CRD objects and triggers CreateSnapshot/DeleteSnapshot against a CSI endpoint.
Apache License 2.0

VolumeGroupSnapshot deletion intermediate failures #1035

Open Madhu-1 opened 7 months ago

Madhu-1 commented 7 months ago

What happened:

Deletion of the volumegroupsnapshot gets stuck because the volumesnapshotcontent objects have already been deleted: https://github.com/kubernetes-csi/external-snapshotter/blob/fcf78d3d6964632ed7f8b85aa045d667b1da47d4/pkg/sidecar-controller/groupsnapshot_helper.go#L242-L249

What you expected to happen:

The volumegroupsnapshot deletion should complete successfully.

How to reproduce it:

It happens intermittently, not every time.

Environment:

Logs

I0314 11:31:02.878590       1 connection.go:244] GRPC call: /csi.v1.Controller/ControllerGetCapabilities
I0314 11:31:02.878604       1 connection.go:245] GRPC request: {}
I0314 11:31:02.881202       1 connection.go:251] GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":5}}},{"Type":{"Rpc":{"type":9}}},{"Type":{"Rpc":{"type":7}}},{"Type":{"Rpc":{"type":13}}}]}
I0314 11:31:02.881426       1 connection.go:252] GRPC error: <nil>
I0314 11:31:02.881516       1 snapshot_controller.go:291] checkandUpdateContentStatusOperation: driver rook-ceph.cephfs.csi.ceph.com, snapshotId 0001-0009-rook-ceph-0000000000000001-5c98229b-0f9d-4448-9ccf-6822d3b87e81, creationTime 0001-01-01 00:00:00 +0000 UTC, size 0, readyToUse true, groupSnapshotID 
I0314 11:31:02.881595       1 snapshot_controller.go:436] updateSnapshotContentStatus: updating VolumeSnapshotContent [snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2], snapshotHandle 0001-0009-rook-ceph-0000000000000001-5c98229b-0f9d-4448-9ccf-6822d3b87e81, readyToUse true, createdAt 1710415862881587581, size 0, groupSnapshotID 
I0314 11:31:03.061944       1 request.go:629] Waited for 183.142202ms due to client-side throttling, not priority and fairness, request: POST:https://10.96.0.1:443/apis/snapshot.storage.k8s.io/v1/namespaces/default/volumesnapshots
I0314 11:31:03.077873       1 groupsnapshot_helper.go:631] updateSnapshotContentStatus: updating VolumeGroupSnapshotContent [groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4], groupSnapshotHandle 0001-0009-rook-ceph-0000000000000001-bddb800e-12ad-4138-914d-6b46974e41e7, readyToUse true, createdAt 1710415862818578349
I0314 11:31:03.262020       1 request.go:629] Waited for 380.321987ms due to client-side throttling, not priority and fairness, request: GET:https://10.96.0.1:443/apis/snapshot.storage.k8s.io/v1/volumesnapshotcontents/snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2
I0314 11:31:03.461552       1 request.go:629] Waited for 383.527063ms due to client-side throttling, not priority and fairness, request: GET:https://10.96.0.1:443/apis/groupsnapshot.storage.k8s.io/v1alpha1/volumegroupsnapshotcontents/groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4
I0314 11:31:03.662096       1 request.go:629] Waited for 396.451068ms due to client-side throttling, not priority and fairness, request: PUT:https://10.96.0.1:443/apis/snapshot.storage.k8s.io/v1/volumesnapshotcontents/snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2/status
I0314 11:31:03.673135       1 util.go:250] storeObjectUpdate updating content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" with version 15298
I0314 11:31:03.673577       1 snapshot_controller_base.go:208] enqueued "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" for sync
I0314 11:31:03.673726       1 snapshot_controller_base.go:249] syncContentByKey[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:03.673739       1 util.go:250] storeObjectUpdate updating content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" with version 15298
I0314 11:31:03.673751       1 snapshot_controller.go:59] synchronizing VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:03.673784       1 snapshot_controller.go:626] Check if VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2] should be deleted.
I0314 11:31:03.861726       1 request.go:629] Waited for 395.824892ms due to client-side throttling, not priority and fairness, request: PUT:https://10.96.0.1:443/apis/groupsnapshot.storage.k8s.io/v1alpha1/volumegroupsnapshotcontents/groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4/status
I0314 11:31:03.863301       1 snapshot_controller_base.go:208] enqueued "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" for sync
I0314 11:31:03.863408       1 snapshot_controller_base.go:249] syncContentByKey[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:03.863423       1 util.go:250] storeObjectUpdate updating content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" with version 15299
I0314 11:31:03.863434       1 snapshot_controller.go:59] synchronizing VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:03.863458       1 snapshot_controller.go:626] Check if VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2] should be deleted.
I0314 11:31:03.879845       1 groupsnapshot_helper.go:53] enqueued "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" for sync
I0314 11:31:04.061161       1 request.go:629] Waited for 181.449744ms due to client-side throttling, not priority and fairness, request: PATCH:https://10.96.0.1:443/apis/groupsnapshot.storage.k8s.io/v1alpha1/volumegroupsnapshotcontents/groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4
I0314 11:31:04.071738       1 groupsnapshot_helper.go:617] Removed VolumeGroupSnapshotBeingCreated annotation from volume group snapshot content groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4
I0314 11:31:04.071862       1 util.go:250] storeObjectUpdate updating groupsnapshotcontent "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" with version 15301
I0314 11:31:04.071895       1 util.go:250] storeObjectUpdate updating groupsnapshotcontent "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" with version 15301
I0314 11:31:04.071950       1 groupsnapshot_helper.go:82] syncGroupSnapshotContentByKey[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:04.071984       1 util.go:246] storeObjectUpdate: ignoring groupsnapshotcontent "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" version 15300
I0314 11:31:04.072117       1 groupsnapshot_helper.go:53] enqueued "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" for sync
I0314 11:31:04.072207       1 groupsnapshot_helper.go:82] syncGroupSnapshotContentByKey[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:04.072249       1 util.go:250] storeObjectUpdate updating groupsnapshotcontent "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" with version 15301
I0314 11:31:04.072267       1 groupsnapshot_helper.go:165] synchronizing VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:04.072286       1 groupsnapshot_helper.go:327] Check if VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4] should be deleted.
I0314 11:31:04.261941       1 snapshot_controller_base.go:208] enqueued "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" for sync
I0314 11:31:04.261994       1 snapshot_controller_base.go:249] syncContentByKey[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:04.262006       1 util.go:250] storeObjectUpdate updating content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" with version 15302
I0314 11:31:04.263501       1 snapshot_controller.go:59] synchronizing VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:04.263557       1 snapshot_controller.go:626] Check if VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2] should be deleted.
I0314 11:31:14.430929       1 groupsnapshot_helper.go:53] enqueued "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" for sync
I0314 11:31:14.430977       1 groupsnapshot_helper.go:82] syncGroupSnapshotContentByKey[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:14.431004       1 util.go:250] storeObjectUpdate updating groupsnapshotcontent "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" with version 15316
I0314 11:31:14.431014       1 groupsnapshot_helper.go:165] synchronizing VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:14.431037       1 groupsnapshot_helper.go:327] Check if VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4] should be deleted.
I0314 11:31:14.449621       1 groupsnapshot_helper.go:53] enqueued "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" for sync
I0314 11:31:14.449677       1 groupsnapshot_helper.go:82] syncGroupSnapshotContentByKey[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:14.449696       1 util.go:250] storeObjectUpdate updating groupsnapshotcontent "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" with version 15317
I0314 11:31:14.449705       1 groupsnapshot_helper.go:165] synchronizing VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:14.449752       1 groupsnapshot_helper.go:327] Check if VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4] should be deleted.
I0314 11:31:14.449768       1 groupsnapshot_helper.go:168] VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]: the policy is Delete
I0314 11:31:14.449778       1 groupsnapshot_helper.go:233] deleteCSISnapshotOperation [groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4] started
I0314 11:31:14.454424       1 connection.go:244] GRPC call: /csi.v1.GroupController/DeleteVolumeGroupSnapshot
I0314 11:31:14.454446       1 connection.go:245] GRPC request: {"group_snapshot_id":"0001-0009-rook-ceph-0000000000000001-bddb800e-12ad-4138-914d-6b46974e41e7","secrets":"***stripped***","snapshot_ids":["0001-0009-rook-ceph-0000000000000001-5c98229b-0f9d-4448-9ccf-6822d3b87e81"]}
I0314 11:31:14.477950       1 snapshot_controller_base.go:208] enqueued "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" for sync
I0314 11:31:14.477994       1 snapshot_controller_base.go:249] syncContentByKey[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:14.478012       1 util.go:250] storeObjectUpdate updating content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" with version 15319
I0314 11:31:14.478123       1 snapshot_controller.go:59] synchronizing VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:14.478180       1 snapshot_controller.go:626] Check if VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2] should be deleted.
I0314 11:31:14.493275       1 snapshot_controller_base.go:208] enqueued "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" for sync
I0314 11:31:14.493330       1 snapshot_controller_base.go:249] syncContentByKey[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:14.493454       1 util.go:250] storeObjectUpdate updating content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" with version 15320
I0314 11:31:14.493506       1 snapshot_controller.go:59] synchronizing VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:14.493519       1 snapshot_controller.go:626] Check if VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2] should be deleted.
I0314 11:31:14.493553       1 snapshot_controller.go:62] VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]: the policy is Delete
I0314 11:31:14.493600       1 snapshot_controller.go:107] Deleting snapshot for content: snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2
I0314 11:31:14.493610       1 snapshot_controller.go:379] deleteCSISnapshotOperation [snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2] started
I0314 11:31:14.499437       1 connection.go:244] GRPC call: /csi.v1.Controller/DeleteSnapshot
I0314 11:31:14.499514       1 connection.go:245] GRPC request: {"secrets":"***stripped***","snapshot_id":"0001-0009-rook-ceph-0000000000000001-5c98229b-0f9d-4448-9ccf-6822d3b87e81"}
I0314 11:31:14.500512       1 connection.go:251] GRPC response: {}
I0314 11:31:14.500534       1 connection.go:252] GRPC error: rpc error: code = Aborted desc = an operation with the given Snapshot ID 0001-0009-rook-ceph-0000000000000001-5c98229b-0f9d-4448-9ccf-6822d3b87e81 already exists
E0314 11:31:14.500630       1 snapshot_controller_base.go:359] could not sync content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2": failed to delete snapshot "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2", err: failed to delete snapshot content snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2: "rpc error: code = Aborted desc = an operation with the given Snapshot ID 0001-0009-rook-ceph-0000000000000001-5c98229b-0f9d-4448-9ccf-6822d3b87e81 already exists"
I0314 11:31:14.500646       1 snapshot_controller_base.go:230] Failed to sync content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2", will retry again: failed to delete snapshot "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2", err: failed to delete snapshot content snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2: "rpc error: code = Aborted desc = an operation with the given Snapshot ID 0001-0009-rook-ceph-0000000000000001-5c98229b-0f9d-4448-9ccf-6822d3b87e81 already exists"
I0314 11:31:14.500733       1 event.go:364] Event(v1.ObjectReference{Kind:"VolumeSnapshotContent", Namespace:"", Name:"snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2", UID:"09c23a0d-206c-4256-99bc-500fe70514df", APIVersion:"snapshot.storage.k8s.io/v1", ResourceVersion:"15320", FieldPath:""}): type: 'Warning' reason: 'SnapshotDeleteError' Failed to delete snapshot
I0314 11:31:14.644781       1 connection.go:251] GRPC response: {}
I0314 11:31:14.644803       1 connection.go:252] GRPC error: <nil>
I0314 11:31:14.644831       1 groupsnapshot_helper.go:274] clearGroupSnapshotContentStatus content [groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:14.657400       1 groupsnapshot_helper.go:53] enqueued "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" for sync
I0314 11:31:14.658219       1 util.go:250] storeObjectUpdate updating groupsnapshotcontent "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" with version 15323
I0314 11:31:14.658273       1 groupsnapshot_helper.go:165] synchronizing VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:14.658294       1 groupsnapshot_helper.go:327] Check if VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4] should be deleted.
I0314 11:31:14.658303       1 groupsnapshot_helper.go:168] VolumeGroupSnapshotContent[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]: the policy is Delete
I0314 11:31:14.667918       1 groupsnapshot_helper.go:223] Removed protection finalizer from volume group snapshot content groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4
I0314 11:31:14.667947       1 util.go:250] storeObjectUpdate updating groupsnapshotcontent "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" with version 15323
I0314 11:31:14.667975       1 groupsnapshot_helper.go:82] syncGroupSnapshotContentByKey[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:14.667989       1 groupsnapshot_helper.go:160] group snapshot content "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" deleted
I0314 11:31:14.668023       1 groupsnapshot_helper.go:53] enqueued "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" for sync
I0314 11:31:14.668064       1 groupsnapshot_helper.go:82] syncGroupSnapshotContentByKey[groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4]
I0314 11:31:14.668161       1 groupsnapshot_helper.go:117] deletion of group snapshot content "groupsnapcontent-f0ff3787-4f88-4cea-8708-44c7530b42c4" was already processed
I0314 11:31:15.500744       1 snapshot_controller_base.go:249] syncContentByKey[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:15.500787       1 util.go:250] storeObjectUpdate updating content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" with version 15320
I0314 11:31:15.500800       1 snapshot_controller.go:59] synchronizing VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:15.500822       1 snapshot_controller.go:626] Check if VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2] should be deleted.
I0314 11:31:15.500830       1 snapshot_controller.go:62] VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]: the policy is Delete
I0314 11:31:15.500841       1 snapshot_controller.go:107] Deleting snapshot for content: snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2
I0314 11:31:15.500847       1 snapshot_controller.go:379] deleteCSISnapshotOperation [snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2] started
I0314 11:31:15.505177       1 connection.go:244] GRPC call: /csi.v1.Controller/DeleteSnapshot
I0314 11:31:15.505201       1 connection.go:245] GRPC request: {"secrets":"***stripped***","snapshot_id":"0001-0009-rook-ceph-0000000000000001-5c98229b-0f9d-4448-9ccf-6822d3b87e81"}
I0314 11:31:15.514540       1 connection.go:251] GRPC response: {}
I0314 11:31:15.514573       1 connection.go:252] GRPC error: <nil>
I0314 11:31:15.514589       1 snapshot_controller.go:410] cleanVolumeSnapshotStatus content [snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:15.539718       1 util.go:250] storeObjectUpdate updating content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" with version 15327
I0314 11:31:15.539829       1 snapshot_controller_base.go:208] enqueued "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" for sync
I0314 11:31:15.539874       1 snapshot_controller.go:59] synchronizing VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:15.539894       1 snapshot_controller.go:626] Check if VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2] should be deleted.
I0314 11:31:15.539909       1 snapshot_controller.go:62] VolumeSnapshotContent[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]: the policy is Delete
I0314 11:31:15.558697       1 snapshot_controller.go:615] Removed protection finalizer from volume snapshot content snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2
I0314 11:31:15.558741       1 util.go:250] storeObjectUpdate updating content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" with version 15327
I0314 11:31:15.558776       1 snapshot_controller_base.go:249] syncContentByKey[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:15.558780       1 snapshot_controller_base.go:208] enqueued "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" for sync
I0314 11:31:15.558798       1 snapshot_controller_base.go:369] content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" deleted
I0314 11:31:15.558816       1 snapshot_controller_base.go:249] syncContentByKey[snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2]
I0314 11:31:15.558836       1 snapshot_controller_base.go:284] deletion of content "snapcontent-00887a0a2191a9eca69c6bb4537f00bb03d7964de58d7e94047d993f8459e54b-2024-03-14-11.31.2" was already processed
I0314 11:31:16.597420       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:31:42.612012       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:32:08.623907       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:32:34.648338       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:33:00.664440       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:33:26.673596       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:33:52.693577       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:34:02.522756       1 reflector.go:800] github.com/kubernetes-csi/external-snapshotter/client/v7/informers/externalversions/factory.go:142: Watch close - *v1.VolumeSnapshotContent total 15 items received
I0314 11:34:18.706758       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:34:44.723835       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:34:55.520666       1 reflector.go:800] github.com/kubernetes-csi/external-snapshotter/client/v7/informers/externalversions/factory.go:142: Watch close - *v1.VolumeSnapshotClass total 9 items received
I0314 11:35:10.733656       1 leaderelection.go:281] successfully renewed lease rook-ceph/external-snapshotter-leader-rook-ceph-cephfs-csi-ceph-com
I0314 11:35:14.522928       1 reflector.go:800] github.com/kubernetes-csi/external-snapshotter/client/v7/informers/externalversions/factory.go:142: Watch close - *v1alpha1.VolumeGroupSnapshotContent total 175 items received
I0314 11:35:36.700702       1 groupsnapshot_helper.go:82] syncGroupSnapshotContentByKey[groupsnapcontent-874db022-42cd-43cc-ba50-f4c9b24d7304]
I0314 11:35:36.700742       1 util.go:250] storeObjectUpdate updating groupsnapshotcontent "groupsnapcontent-874db022-42cd-43cc-ba50-f4c9b24d7304" with version 14478
I0314 11:35:36.700752       1 groupsnapshot_helper.go:165] synchronizing VolumeGroupSnapshotContent[groupsnapcontent-874db022-42cd-43cc-ba50-f4c9b24d7304]
I0314 11:35:36.700772       1 groupsnapshot_helper.go:327] Check if VolumeGroupSnapshotContent[groupsnapcontent-874db022-42cd-43cc-ba50-f4c9b24d7304] should be deleted.
I0314 11:35:36.700778       1 groupsnapshot_helper.go:168] VolumeGroupSnapshotContent[groupsnapcontent-874db022-42cd-43cc-ba50-f4c9b24d7304]: the policy is Delete
I0314 11:35:36.700784       1 groupsnapshot_helper.go:233] deleteCSISnapshotOperation [groupsnapcontent-874db022-42cd-43cc-ba50-f4c9b24d7304] started
E0314 11:35:36.704116       1 groupsnapshot_helper.go:149] could not sync group snapshot content "groupsnapcontent-874db022-42cd-43cc-ba50-f4c9b24d7304": failed to get snapshot content snapcontent-dbb73839488e549fc8e7e6b61a908f3c29d31cc61598fd9b626ea64b4dbf365d-2024-03-14-11.13.17 from snapshot content store: volumesnapshotcontent.snapshot.storage.k8s.io "snapcontent-dbb73839488e549fc8e7e6b61a908f3c29d31cc61598fd9b626ea64b4dbf365d-2024-03-14-11.13.17" not found
I0314 11:35:36.704155       1 groupsnapshot_helper.go:71] Failed to sync group snapshot content "groupsnapcontent-874db022-42cd-43cc-ba50-f4c9b24d7304", will retry again: failed to get snapshot content snapcontent-dbb73839488e549fc8e7e6b61a908f3c29d31cc61598fd9b626ea64b4dbf365d-2024-03-14-11.13.17 from snapshot content store: volumesnapshotcontent.snapshot.storage.k8s.io "snapcontent-dbb73839488e549fc8e7e6b61a908f3c29d31cc61598fd9b626ea64b4dbf365d-2024-03-14-11.13.17" not found

Madhu-1 commented 6 months ago

This is also fixed by https://github.com/kubernetes-csi/external-snapshotter/pull/1011, as it updates the volumegroupsnapshotname in the snapshotcontent status.

Madhu-1 commented 6 months ago

It looks like this is not fixed yet; reopening.

jedops commented 5 months ago

Hello, I'm having some difficulty understanding the details of this issue. Would it present itself as a failure to delete a snapshot, or could it somehow delete a snapshot accidentally? Would somebody be kind enough to explain?

Thanks!

Madhu-1 commented 5 months ago

@jedops the snapshots are deleted internally when the volumegroupsnapshot is deleted. I have provided steps to reproduce and some logs as well. As you can see, either some checks are missing to skip already deleted snapshots, or we need to reorder the steps for deleting snapshots that were created as part of a volumegroupsnapshot.

k8s-triage-robot commented 2 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

yati1998 commented 1 month ago

Just an update:

I tested the creation and deletion of a volumegroupsnapshot with the cephfs driver and it seems to work fine. To reconfirm, I tried it again:

yatipadia:ceph-csi$ kubectl get volumegroupsnapshot
NAME                       READYTOUSE   VOLUMEGROUPSNAPSHOTCLASS          VOLUMEGROUPSNAPSHOTCONTENT                              CREATIONTIME   AGE
new-groupsnapshot-demo-1   true         csi-cephfsplugin-groupsnapclass   groupsnapcontent-b8b1c10d-5c07-47c3-bc36-42d4294628e4   5h47m          5h47m
yatipadia:ceph-csi$ kubectl delete volumegroupsnapshot new-groupsnapshot-demo-1
volumegroupsnapshot.groupsnapshot.storage.k8s.io "new-groupsnapshot-demo-1" deleted

yati1998 commented 1 month ago

Just an update: I tried the same with 10-11 PVCs, and the volumegroupsnapshot was deleted successfully.

yatipadia:Documents$ kubectl get volumesnapshotcontent
NAME                                                                                              READYTOUSE   RESTORESIZE   DELETIONPOLICY   DRIVER                          VOLUMESNAPSHOTCLASS   VOLUMESNAPSHOT                                                                                 VOLUMESNAPSHOTNAMESPACE   AGE
snapcontent-114d4ee02d9142894694e5f0d923333c1c840ec22baccf06bdef58d2d66bc1e1-2024-08-30-5.33.1    true         3221225472    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-114d4ee02d9142894694e5f0d923333c1c840ec22baccf06bdef58d2d66bc1e1-2024-08-30-5.33.1    default                   4m58s
snapcontent-1aacbcb8f80a5c913552ca17c2119725081584242e63e3e6a981a5e96ba95e94-2024-08-30-5.33.5    true         2147483648    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-1aacbcb8f80a5c913552ca17c2119725081584242e63e3e6a981a5e96ba95e94-2024-08-30-5.33.5    default                   4m53s
snapcontent-4a7d3f27fc5655a210c9dd2228c6d9f6db722446334ce9e0108f6f11640ffd75-2024-08-30-5.33.9    true         3221225472    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-4a7d3f27fc5655a210c9dd2228c6d9f6db722446334ce9e0108f6f11640ffd75-2024-08-30-5.33.9    default                   4m50s
snapcontent-55c4157c9dfeb50d176631205bd87afaa7d60059d64ac26096361f009197063b-2024-08-30-5.33.1    true         1073741824    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-55c4157c9dfeb50d176631205bd87afaa7d60059d64ac26096361f009197063b-2024-08-30-5.33.1    default                   4m58s
snapcontent-7d10cafc30b34e5adf253ed8b57da6d0b4718fda80a4e4a048d7e584a31e1e2b-2024-08-30-5.33.2    true         2147483648    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-7d10cafc30b34e5adf253ed8b57da6d0b4718fda80a4e4a048d7e584a31e1e2b-2024-08-30-5.33.2    default                   4m57s
snapcontent-817932568eb28072640eb89ecfca12ab4c4c7503589ba729e91dfc9efa1d50b6-2024-08-30-5.33.10   true         3221225472    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-817932568eb28072640eb89ecfca12ab4c4c7503589ba729e91dfc9efa1d50b6-2024-08-30-5.33.10   default                   4m49s
snapcontent-8dd24fbc3ffd96b1f879c2e8a92ed15edc364792186dd2740739cec1d7887365-2024-08-30-5.33.8    true         3221225472    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-8dd24fbc3ffd96b1f879c2e8a92ed15edc364792186dd2740739cec1d7887365-2024-08-30-5.33.8    default                   4m51s
snapcontent-96fa767049e8bc9e62e60af70e373323f5c23fb5aa971224894c67ad98c60e09-2024-08-30-5.33.11   true         3221225472    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-96fa767049e8bc9e62e60af70e373323f5c23fb5aa971224894c67ad98c60e09-2024-08-30-5.33.11   default                   4m47s
snapcontent-ac67c44cb87ea6968bd6075cc6a48981692fb38cf462c80794073db84a2a590b-2024-08-30-5.33.7    true         3221225472    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-ac67c44cb87ea6968bd6075cc6a48981692fb38cf462c80794073db84a2a590b-2024-08-30-5.33.7    default                   4m52s
snapcontent-d3fc2e9cdf6662234b820ddeabb8b32792ec17223020ca73da808d7487851779-2024-08-30-5.33.4    true         2147483648    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-d3fc2e9cdf6662234b820ddeabb8b32792ec17223020ca73da808d7487851779-2024-08-30-5.33.4    default                   4m55s
snapcontent-ef5edb9c4d1603d3a8688d52492cad28c5a6fec4bdb57dffc57f8f6c0e6dfae4-2024-08-30-5.33.3    true         1073741824    Delete           rook-ceph.cephfs.csi.ceph.com                         snapshot-ef5edb9c4d1603d3a8688d52492cad28c5a6fec4bdb57dffc57f8f6c0e6dfae4-2024-08-30-5.33.3    default                   4m56s
yatipadia:Documents$ 
yatipadia:Documents$ kubectl get volumegroupsnapshot
NAME                       READYTOUSE   VOLUMEGROUPSNAPSHOTCLASS          VOLUMEGROUPSNAPSHOTCONTENT                              CREATIONTIME   AGE
new-groupsnapshot-demo-1   true         csi-cephfsplugin-groupsnapclass   groupsnapcontent-a767e7e9-46df-407e-b282-d0263d13e45e   5m8s           5m10s
yatipadia:Documents$ kubectl delete volumegroupsnapshot new-groupsnapshot-demo-1
volumegroupsnapshot.groupsnapshot.storage.k8s.io "new-groupsnapshot-demo-1" deleted
yatipadia:Documents$ kubectl get volumesnapshotcontent
No resources found
yatipadia:Documents$ kubectl get volumegroupsnapshot
No resources found in default namespace.
yatipadia:Documents$ kubectl get volumesnapshot
No resources found in default namespace.
yatipadia:Documents$ 

cc @Madhu-1

Madhu-1 commented 1 month ago

Good to hear we don't have this bug anymore. In that case we can close it.

yati1998 commented 1 month ago

@madhu can you please close this issue as well?

yati1998 commented 3 weeks ago

@Madhu-1 can you reopen this issue? We can use the same issue to track the bug.