ceph / ceph-csi

CSI driver for Ceph
Apache License 2.0

RBD Async: Failed to mirror cloned PVC created from another PVC (PVC-PVC Clone) #2426

Open Madhu-1 opened 3 years ago

Madhu-1 commented 3 years ago

Mirroring fails for a cloned PVC that was created from another PVC.

Steps to Reproduce

I also tried to enable mirroring on the cloned RBD images manually, and that fails as well:

sh-4.4# rbd mirror image enable replicapool-4/csi-vol-26d8b1f5-00b9-11ec-89fe-0242ac110003 snapshot
2021-08-19T06:49:15.892+0000 7ff2b877e2c0 -1 librbd::api::Mirror: image_enable: mirroring is not enabled for the parent

sh-4.4# rbd mirror image enable replicapool-4/csi-vol-26d8b1f5-00b9-11ec-89fe-0242ac110003-temp snapshot 
2021-08-19T06:52:31.379+0000 7fcfa8aca2c0 -1 librbd::api::Mirror: image_enable: mirroring is not enabled for the parent
sh-4.4# rbd info replicapool-4/csi-vol-26d8b1f5-00b9-11ec-89fe-0242ac110003     
rbd image 'csi-vol-26d8b1f5-00b9-11ec-89fe-0242ac110003':
    size 1 GiB in 256 objects
    order 22 (4 MiB objects)
    snapshot_count: 0
    id: 1972acb15df07
    block_name_prefix: rbd_data.1972acb15df07
    format: 2
    features: layering, operations
    op_features: clone-child
    flags: 
    create_timestamp: Thu Aug 19 06:46:17 2021
    access_timestamp: Thu Aug 19 06:46:17 2021
    modify_timestamp: Thu Aug 19 06:46:17 2021
    parent: replicapool-4/csi-vol-26d8b1f5-00b9-11ec-89fe-0242ac110003-temp@2a3f1d55-fdc6-424b-8e49-e46fa3ae9573
    overlap: 1 GiB
sh-4.4# rbd info replicapool-4/csi-vol-26d8b1f5-00b9-11ec-89fe-0242ac110003-temp
rbd image 'csi-vol-26d8b1f5-00b9-11ec-89fe-0242ac110003-temp':
    size 1 GiB in 256 objects
    order 22 (4 MiB objects)
    snapshot_count: 1
    id: 1972a3c5fd23a
    block_name_prefix: rbd_data.1972a3c5fd23a
    format: 2
    features: layering, deep-flatten, operations
    op_features: clone-parent, clone-child, snap-trash
    flags: 
    create_timestamp: Thu Aug 19 06:46:16 2021
    access_timestamp: Thu Aug 19 06:46:16 2021
    modify_timestamp: Thu Aug 19 06:46:16 2021
    parent: replicapool-4/csi-vol-1b227fe2-00b9-11ec-89fe-0242ac110003@e91ecb33-f05d-469d-831e-200247bbbd38
    overlap: 1 GiB
sh-4.4# rbd info replicapool-4/csi-vol-1b227fe2-00b9-11ec-89fe-0242ac110003
rbd image 'csi-vol-1b227fe2-00b9-11ec-89fe-0242ac110003':
    size 1 GiB in 256 objects
    order 22 (4 MiB objects)
    snapshot_count: 1
    id: 1972a663f3297
    block_name_prefix: rbd_data.1972a663f3297
    format: 2
    features: layering, operations
    op_features: clone-parent, snap-trash
    flags: 
    create_timestamp: Thu Aug 19 06:45:56 2021
    access_timestamp: Thu Aug 19 06:45:56 2021
    modify_timestamp: Thu Aug 19 06:45:56 2021
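
The three rbd info outputs above describe a parent chain: the cloned image points at the -temp image, which in turn points at the source image. Below is a minimal sketch of recovering that chain from the plain-text parent: lines. The helper names (parse_parent, parent_chain) are hypothetical and not part of Ceph or Ceph-CSI, the sample text is abbreviated from the output above, and the parser assumes the pool/image@snapshot form shown there (no pool namespace).

```python
import re

# Matches the "parent: <pool>/<image>@<snapshot>" line as printed by `rbd info`
# in this issue. Assumption: no pool namespace component is present.
PARENT_RE = re.compile(
    r"^\s*parent:\s*(?P<pool>[^/\s]+)/(?P<image>[^@\s]+)@(?P<snap>\S+)\s*$",
    re.M,
)


def parse_parent(info_text):
    """Return 'pool/image' of the parent named in `rbd info` text, or None."""
    m = PARENT_RE.search(info_text)
    if m is None:
        return None
    return f"{m.group('pool')}/{m.group('image')}"


def parent_chain(image, info_by_image):
    """Walk parent links upward from `image`; return the chain root-first."""
    chain = []
    seen = set()  # guards against a malformed, cyclic input
    current = image
    while current is not None and current not in seen:
        seen.add(current)
        chain.append(current)
        current = parse_parent(info_by_image.get(current, ""))
    chain.reverse()
    return chain


POOL = "replicapool-4"
SRC = f"{POOL}/csi-vol-1b227fe2-00b9-11ec-89fe-0242ac110003"
CLONE = f"{POOL}/csi-vol-26d8b1f5-00b9-11ec-89fe-0242ac110003"
TEMP = CLONE + "-temp"

# Only the lines relevant to the chain are kept from each `rbd info` output.
INFO = {
    CLONE: f"    parent: {TEMP}@2a3f1d55-fdc6-424b-8e49-e46fa3ae9573\n    overlap: 1 GiB\n",
    TEMP: f"    parent: {SRC}@e91ecb33-f05d-469d-831e-200247bbbd38\n    overlap: 1 GiB\n",
    SRC: "    features: layering, operations\n",
}

if __name__ == "__main__":
    for name in parent_chain(CLONE, INFO):
        print(name)
```

Listing the chain root-first makes it easy to see which parent the "mirroring is not enabled for the parent" error refers to for each image.
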
Madhu-1 commented 3 years ago

cc @ShyamsundarR

Madhu-1 commented 3 years ago

Because we create a temporary clone, we need to handle image cloning and image deletion properly. There should be no stale images on either cluster.

github-actions[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.

github-actions[bot] commented 3 years ago

This issue has been automatically closed due to inactivity. Please re-open if this still requires investigation.

humblec commented 2 years ago

@Madhu-1 this is marked against the 3.5.0 release, so please revisit the state.

ushitora-anqou commented 1 year ago

My team verified that a cloned PVC is mirrored to the secondary Ceph cluster with v3.9.0 plus the attached POC patch. The patch removes the two snap rm steps from the PVC clone process described here:

https://github.com/ceph/ceph-csi/blob/devel/docs/design/proposals/rbd-snap-clone.md#volume-cloning-datasource-pvc

The steps to reproduce are:

  1. Create two Ceph clusters, C1 and C2.
  2. Set up RBD mirroring from C1 to C2 using Rook's CephRBDMirror CR.
  3. Create a PVC (PVC1) in C1.
  4. Create a VolumeReplication (VR1) corresponding to PVC1 in C1.
  5. Create a cloned PVC, PVC2, from PVC1.
  6. Create a VolumeReplication (VR2) corresponding to PVC2 in C1.
  7. Run rbd mirror image enable <image> snapshot for the following RBD images:
     7-1: The RBD image corresponding to PVC1 (again).
     7-2: The intermediate RBD image between PVC1 and PVC2.
     7-3: The RBD image corresponding to PVC2.
  8. Both PVC1 and PVC2 are then mirrored as expected.

We also verified that mirroring PVC2 does not work with plain v3.9.0, because steps 7-2 and 7-3 fail. I suspect the cause is that both temporary RBD image snapshots are missing. My patch skips the removal of these snapshots.
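
Step 7 enables mirroring from the source image down to the clone, which fits the "mirroring is not enabled for the parent" error reported earlier in this issue. Here is a hedged sketch of generating those commands in that order. The function names and the image names are hypothetical placeholders; only the rbd mirror image enable <image> snapshot command line itself comes from this thread.

```python
import subprocess


def mirror_enable_commands(chain_root_first):
    """Build one `rbd mirror image enable <image> snapshot` argv per image."""
    return [
        ["rbd", "mirror", "image", "enable", image, "snapshot"]
        for image in chain_root_first
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            # check=True raises on the first failure, so a child image is
            # never attempted after enabling its parent has failed.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Placeholder names standing in for steps 7-1, 7-2 and 7-3.
    chain = ["pool/pvc1-image", "pool/intermediate-image", "pool/pvc2-image"]
    run_all(mirror_enable_commands(chain))
```

The default dry_run=True only prints the commands, so the sketch can be inspected without touching a cluster.
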

rbd-mirror-workaround-prototype.patch

Rakshith-R commented 1 year ago

Hey @nbalacha, can you please take a look at the comments above?

When mirroring a chain of RBD images, do the intermediate RBD snapshots need to stay alive (not deleted or in the trash)? Will this also be fixed by the patch you are working on?

Ceph-CSI deletes all intermediate RBD snapshots.

cc @idryomov @pkalever