ceph / ceph-csi

CSI driver for Ceph
Apache License 2.0

RBD Regional DR: Test/Support mirroring of Thick PVC #2121

Closed Madhu-1 closed 3 years ago

Madhu-1 commented 3 years ago

CephCSI supports the creation of thick PVCs, so we need to support mirroring those same RBD images in Regional DR.

Madhu-1 commented 3 years ago

Just tested this one. Here is the flow

Primary cluster

sh-4.4# rbd du replicapool/csi-vol-ee43c5e1-d4c0-11eb-a392-0242ac110005
warning: fast-diff map is not enabled for csi-vol-ee43c5e1-d4c0-11eb-a392-0242ac110005. operation may be slow.
NAME                                          PROVISIONED  USED 
csi-vol-ee43c5e1-d4c0-11eb-a392-0242ac110005       10 GiB  8 GiB
sh-4.4# rbd image-meta ls replicapool/csi-vol-ee43c5e1-d4c0-11eb-a392-0242ac110005
There is 1 metadatum on this image:

Key                                  Value
.rbd.csi.ceph.com/thick-provisioned  true 
sh-4.4# rbd info  replicapool/csi-vol-ee43c5e1-d4c0-11eb-a392-0242ac110005
rbd image 'csi-vol-ee43c5e1-d4c0-11eb-a392-0242ac110005':
    size 10 GiB in 2560 objects
    order 22 (4 MiB objects)
    snapshot_count: 1
    id: 10f55681c784
    block_name_prefix: rbd_data.10f55681c784
    format: 2
    features: layering
    op_features: 
    flags: 
    create_timestamp: Thu Jun 24 07:51:07 2021
    access_timestamp: Thu Jun 24 07:54:33 2021
    modify_timestamp: Thu Jun 24 07:52:13 2021
    mirroring state: enabled
    mirroring mode: snapshot
    mirroring global id: 35163568-c128-4d73-8593-ad15dfa6f7e5
    mirroring primary: true

Secondary cluster

sh-4.4# rbd du  replicapool/csi-vol-ee43c5e1-d4c0-11eb-a392-0242ac110005
warning: fast-diff map is not enabled for csi-vol-ee43c5e1-d4c0-11eb-a392-0242ac110005. operation may be slow.
NAME                                          PROVISIONED  USED  
csi-vol-ee43c5e1-d4c0-11eb-a392-0242ac110005       10 GiB  10 GiB
sh-4.4# rbd image-meta ls replicapool/csi-vol-ee43c5e1-d4c0-11eb-a392-0242ac110005
There are 0 metadata on this image.

As expected, the image gets mirrored to the second site and I can see the used size of 10 GiB, but the image metadata is not mirrored to the second cluster.

@nixpanic do you see any other features (like encryption) that might not work if the image metadata is not mirrored to the second cluster?

@idryomov is RBD expected to mirror both the image and its metadata to the secondary site? Or will RBD mirror only the image data, not the metadata set on the image?

Madhu-1 commented 3 years ago

$ ceph version
ceph version 16.2.4 (3cbe25cde3cfa028984618ad32de9edc4c1eaed0) pacific (stable)

humblec commented 3 years ago

The snapshot mirroring documentation says "The remote cluster will determine any data or metadata updates between two mirror-snapshots and copy the deltas to its local copy of the image.", so it is supposed to mirror both data and metadata.

idryomov commented 3 years ago

Correct, but keys starting with ".rbd" are considered internal and not mirrored. Such keys are not copied when cloning or deep-copying locally either.
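The rule described above can be sketched as a simple prefix check. This is only an illustration of the stated behaviour, not how librbd implements it; `isInternalKey` and `internalPrefix` are hypothetical names introduced here.

```go
package main

import (
	"fmt"
	"strings"
)

// internalPrefix is the prefix that, per the comment above, marks an image
// metadata key as internal to RBD. The constant name is hypothetical.
const internalPrefix = ".rbd"

// isInternalKey reports whether a metadata key would be treated as internal
// (and therefore skipped by mirroring, cloning and deep-copying) under the
// rule stated in this thread. It is a hypothetical helper, not a librbd API.
func isInternalKey(key string) bool {
	return strings.HasPrefix(key, internalPrefix)
}

func main() {
	for _, key := range []string{
		".rbd.csi.ceph.com/thick-provisioned", // current cephcsi key
		"rbd.csi.ceph.com/thick-provisioned",  // same key without the leading dot
	} {
		fmt.Printf("%-40s internal=%t\n", key, isInternalKey(key))
	}
}
```

Under this rule, the only difference between a key that is dropped and one that is carried over is the leading dot.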

Madhu-1 commented 3 years ago

Thanks @idryomov for the confirmation.

Currently, these are the metadata keys that cephcsi sets on rbd images:

// Encryption

// image metadata key for encryption
encryptionMetaKey = ".rbd.csi.ceph.com/encrypted"
// metadataDEK is the key in the image metadata where the (encrypted) DEK is stored.
metadataDEK = ".rbd.csi.ceph.com/dek"

// Thick Provisioner
// image metadata key for thick-provisioning
thickProvisionMetaKey = ".rbd.csi.ceph.com/thick-provisioned"

To support mirroring (DR) of both encrypted and thick PVCs, we need to rename the .rbd.csi.ceph.com prefix to rbd.csi.ceph.com. @nixpanic do you think any other changes are required? For backward compatibility, we need to support the older keys for a few releases, and clearly document that mirroring of thick and encrypted PVCs works only for those created from 3.4.0 onwards (as the new keys will get mirrored to the other side).
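One way the backward-compatible lookup could work is sketched below, assuming the new key is tried first and the old dotted key is used as a fallback. The `metadataGetter` interface, `fakeImage`, `errKeyNotFound`, and `deprecatedThickProvisionMetaKey` are hypothetical names for illustration; they are not the cephcsi or go-ceph API.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	// New key without the leading dot, so it is not treated as internal.
	thickProvisionMetaKey = "rbd.csi.ceph.com/thick-provisioned"
	// Old key, still read for images created by earlier releases
	// (hypothetical constant name).
	deprecatedThickProvisionMetaKey = ".rbd.csi.ceph.com/thick-provisioned"
)

// errKeyNotFound is a hypothetical sentinel for a missing metadata key.
var errKeyNotFound = errors.New("metadata key not found")

// metadataGetter is a hypothetical stand-in for whatever reads image metadata.
type metadataGetter interface {
	GetMetadata(key string) (string, error)
}

// fakeImage is an in-memory image used only to make the sketch runnable.
type fakeImage map[string]string

func (f fakeImage) GetMetadata(key string) (string, error) {
	value, ok := f[key]
	if !ok {
		return "", errKeyNotFound
	}
	return value, nil
}

// isThickProvisioned checks the new key first and falls back to the old one.
// A missing key on both lookups means the image is not thick-provisioned.
func isThickProvisioned(img metadataGetter) (bool, error) {
	keys := []string{thickProvisionMetaKey, deprecatedThickProvisionMetaKey}
	for _, key := range keys {
		value, err := img.GetMetadata(key)
		if errors.Is(err, errKeyNotFound) {
			continue // try the next key
		}
		if err != nil {
			return false, err
		}
		return value == "true", nil
	}
	return false, nil
}

func main() {
	oldImage := fakeImage{deprecatedThickProvisionMetaKey: "true"}
	thick, err := isThickProvisioned(oldImage)
	fmt.Println(thick, err)
}
```

With this ordering, images created before the rename keep working locally, while only images carrying the new key would have the flag available on the mirrored side.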

humblec commented 3 years ago

Correct, but keys starting with ".rbd" are considered internal and not mirrored. Such keys are not copied when cloning or deep-copying locally either.

Thanks @idryomov, that's interesting :). If that's the case, maybe we have to revisit the code paths (cloning, deep copy, thick provisioning) where keys prefixed with .rbd are set on volumes, and make sure we are not running into surprising side effects.

@idryomov is it just .rbd that gets special treatment, or is there any other key format we should be aware of?

idryomov commented 3 years ago

Yes, anything that starts with .rbd. For example, snapshot-based mirroring uses the .rbd_mirror.<mirror_uuid> key for storing image-specific mirroring state, such as whether a resync has been requested.