ceph / ceph-csi

CSI driver for Ceph
Apache License 2.0

Enable encryption for ceph-csi using fscrypt #4597

Open pankaj-mandal opened 2 months ago

pankaj-mandal commented 2 months ago

Describe the bug

I have been trying to enable encryption for ceph-csi. One of the requirements is to enable fscrypt for the Ceph storage. However, the Ceph OSD stores use LVM, and fscrypt supports ext4 and a few other filesystems but not LVM, so encryption cannot be enabled on the LVM devices.

Environment details

Steps to reproduce

Steps to reproduce the behavior:

  1. The Ceph cluster is deployed, and CephFS, pools, and OSDs are configured. The cluster is healthy.

    ceph status
      cluster:
        id:     038167ca-076f-11ef-b000-c16a24702dee
        health: HEALTH_OK

      services:
        mon: 1 daemons, quorum ceph-4-single (age 13h)
        mgr: ceph-4-single.jojgfx(active, since 13h), standbys: ceph-4-single.ntnqjw
        mds: 1/1 daemons up, 1 standby
        osd: 3 osds: 3 up (since 12h), 3 in (since 12h)

      data:
        volumes: 1/1 healthy
        pools:   5 pools, 209 pgs
        objects: 30 objects, 783 KiB
        usage:   886 MiB used, 299 GiB / 300 GiB avail
        pgs:     209 active+clean

    ceph-csi is installed on another node using Helm charts. All pods are up and running:

    NAMESPACE            NAME                                         READY   STATUS    RESTARTS   AGE
    ceph-csi-cephfs      ceph-csi-cephfs-nodeplugin-wvzs5             3/3     Running   0          9h
    ceph-csi-cephfs      ceph-csi-cephfs-provisioner-86bf8dfc-46x4c   5/5     Running   0          9h
    ceph-csi-cephfs      csi-cephfs-demo-pod                          1/1     Running   0          7h28m
    kube-system          coredns-76f75df574-r2dl6                     1/1     Running   0          10h
    kube-system          coredns-76f75df574-tgdwv                     1/1     Running   0          10h
    kube-system          etcd-kind-control-plane                      1/1     Running   0          10h
    kube-system          kindnet-c2gqj                                1/1     Running   0          10h
    kube-system          kube-apiserver-kind-control-plane            1/1     Running   0          10h
    kube-system          kube-controller-manager-kind-control-plane   1/1     Running   0          10h
    kube-system          kube-proxy-nqnxj                             1/1     Running   0          10h
    kube-system          kube-scheduler-kind-control-plane            1/1     Running   0          10h
    local-path-storage   local-path-provisioner-7577fdbbfb-whtmn      1/1     Running   0          10h

  2. Deployment to trigger the issue: at this point, encryption is set to false in the StorageClass. If I enable encryption in the StorageClass, the demo pod reports an error.

  3. See error

Events:
  Type     Reason       Age               From               Message
  ----     ------       ----              ----               -------
  Normal   Scheduled    11s               default-scheduler  Successfully assigned ceph-csi-cephfs/csi-cephfs-demo-pod to kind-control-plane
  Warning  FailedMount  12s               kubelet            MountVolume.MountDevice failed for volume "pvc-465a3be2-0dfc-4c06-ae2a-b690cdf00ef5" : rpc error: code = Internal desc = panic runtime error: invalid memory address or nil pointer dereference
  Warning  FailedMount  4s (x4 over 11s)  kubelet            MountVolume.MountDevice failed for volume "pvc-465a3be2-0dfc-4c06-ae2a-b690cdf00ef5" : rpc error: code = Internal desc = fscrypt: unsupported state metadata=true kernel_policy=false
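
The `kernel_policy=false` part of the second message suggests that the kernel on the node did not report an fscrypt policy for the CephFS mount. A minimal sketch of a kernel version check follows. It assumes that the CephFS kernel client gained fscrypt support in Linux 6.6, which is not stated anywhere in this thread, and `kernel_at_least` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds when kernel release $1 (for example
# "5.15.0-105-generic") is at least version $2, given as "MAJOR.MINOR".
kernel_at_least() {
    have_major=${1%%.*}                 # text before the first dot
    have_rest=${1#*.}                   # text after the first dot
    have_minor=${have_rest%%[!0-9]*}    # leading digits of the remainder
    want_major=${2%%.*}
    want_minor=${2#*.}
    [ "$have_major" -gt "$want_major" ] ||
        { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# Assumption: 6.6 is the first kernel with CephFS fscrypt support.
if kernel_at_least "$(uname -r)" 6.6; then
    echo "kernel may support fscrypt on CephFS"
else
    echo "kernel is probably too old for fscrypt on CephFS"
fi
```

In a kind cluster the node containers share the host kernel, so running this on the host should give the same answer as running it inside the node.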

Actual results

I guess this is because fscrypt is not enabled on the storage, i.e. on the server side. Looking at the volumes on the server side, I see:

lsblk -f
NAME                                              FSTYPE         FSVER    LABEL           UUID                                   FSAVAIL FSUSE% MOUNTPOINTS
loop0                                                                                                                                  0   100% /snap/core20/2264
loop1                                                                                                                                  0   100% /snap/google-cloud-cli/235
loop2                                                                                                                                  0   100% /snap/lxd/28373
loop3                                                                                                                                  0   100% /snap/snapd/21465
sda                                                                                                                                             
├─sda1                                            ext4           1.0      cloudimg-rootfs 8ed05f8a-f362-4937-bc52-8e21afbc835c      3.6G    62% /var/lib/containers/storage/overlay
│                                                                                                                                               /
├─sda14                                                                                                                                         
└─sda15                                           vfat           FAT32    UEFI            51E2-9280                                98.3M     6% /boot/efi
sdb                                               LVM2_member    LVM2 001                 QgHYur-fuZE-6Wgs-EQCj-fty6-Wj2Y-dgfogt                
└─ceph--e9acd4cb--15c4--4b48--a713--8cdd9e6c595b-osd--block--9de1730b--d96d--4cf6--b8e5--c533247b3f7b
                                                  ceph_bluestore                                                                                
sdc                                               LVM2_member    LVM2 001                 llxtff-KEmh-rhLh-WNys-L8y4-hABO-JDCZXD                
└─ceph--05b9c373--decc--4bea--a33c--f6f2e3b3d311-osd--block--8ccf132c--d14d--487d--b764--447e59118343
                                                  ceph_bluestore                                                                                
sdd                                               LVM2_member    LVM2 001                 2OI3tQ-IQZs-XotL-e7Kt-t2ka-gdFk-x1Gx8Y                
└─ceph--055041f3--c14f--46cc--a772--52c798acb929-osd--block--3663f324--3511--4e67--b68d--e253da711e2b
                                                  ceph_bluestore 

The devices sdb, sdc and sdd need to be encrypted. However, the LVM volumes cannot be encrypted with fscrypt, because fscrypt does not support them.

nixpanic commented 2 months ago

The Ceph-CSI project provides a CSI driver that a Container Platform like Kubernetes can use to create/delete volumes for application usage. The encryption that Ceph-CSI sets up is client-side, per volume. Ceph-CSI does not manage the Ceph cluster and OSDs. A project like Rook focuses on that.

For your case, you may want to check the Ceph documentation about encryption.

pankaj-mandal commented 2 months ago

> The Ceph-CSI project provides a CSI driver that a Container Platform like Kubernetes can use to create/delete volumes for application usage. The encryption that Ceph-CSI sets up is client-side, per volume. Ceph-CSI does not manage the Ceph cluster and OSDs. A project like Rook focuses on that.
>
> For your case, you may want to check the Ceph documentation about encryption.

Thanks for the update. I have enabled server-side encryption as per the link to the Ceph documentation you mentioned. I had earlier looked at the examples in the git repo, and it looked like encryption could be done using ceph-csi.

I also noticed that even if I set the key "encrypted" to "false" in the StorageClass, the PVC will not bind; I have to remove that entry completely or comment it out. I also have to remove encryptionPassphrase from secret.yaml or comment it out. In addition, the namespace in the examples is default, but the namespace needed is ceph-csi-cephfs. I am assuming that with these changes and server-side encryption enabled, nothing additional needs to be done in ceph-csi as far as encryption of data at rest is concerned.
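
Since the keys have to be absent rather than set to "false", a manifest can be checked before it is applied. This is only a sketch: `has_encryption_keys` is a hypothetical helper, and it does a plain text match for uncommented `encrypted:` or `encryptionPassphrase:` keys rather than parsing the YAML.

```shell
# Hypothetical helper: reads a manifest on stdin and succeeds when it still
# contains an uncommented "encrypted:" or "encryptionPassphrase:" key.
has_encryption_keys() {
    grep -Eq '^[[:space:]]*(encrypted|encryptionPassphrase)[[:space:]]*:'
}

# Example usage (file names are placeholders):
#   has_encryption_keys < storageclass.yaml && echo "remove or comment out the encrypted key"
#   has_encryption_keys < secret.yaml && echo "remove or comment out encryptionPassphrase"
```

A commented-out line such as `# encrypted: "false"` does not match, which is consistent with the workaround described above.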

I did look at using Rook originally but eventually decided to deploy ceph as per ceph documentation. Will try Rook another time.

Madhu-1 commented 1 month ago

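@pankaj-mandal there are 2 types of encryption:

  • Server-side encryption, where you enable encryption on the Ceph cluster and update CSI to use the specific encryption mode (secure or crc) to connect to the Ceph cluster and do all operations over the secure port 3300.
  • The second option is PV encryption, where cephcsi encrypts all the CephFS (still in alpha state and not much tested) and RBD PVCs created.

You need to decide on what exact encryption you are looking for.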

pankaj-mandal commented 1 month ago

> @pankaj-mandal there are 2 types of encryption
>
>   • Server-side encryption, where you enable encryption on the Ceph cluster and update CSI to use the specific encryption mode (secure or crc) to connect to the Ceph cluster and do all operations over the secure port 3300.
>   • The second option is PV encryption, where cephcsi encrypts all the CephFS (still in alpha state and not much tested) and RBD PVCs created.
>
> You need to decide on what exact encryption you are looking for

This is what I did

ceph-volume lvm prepare --data /dev/sdb --dmcrypt
ceph-volume lvm activate 0 <osd-uuid>

I repeated the above for the different values of --data and <osd-uuid>, and that has enabled encryption for the object stores in my Ceph cluster. On the client side, I removed the encryptionPassphrase entry from the secret and the encrypted entry from the StorageClass. It seems to work, although I would like to have a way to see the encrypted files on the disk.
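
One rough way to confirm that the OSD devices are now encrypted is to look for a LUKS header in the `lsblk` output. The sketch below assumes that `ceph-volume ... --dmcrypt` sets up LUKS on the logical volume, so that `lsblk` reports `crypto_LUKS` as its FSTYPE; `luks_devices` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `lsblk -rn -o NAME,FSTYPE` output on stdin and
# prints the names of the devices whose filesystem type is crypto_LUKS.
luks_devices() {
    awk '$2 == "crypto_LUKS" { print $1 }'
}

# Example usage on an OSD host:
#   lsblk -rn -o NAME,FSTYPE | luks_devices
```

If the assumption holds, each prepared OSD logical volume should be listed, and an empty result would suggest the volumes are not encrypted. This shows only that the block device is encrypted, not the contents of individual files.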

nixpanic commented 1 month ago

The way you have set up encryption is on the OSD side, where the Ceph cluster stores the objects for the files and RBD images. By inspecting the contents of the LogicalVolume, you have access to the unencrypted objects. It is just not trivial to select and combine the objects that represent a single file. The format is Ceph-specific and not meant for humans to interact with.

github-actions[bot] commented 3 days ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.