Open cjorge-graphops opened 2 months ago
@cjorge-graphops Just trying to understand here from the CSIDriver point of view, the changes needed are just about adding seLinuxMount : true
here https://github.com/openebs/zfs-localpv/blob/develop/deploy/helm/charts/templates/csidriver.yaml ?
Also, this seems to be in beta, and needs to be enabled using feature gates right?
What is the impact when run on non selinux enabled os?
@Abhinandan-Purkait my understanding of it is
based on this:
seLinuxMount
This field is alpha in Kubernetes 1.25. It must be explicitly enabled by setting feature gates
ReadWriteOncePod and SELinuxMountReadWriteOncePod.
The default value of this field is false.
When set to true, corresponding CSI driver announces that all its volumes are independent
volumes from Linux kernel point of view and each of them can be mounted with a different SELinux
label mount option (-o context=<SELinux label>).
Examples:
A CSI driver that creates block devices formatted with a filesystem, such as xfs or ext4, can set
seLinuxMount: true, because each volume has its own block device.
A CSI driver whose volumes are always separate exports on a NFS server can set seLinuxMount: true,
because each volume has its own NFS export and thus Linux kernel treats them as independent volumes.
A CSI driver that can provide two volumes as subdirectories of a common NFS export must set
seLinuxMount: false, because these two volumes are treated as a single volume by Linux kernel and
must share the same -o context=<SELinux label> option.
Mounting with SELinux context
When all aforementioned conditions are met, kubelet will pass -o context=<SELinux label> mount option
to the volume plugin or CSI driver. CSI driver vendors must ensure that this mount option is supported by
their CSI driver and, if necessary, the CSI driver appends other mount options that are needed for
-o context to work.
For example, NFS may need -o context=<SELinux label>,nosharecache, so each volume mounted from the
same NFS server can have a different SELinux label value. Similarly, CIFS may need
-o context=<SELinux label>,nosharesock.
It's up to the CSI driver vendor to test their CSI driver in a SELinux enabled environment before setting
seLinuxMount: true in the CSIDriver instance.
The fact that kubernetes will only attempt to use this when it detects the OS does have SeLinux support, and ZFS supports SeLinux, I think so. It would come down to advertising it and handling it ok upon having some extra mount options for the context passed/handled (the -o context)
But my knowledge of these interfaces is quite non-existant, this is mostly guess-work and an educated guess at this point, so take it with a grain of salt.
Also, this seems to be in beta, and needs to be enabled using feature gates right?
Yes, that link I posted specifies all the conditions that need to come together for kubernetes to even pass this context to the CSIDriver. For instance, besides the two feature gates that are still beta, the POD also needs to at least specify selinux.level
. Don't think anyone is going to run into this unintentionally at this point.
What is the impact when run on non selinux enabled os?
The CSI driver spec mentions that kubernetes does check if the OS supports SeLinux, and won't even attempt to pass -o context if that requirement is not satisfied, so I don't think that burden falls on zfs-localpv
@cjorge-graphops Are you interested in contributing to this? Let's keep it open to see views from community.
@Abhinandan-Purkait sorry for the late reply. I most definitely need to eventually address this for my own use-cases so the answer is yes, as I said do have workloads that use millions of small files on SeLinux enabled systems... and having PODs take upwards of 30m to restart is not a viable status-quo. I'm interested in contributing at least after I've validated, and I'm sure to get to it at some point, but may take me some time however as it is not the highest of priorities for me. Thank you!
@cjorge-graphops Thanks, I guess to start with you can at least add a design document with your validation, and then it can be worked upon by anyone interested.
@cjorge-graphops any updates on your end?
Describe the problem/challenge you have For PVCs with millions of small files, relabeling them takes an enormous amount of time
Describe the solution you'd like Kubernetes tries to address this as described here: https://kubernetes.io/blog/2023/04/18/kubernetes-1-27-efficient-selinux-relabeling-beta/ but zfs-localpv would need to support mounting with SeLinux mount options, and the CSI driver would need to advertise that