Closed lwj5 closed 1 year ago
@lwj5 Can you try adding security context to the deployment pod? https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#configure-volume-permission-and-ownership-change-policy-for-pods
This is the deployment used for the test. Let me know what you would like changed.
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    ...
  name: ceph
  namespace: test
spec:
  replicas: 2
  ...
  template:
    spec:
      affinity: {}
      containers:
      - command:
        - sleep
        - "10000"
        image: debian:bullseye-slim
        imagePullPolicy: IfNotPresent
        name: container
        securityContext:
          allowPrivilegeEscalation: false
          capabilities: {}
          privileged: false
          readOnlyRootFilesystem: false
          runAsUser: 1001
        volumeMounts:
        - mountPath: /ceph
          name: vol-igx3b
      ...
      securityContext:
        fsGroup: 1001
      volumes:
      - name: vol-igx3b
        persistentVolumeClaim:
          claimName: cephfs
@humblec Can you please take a look at this?
We're seeing identical symptoms to the description above. In our case, it looks like SELinux category changes are responsible for the permission denial.
A ReadWriteMany cephfs volume on the first pod looks fine:
cms-jovyan@example:~$ ls -alZ /mnt
total 0
drwxrwxrwx. 10 root root system_u:object_r:container_file_t:s0:c31,c859 8 Dec 7 10:57 .
dr-xr-xr-x. 1 root root system_u:object_r:container_file_t:s0:c31,c859 50 Dec 7 18:09 ..
drwxr-xr-x. 3 cms-jovyan 11265 system_u:object_r:container_file_t:s0:c31,c859 3 Dec 6 18:41 densenet_onnx
drwxr-xr-x. 2 cms-jovyan 11265 system_u:object_r:container_file_t:s0:c31,c859 2 Dec 7 17:09 inception_graphdef
After a second pod mounts the volume, the first pod loses access:
cms-jovyan@example:~$ ls -alZ /mnt
ls: cannot open directory '/mnt': Permission denied
After disabling the dontaudit rules (semodule -DB), we see a denial in the logs.
type=AVC msg=audit(1670436948.985:78042): avc: denied { read } for pid=1851459 comm="ls" name="/" dev="ceph" ino=1099511627786 scontext=system_u:system_r:container_t:s0:c31,c859 tcontext=system_u:object_r:container_file_t:s0:c136,c663 tclass=dir permissive=0
After running setenforce 0 on the host system, we can again access the mount. Setting container_use_cephfs did not help.
cms-jovyan@example:~$ ls -alZ /mnt
total 0
drwxrwxrwx. 10 root root system_u:object_r:container_file_t:s0:c136,c663 8 Dec 7 10:57 .
dr-xr-xr-x. 1 root root system_u:object_r:container_file_t:s0:c31,c859 50 Dec 7 18:09 ..
drwxr-xr-x. 3 cms-jovyan 11265 system_u:object_r:container_file_t:s0:c136,c663 3 Dec 6 18:41 densenet_onnx
drwxr-xr-x. 2 cms-jovyan 11265 system_u:object_r:container_file_t:s0:c136,c663 2 Dec 7 17:09 inception_graphdef
I suspect that when the second pod starts up, it triggers a relabeling that changes the SELinux categories on the volume contents, resulting in a denial for the first pod.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.
Not stale
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.
still present
@lwj5 this is a problem with SELinux relabelling on the cephfs mounts. When you create the second pod, the cephfs mountpoint gets relabelled with the second pod's SELinux labels, and the first pod starts getting permission denied errors. You must ensure that all application pods using the same cephfs PVC use the same SELinux labels.
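For illustration, a minimal sketch of what pinning the label could look like in the deployment from this thread. It assumes the pod-level securityContext.seLinuxOptions field is honored by the container runtime on these nodes. The level value is only an example copied from the first pod's listing above, not a recommendation.

```yaml
# Fragment of a Deployment; only the fields relevant to SELinux are shown.
spec:
  template:
    spec:
      securityContext:
        fsGroup: 1001
        seLinuxOptions:
          # Example value (assumption): every workload that mounts the same
          # cephfs PVC would need to carry this identical level.
          level: "s0:c31,c859"
```

With the same level set on every pod that mounts the PVC, a relabel triggered by a later pod should leave the categories unchanged, so earlier pods would not be expected to lose access.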
Thanks @Madhu-1 for your solution and @jthiltges for the diagnosis. This means that a static SELinux level must be set for every deployment.
I wish there were a better way. Nonetheless, thanks for the input.
Thank you all, and I appreciate the info, though this is disappointing news. Having to set pods to the same category weakens the security benefits of SELinux.
It would be less surprising if a ReadWriteMany mode could be treated like a Docker volume with :z, resulting in a shared content label.
Describe the bug
When creating a deployment with a ReadWriteMany CephFS PVC, only the last created pod has access to the mounted folder. The earlier pods show Permission denied when running cd into that directory. If the deployment is set to privileged, this does not happen. I've looked at https://github.com/ceph/ceph-csi/issues/1097, but there is no denial in SELinux, and I've also set container_use_cephfs=1 to test.
Environment details
Mounter (fuse or kernel for cephfs; krbd or rbd-nbd for rbd): kernel
Steps to reproduce
Steps to reproduce the behavior:
Actual results
Permission denied on any earlier pods.
Expected behavior
All pods can access the volume
Logs
If the issue is in PVC mounting please attach complete logs of below containers.
csi-cephfsplugin logs for node of 1st replica
csi-cephfsplugin logs for node of 2nd replica
Additional context
PVC in question is pvc-74206ef3-4bcd-40e6-af26-40ec2b994a23
ID of first pod: c3f729fe-2c56-41bf-aa53-b70e5ee17f10
ID of second pod: 68b1501c-e845-4f21-9abd-fb587d1da658
See https://github.com/ceph/ceph-csi/issues/1097