Open ryanm101 opened 11 months ago
Hi @ryanm101
I found a somewhat similar error here: https://github.com/intel/intel-technology-enabling-for-openshift/issues/113. That issue lists a couple of workarounds that might work. Could you try them out?
I reproduced the issue on a VM. The device plugin seems to work without SELinux but fails with SELinux. The SELinux audit log contains this entry:
type=AVC msg=audit(1702889339.432:3913): avc: denied { connectto } for pid=16332 comm="intel_gpu_devic" path="/var/lib/kubelet/device-plugins/kubelet.sock" scontext=system_u:system_r:container_device_plugin_t:s0:c620,c968 tcontext=system_u:system_r:container_runtime_t:s0 tclass=unix_stream_socket permissive=0
I'll need to check whether this is the same as, or similar to, the issue linked above.
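To make the denial easier to compare against the linked issue, here is a minimal sketch that pulls the relevant fields out of the audit entry quoted above. The helper name avc_field is hypothetical, and the printed rule is only an approximation of what a tool such as audit2allow would derive from the same entry, not its exact output.

```shell
#!/bin/sh
# The denial line from the audit log above, stored verbatim for illustration.
avc_line='type=AVC msg=audit(1702889339.432:3913): avc: denied { connectto } for pid=16332 comm="intel_gpu_devic" path="/var/lib/kubelet/device-plugins/kubelet.sock" scontext=system_u:system_r:container_device_plugin_t:s0:c620,c968 tcontext=system_u:system_r:container_runtime_t:s0 tclass=unix_stream_socket permissive=0'

# avc_field (hypothetical helper): print the value of one key=value field.
avc_field() {
  printf '%s\n' "$avc_line" | tr ' ' '\n' | sed -n "s/^$1=//p"
}

# The SELinux type is the third colon-separated part of a security context.
source_type=$(avc_field scontext | cut -d: -f3)
target_type=$(avc_field tcontext | cut -d: -f3)
target_class=$(avc_field tclass)

# The denied permission sits between the braces.
permission=$(printf '%s\n' "$avc_line" | sed -n 's/.*{ \([a-z_]*\) }.*/\1/p')

# Print the denial in the shape of a policy allow rule.
echo "allow $source_type $target_type:$target_class $permission;"
```

Read this way, the entry says that the plugin's domain (container_device_plugin_t) was denied connectto on a unix_stream_socket owned by container_runtime_t, which is the kubelet socket at the path shown in the log.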
EDIT: using setenforce 0 is a workaround, though not a viable one if SELinux is required.
setenforce 0 corrects it, but NUC 1 & 3 are both enforcing and working fine.
I followed instructions from the audit entry:
sudo ausearch -c 'intel_gpu_devic' --raw | audit2allow -M intelgpudevice
sudo semodule -X 300 -i intelgpudevice.pp
That seems to allow the device plugin to access the kubelet. I'm not sure where we should file a bug: FC, k3s or somewhere else.
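As a follow-up check, here is a small sketch for confirming that the generated module actually ended up in the loaded policy. The helper module_listed is hypothetical; it only filters a module listing such as the one printed by semodule -l, and it assumes the module name is the first whitespace-separated field on each line.

```shell
#!/bin/sh
# module_listed (hypothetical helper): succeed if the named module appears
# in a module listing read from stdin. Only the first field of each line is
# compared, so extra columns in the listing are ignored.
module_listed() {
  awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Intended use on the affected node (needs root, so left as comments here):
#   sudo semodule -l | module_listed intelgpudevice && echo "module loaded"
#
# audit2allow -M also writes intelgpudevice.te next to the .pp file; reading
# it shows which rules the module grants before it is installed:
#   cat intelgpudevice.te
#
# To back the change out again, remove the module at the same priority:
#   sudo semodule -X 300 -r intelgpudevice
```

Checking the .te file first is worthwhile because audit2allow turns every matching denial into an allow rule, which can be broader than the single kubelet socket access discussed here.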
The plugins already run with the proper label to have access to the kubelet. That policy went into the container-selinux package. Is that package installed on your node?
Those get installed alongside k3s, and they are installed.
I followed instructions from the audit entry:
sudo ausearch -c 'intel_gpu_devic' --raw | audit2allow -M intelgpudevice
sudo semodule -X 300 -i intelgpudevice.pp
That seems to allow the device plugin to access the kubelet. I'm not sure where we should file a bug: FC, k3s or somewhere else.
Yes, this seems to solve it.
@mregmi do you happen to know the container-selinux version?
@tkatila Was this SELinux issue already handled?
Running 3 master nodes using k3s. NUC 1 & 3 both deploy fine. On NUC 2 the container crashes with
command used to provision NUC2:
The only differences between NUC2 and NUC1/3 are:
Any advice appreciated. I will test re-adding the node without the --selinux flag and, if all else fails, change it to FC38.