Open siavashs opened 6 years ago
It looks like the SELinux category options are confusing the mount-related logic. Can you please try with the latest rkt (1.30) to check whether that logic has changed in the meantime?
I'm not exactly an SELinux expert, but I think in this case it would be enough to stop the mount context at "level" granularity and just drop the categories.
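To illustrate what "stopping at level granularity" could mean, here is a minimal sketch that strips the category part from an SELinux context string and keeps only the sensitivity. The function name drop_categories is hypothetical and this is not rkt's actual code; it only assumes the usual user:role:type:level layout, where the level may be a low-high range.

```python
def drop_categories(context: str) -> str:
    """Return the context with MCS/MLS categories removed from its level.

    Hypothetical helper for illustration only; not taken from rkt.
    """
    # Split into user, role, type, and the remaining level field.
    parts = context.split(":", 3)
    if len(parts) < 4:
        # No level field at all (non-MLS context): nothing to drop.
        return context
    user, role, setype, level = parts
    # A level may be a range such as "s0-s0:c0.c1023"; categories use
    # "." and "," as separators, so splitting the range on "-" is safe.
    sensitivities = [bound.split(":", 1)[0] for bound in level.split("-")]
    return ":".join([user, role, setype, "-".join(sensitivities)])


if __name__ == "__main__":
    print(drop_categories("system_u:system_r:container_t:s0:c374,c875"))
```

With the context from the logs in this thread, the categories c374,c875 are removed and the rest of the context is left unchanged.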
I installed 1.30 from GitHub, and now rkt hangs here:
[!!!!!!] Failed to allocate manager object, freezing.
I tried again with SELinux set to permissive mode and captured these:
type=AVC msg=audit(1525722673.810:465): avc: denied { write } for pid=6446 comm="systemd" name="machine-rkt\x2d93cbc2ee\x2df550\x2d41a3\x2dba29\x2dba1076046f3b.scope" dev="cgroup2" ino=2792 scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:cgroup_t:s0 tclass=dir permissive=1
type=AVC msg=audit(1525722673.810:466): avc: denied { add_name } for pid=6446 comm="systemd" name="init.scope" scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:cgroup_t:s0 tclass=dir permissive=1
type=AVC msg=audit(1525722673.810:467): avc: denied { create } for pid=6446 comm="systemd" name="init.scope" scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:cgroup_t:s0 tclass=dir permissive=1
type=AVC msg=audit(1525722673.811:468): avc: denied { write } for pid=6446 comm="systemd" name="cgroup.procs" dev="cgroup2" ino=2805 scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:cgroup_t:s0 tclass=file permissive=1
type=AVC msg=audit(1525722673.818:469): avc: denied { setattr } for pid=6446 comm="systemd" name="system.slice" dev="cgroup2" ino=2814 scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:cgroup_t:s0 tclass=dir permissive=1
type=AVC msg=audit(1525722673.834:470): avc: denied { remove_name } for pid=6446 comm="systemd" name="sysusers.service" dev="cgroup2" ino=2836 scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:cgroup_t:s0 tclass=dir permissive=1
type=AVC msg=audit(1525722673.834:471): avc: denied { rmdir } for pid=6446 comm="systemd" name="sysusers.service" dev="cgroup2" ino=2836 scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:cgroup_t:s0 tclass=dir permissive=1
type=AVC msg=audit(1525722673.859:472): avc: denied { mounton } for pid=6459 comm="(etcd)" path="/opt/stage2/etcd/rootfs/proc/kallsyms" dev="proc" ino=4026532081 scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:system_map_t:s0 tclass=file permissive=1
type=AVC msg=audit(1525722673.860:473): avc: denied { mounton } for pid=6459 comm="(etcd)" path="/opt/stage2/etcd/rootfs/proc/kcore" dev="proc" ino=4026532047 scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:proc_kcore_t:s0 tclass=file permissive=1
type=AVC msg=audit(1525722673.861:474): avc: denied { mounton } for pid=6459 comm="(etcd)" path="/opt/stage2/etcd/rootfs/proc/sched_debug" dev="proc" ino=4026532077 scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:proc_t:s0 tclass=file permissive=1
type=AVC msg=audit(1525722673.862:475): avc: denied { mounton } for pid=6459 comm="(etcd)" path="/opt/stage2/etcd/rootfs/proc/sys/kernel/core_pattern" dev="proc" ino=88718 scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:usermodehelper_t:s0 tclass=file permissive=1
type=AVC msg=audit(1525722673.862:476): avc: denied { mounton } for pid=6459 comm="(etcd)" path="/opt/stage2/etcd/rootfs/proc/sys/vm/panic_on_oom" dev="proc" ino=91623 scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:sysctl_vm_t:s0 tclass=file permissive=1
type=AVC msg=audit(1525722673.868:477): avc: denied { remount } for pid=6459 comm="(etcd)" scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:proc_t:s0 tclass=filesystem permissive=1
type=AVC msg=audit(1525722673.885:478): avc: denied { sendto } for pid=6446 comm="systemd" path="/systemd/nspawn/notify" scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=unix_dgram_socket permissive=1
type=AVC msg=audit(1525722673.886:479): avc: denied { remount } for pid=6459 comm="(etcd)" scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:sysfs_t:s0 tclass=filesystem permissive=1
Yes, there are multiple issues with SELinux and the rkt stage1 setup, so it has pretty much never worked in enforcing mode: https://github.com/rkt/rkt/labels/technology%2Fselinux. However, if it fails in permissive mode, then it is likely a bug that will have to be fixed.
Did you eventually manage to make rkt-1.30 run in permissive mode?
The pod starts successfully in permissive mode.
Just some observations. Based on the logs above, this instance of systemd (the one from rkt I guess) and etcd are running with a container_t context. That comes from Fedora's SELinux policy. I guess it is a default of some kind for containers started via systemd-nspawn, as rkt does. That systemd in the log tries to write to machine-rkt\x2d93cbc2ee\x2df550\x2d41a3\x2dba29\x2dba1076046f3b.scope, but it can't because that has a context of cgroup_t, and the Fedora SELinux policy does not allow container_t's to write to cgroup_t's. How does machine-XXX.scope get that cgroup_t label? Is it from rkt, or a default from the Fedora SELinux policy? Is the answer to add rkt support to Fedora's SELinux policy?
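For anyone who wants to see at a glance which source type was denied which permission on which target type, here is a rough sketch that summarizes AVC lines like the ones captured earlier in this thread. The names AVC_RE, selinux_type and summarize_denials are made up for this example, and the regex only assumes the field layout visible in those log lines; it is not a general audit log parser.

```python
import re
from collections import Counter

# Matches the fields visible in the AVC lines from this thread.
AVC_RE = re.compile(
    r"avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}.*?"
    r"scontext=(?P<scontext>\S+)\s+tcontext=(?P<tcontext>\S+)\s+tclass=(?P<tclass>\S+)"
)


def selinux_type(context):
    """Return the type field (third component) of user:role:type[:level]."""
    return context.split(":")[2]


def summarize_denials(lines):
    """Count denials per (source type, target type, class, permission)."""
    summary = Counter()
    for line in lines:
        match = AVC_RE.search(line)
        if match is None:
            continue  # not an AVC denial line
        key_base = (
            selinux_type(match.group("scontext")),
            selinux_type(match.group("tcontext")),
            match.group("tclass"),
        )
        # A single record may list several permissions inside the braces.
        for perm in match.group("perms").split():
            summary[key_base + (perm,)] += 1
    return summary


# Two of the lines posted above, copied verbatim.
SAMPLE = [
    r'type=AVC msg=audit(1525722673.810:465): avc: denied { write } for pid=6446 comm="systemd" name="machine-rkt\x2d93cbc2ee\x2df550\x2d41a3\x2dba29\x2dba1076046f3b.scope" dev="cgroup2" ino=2792 scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:cgroup_t:s0 tclass=dir permissive=1',
    r'type=AVC msg=audit(1525722673.859:472): avc: denied { mounton } for pid=6459 comm="(etcd)" path="/opt/stage2/etcd/rootfs/proc/kallsyms" dev="proc" ino=4026532081 scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:system_map_t:s0 tclass=file permissive=1',
]

if __name__ == "__main__":
    for (src, tgt, tclass, perm), count in sorted(summarize_denials(SAMPLE).items()):
        print(f"{src} -> {tgt} ({tclass}): {perm} x{count}")
```

Feeding it the full capture would group the cgroup_t denials from systemd separately from the mounton and remount denials on the /proc and /sys paths, which may help when deciding what a policy module would need to allow.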
Just ftr for all fedora users that encounter this same issue, it's tracked at https://bugzilla.redhat.com/show_bug.cgi?id=1443067.
I wasn't able to find anything else in fedora's bug tracker other than rkt's maintainer trying to pass off ownership (https://bugzilla.redhat.com/show_bug.cgi?id=1532447), so building is probably gonna be the way to go from here on out until fedora's version gets bumped.
So, the way to fix this on fedora is pretty much to download the latest release at https://github.com/rkt/rkt/releases, and install that .rpm.
SELinux is a different issue entirely, and unfortunately I couldn't find other reports related to those developments to track in this thread. If someone else finds more issues to track, please add them so that we can eventually get this issue closed.
(edited to mention the addition of issues for tracking selinux related developments)
Hey y'all, found an interesting project that's being worked on that assists with selinux profiles for containers. If you're one of those that don't want to just disable selinux entirely (like me), you might find it an interesting project to look at: https://github.com/containers/udica
Environment
What did you do?
What did you expect to see? The etcd pod running.
What did you see instead?
Related
dmesg output: