rkt / rkt

[Project ended] rkt is a pod-native container engine for Linux. It is composable, secure, and built on standards.
Apache License 2.0

stage1: wrong selinux context option for mount #3927

Open siavashs opened 6 years ago

siavashs commented 6 years ago

Environment

rkt Version: 1.25.0+gitd069322
appc Version: 0.8.10
Go Version: go1.10rc2
Go OS/Arch: linux/amd64
Features: +TPM +SDJOURNAL
--
Linux 4.16.6-302.fc28.x86_64 x86_64
--
NAME=Fedora
VERSION="28 (Workstation Edition)"
ID=fedora
VERSION_ID=28
PLATFORM_ID="platform:f28"
PRETTY_NAME="Fedora 28 (Workstation Edition)"
ANSI_COLOR="0;34"
CPE_NAME="cpe:/o:fedoraproject:fedora:28"
HOME_URL="https://fedoraproject.org/"
SUPPORT_URL="https://fedoraproject.org/wiki/Communicating_and_getting_help"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=28
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=28
PRIVACY_POLICY_URL="https://fedoraproject.org/wiki/Legal:PrivacyPolicy"
VARIANT="Workstation Edition"
VARIANT_ID=workstation
--
systemd 238
+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid

What did you do?

$ sudo dnf install rkt
$ sudo rkt run coreos.com/etcd:v3.1.7

What did you expect to see? The etcd pod running.

What did you see instead?

Failed to mount tmpfs on /var/lib/rkt/pods/run/f0e5abaf-5ac0-4422-834d-908922fbb37f/stage1/rootfs/tmp (MS_NOSUID|MS_NODEV|MS_STRICTATIME "mode=1777,context=system_u:object_r:container_file_t:s0:c230,c638"): Invalid argument

Related dmesg output:

[25953.337093] tmpfs: No value for mount option 'c638'

lucab commented 6 years ago

It looks like the SELinux category options are confusing the mount-related logic. Can you please try with the latest rkt (1.30) to check whether that logic has changed in the meantime?

I'm not exactly an SELinux expert, but I think in this case it would be enough to stop the mount context at "level" granularity and just drop the categories.
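
To make the suggestion above concrete, here is a minimal Go sketch of two ways to keep the comma in "c230,c638" from being read as a mount-option separator. Both helper names are hypothetical and neither is taken from rkt's code. The first trims the context to level granularity, and it assumes a single-level context (no "low-high" range that carries categories on both sides). The second keeps the categories and wraps the value in double quotes, which is the quoting mount(8) describes for context values that contain commas.

```go
package main

import (
	"fmt"
	"strings"
)

// dropCategories trims an SELinux context of the form
// user:role:type:sensitivity:categories down to user:role:type:sensitivity.
// Hypothetical helper; assumes a single-level context, not a full MLS range.
func dropCategories(ctx string) string {
	parts := strings.SplitN(ctx, ":", 5)
	if len(parts) < 5 {
		return ctx // no category field, nothing to drop
	}
	return strings.Join(parts[:4], ":")
}

// quotedContextOption keeps the categories but wraps the value in double
// quotes so the comma is not parsed as an option separator.
// Hypothetical helper, not rkt's implementation.
func quotedContextOption(ctx string) string {
	return `context="` + ctx + `"`
}

func main() {
	ctx := "system_u:object_r:container_file_t:s0:c230,c638"

	fmt.Println("mode=1777,context=" + dropCategories(ctx))
	// prints: mode=1777,context=system_u:object_r:container_file_t:s0

	fmt.Println("mode=1777," + quotedContextOption(ctx))
	// prints: mode=1777,context="system_u:object_r:container_file_t:s0:c230,c638"
}
```

Dropping the categories is the simpler change, but it also gives up the per-pod MCS separation that the categories provide, so quoting is probably the better trade-off where the kernel accepts it.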

siavashs commented 6 years ago

I installed 1.30 from GitHub; now rkt hangs here:

[!!!!!!] Failed to allocate manager object, freezing.

I tried again with SELinux set to permissive mode and captured these denials:

type=AVC msg=audit(1525722673.810:465): avc:  denied  { write } for  pid=6446 comm="systemd" name="machine-rkt\x2d93cbc2ee\x2df550\x2d41a3\x2dba29\x2dba1076046f3b.scope" dev="cgroup2" ino=2792 scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:cgroup_t:s0 tclass=dir permissive=1
type=AVC msg=audit(1525722673.810:466): avc:  denied  { add_name } for  pid=6446 comm="systemd" name="init.scope" scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:cgroup_t:s0 tclass=dir permissive=1
type=AVC msg=audit(1525722673.810:467): avc:  denied  { create } for  pid=6446 comm="systemd" name="init.scope" scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:cgroup_t:s0 tclass=dir permissive=1
type=AVC msg=audit(1525722673.811:468): avc:  denied  { write } for  pid=6446 comm="systemd" name="cgroup.procs" dev="cgroup2" ino=2805 scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:cgroup_t:s0 tclass=file permissive=1
type=AVC msg=audit(1525722673.818:469): avc:  denied  { setattr } for  pid=6446 comm="systemd" name="system.slice" dev="cgroup2" ino=2814 scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:cgroup_t:s0 tclass=dir permissive=1
type=AVC msg=audit(1525722673.834:470): avc:  denied  { remove_name } for  pid=6446 comm="systemd" name="sysusers.service" dev="cgroup2" ino=2836 scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:cgroup_t:s0 tclass=dir permissive=1
type=AVC msg=audit(1525722673.834:471): avc:  denied  { rmdir } for  pid=6446 comm="systemd" name="sysusers.service" dev="cgroup2" ino=2836 scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:cgroup_t:s0 tclass=dir permissive=1
type=AVC msg=audit(1525722673.859:472): avc:  denied  { mounton } for  pid=6459 comm="(etcd)" path="/opt/stage2/etcd/rootfs/proc/kallsyms" dev="proc" ino=4026532081 scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:system_map_t:s0 tclass=file permissive=1
type=AVC msg=audit(1525722673.860:473): avc:  denied  { mounton } for  pid=6459 comm="(etcd)" path="/opt/stage2/etcd/rootfs/proc/kcore" dev="proc" ino=4026532047 scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:proc_kcore_t:s0 tclass=file permissive=1
type=AVC msg=audit(1525722673.861:474): avc:  denied  { mounton } for  pid=6459 comm="(etcd)" path="/opt/stage2/etcd/rootfs/proc/sched_debug" dev="proc" ino=4026532077 scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:proc_t:s0 tclass=file permissive=1
type=AVC msg=audit(1525722673.862:475): avc:  denied  { mounton } for  pid=6459 comm="(etcd)" path="/opt/stage2/etcd/rootfs/proc/sys/kernel/core_pattern" dev="proc" ino=88718 scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:usermodehelper_t:s0 tclass=file permissive=1
type=AVC msg=audit(1525722673.862:476): avc:  denied  { mounton } for  pid=6459 comm="(etcd)" path="/opt/stage2/etcd/rootfs/proc/sys/vm/panic_on_oom" dev="proc" ino=91623 scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:sysctl_vm_t:s0 tclass=file permissive=1
type=AVC msg=audit(1525722673.868:477): avc:  denied  { remount } for  pid=6459 comm="(etcd)" scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:proc_t:s0 tclass=filesystem permissive=1
type=AVC msg=audit(1525722673.885:478): avc:  denied  { sendto } for  pid=6446 comm="systemd" path="/systemd/nspawn/notify" scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=unix_dgram_socket permissive=1
type=AVC msg=audit(1525722673.886:479): avc:  denied  { remount } for  pid=6459 comm="(etcd)" scontext=system_u:system_r:container_t:s0:c374,c875 tcontext=system_u:object_r:sysfs_t:s0 tclass=filesystem permissive=1

lucab commented 6 years ago

Yes, there are multiple issues with selinux and rkt stage1 setup, thus it pretty much never worked in enforcing mode: https://github.com/rkt/rkt/labels/technology%2Fselinux. However, if it fails in permissive mode then it is likely a bug which will have to be fixed.

Did you eventually manage to make rkt-1.30 run in permissive mode?

siavashs commented 6 years ago

The pod starts successfully in permissive mode.

glevand commented 6 years ago

Just some observations. Based on the logs above, this instance of systemd (the one from rkt, I guess) and etcd are running with a container_t context. That comes from Fedora's SELinux policy. I guess it is a default of some kind for containers started via systemd-nspawn, as rkt does.

The systemd in the log tries to write to machine-rkt\x2d93cbc2ee\x2df550\x2d41a3\x2dba29\x2dba1076046f3b.scope, but it can't, because that directory has a context of cgroup_t and the Fedora SELinux policy does not allow container_t processes to write to cgroup_t objects.

How does machine-XXX.scope get that cgroup_t label? Is it from rkt, or a default from the Fedora SELinux policy? Is the answer to add rkt support to Fedora's SELinux policy?

arizvisa commented 5 years ago

Just for the record, for all Fedora users who encounter this same issue: it's tracked at https://bugzilla.redhat.com/show_bug.cgi?id=1443067.

I wasn't able to find anything else in fedora's bug tracker other than rkt's maintainer trying to pass off ownership (https://bugzilla.redhat.com/show_bug.cgi?id=1532447), so building is probably gonna be the way to go from here on out until fedora's version gets bumped.

So, the way to fix this on fedora is pretty much to download the latest release at https://github.com/rkt/rkt/releases, and install that .rpm.

SELinux is a different issue entirely, and unfortunately I couldn't find other reports related to those developments to track in this thread. If someone else finds more issues to track, please add them so that we can eventually get this issue closed.

(edited to mention the addition of issues for tracking selinux related developments)

arizvisa commented 5 years ago

Hey y'all, found an interesting project that's being worked on that assists with selinux profiles for containers. If you're one of those that don't want to just disable selinux entirely (like me), you might find it an interesting project to look at: https://github.com/containers/udica