Closed: vbatts closed this issue 3 years ago
Hi @vbatts. This is a known missing feature right now. It has been discussed recently, but there does indeed seem to be little evidence or trail of that here on GitHub. For a start, I would expect to find an entry in https://github.com/kata-containers/documentation/blob/master/Limitations.md (/me makes note...). There was some discussion when this https://github.com/kata-containers/documentation/issues/222 was being tried.
@xzr - do you remember if/where we got to with the SELinux Kata discussion?
@grahamwhaley I think the consensus was that "someone will implement it when they have a chance" :P
Well, I created https://github.com/kata-containers/documentation/pull/253, but it is very slim. I had a look around the docker docs and a few bits of the code (dockerd, runc), but I am finding it hard to locate anything meaty or definitive on how this is currently handled (architecturally, for instance). Any pointers on what that setting actually enables, and where, are most welcome @vbatts ;-), so we can use those in future 'how and where do we enable this' discussions.
Most of it is ensuring that the filesystem view of the container and the execution of commands are done in the correct SELinux context. @rhatdan can give more pointers. Some host filesystems handle the SELinux context better than others (e.g. btrfs doesn't have "native" support, so it requires a recursive restorecon -R, which causes a copy-up). Some bits will be different since the container execution is inside qemu; thankfully there is precedent for running qemu on an SELinux-enabled host (I don't have links to this off-hand).
The other piece not discussed above is that this is only about executing the container from the host. It does not get into having the guest kernel/system SELinux-enabled. That's an adventure for another day.
Same issue with podman.
Workaround: podman run --security-opt label=disable
Could you give me the AVC messages you are seeing?
@rhatdan Here it is with log level set to info:
# getenforce
Enforcing
# podman run --log-level=info --runtime /usr/bin/kata-runtime -it alpine sh
WARN[0000] Not using native diff for overlay, this may cause degraded performance for building images: kernel has CONFIG_OVERLAY_FS_REDIRECT_DIR enabled
INFO[0001] Found CNI network podman (type=bridge) at /etc/cni/net.d/87-podman-bridge.conflist
INFO[0001] Got pod network &{Name:dazzling_lehmann Namespace:dazzling_lehmann ID:8b2afe344bfc47388be27bdfe2515f784421f8b6dbff6095c2989d05e7231dd6 NetNS:/var/run/netns/cni-bb772265-24b4-6a97-2720-7bc9aa8de714 Networks:[] RuntimeConfig:map[podman:{IP: PortMappings:[] Bandwidth:<nil> IpRanges:[]}]}
INFO[0001] About to add CNI network cni-loopback (type=loopback)
INFO[0001] Got pod network &{Name:dazzling_lehmann Namespace:dazzling_lehmann ID:8b2afe344bfc47388be27bdfe2515f784421f8b6dbff6095c2989d05e7231dd6 NetNS:/var/run/netns/cni-bb772265-24b4-6a97-2720-7bc9aa8de714 Networks:[] RuntimeConfig:map[podman:{IP: PortMappings:[] Bandwidth:<nil> IpRanges:[]}]}
INFO[0001] About to add CNI network podman (type=bridge)
INFO[0002] Running conmon under slice machine.slice and unitName libpod-conmon-8b2afe344bfc47388be27bdfe2515f784421f8b6dbff6095c2989d05e7231dd6.scope
INFO[0007] Got pod network &{Name:dazzling_lehmann Namespace:dazzling_lehmann ID:8b2afe344bfc47388be27bdfe2515f784421f8b6dbff6095c2989d05e7231dd6 NetNS:/var/run/netns/cni-bb772265-24b4-6a97-2720-7bc9aa8de714 Networks:[] RuntimeConfig:map[podman:{IP: PortMappings:[] Bandwidth:<nil> IpRanges:[]}]}
INFO[0007] About to del CNI network podman (type=bridge)
Error: rpc error: code = Unknown desc = selinux label is specified in config, but selinux is disabled or not supported: OCI runtime error
For now, I have interpreted this as "not supported"
@rhatdan Here is what I can find in audit.log:
type=ANOM_PROMISCUOUS msg=audit(1575020565.541:2970): dev=vethd7d77f80 prom=256 old_prom=0 auid=0 uid=0 gid=0 ses=10 AUID="root" UID="root" GID="root"
type=NETFILTER_CFG msg=audit(1575020565.561:2971): table=nat family=2 entries=138
type=NETFILTER_CFG msg=audit(1575020565.563:2972): table=nat family=2 entries=140
type=NETFILTER_CFG msg=audit(1575020565.566:2973): table=nat family=2 entries=141
type=NETFILTER_CFG msg=audit(1575020565.569:2974): table=nat family=2 entries=142
type=NETFILTER_CFG msg=audit(1575020565.590:2975): table=filter family=2 entries=255
type=NETFILTER_CFG msg=audit(1575020565.593:2976): table=filter family=2 entries=256
type=NETFILTER_CFG msg=audit(1575020568.818:2977): table=filter family=2 entries=257
type=NETFILTER_CFG msg=audit(1575020568.819:2978): table=filter family=2 entries=256
type=NETFILTER_CFG msg=audit(1575020568.827:2979): table=nat family=2 entries=143
type=NETFILTER_CFG msg=audit(1575020568.829:2980): table=nat family=2 entries=145
type=NETFILTER_CFG msg=audit(1575020568.830:2981): table=nat family=2 entries=143
type=NETFILTER_CFG msg=audit(1575020568.833:2982): table=nat family=2 entries=145
type=NETFILTER_CFG msg=audit(1575020568.834:2983): table=nat family=10 entries=133
type=NETFILTER_CFG msg=audit(1575020568.837:2984): table=nat family=10 entries=135
type=NETFILTER_CFG msg=audit(1575020568.838:2985): table=nat family=10 entries=133
type=NETFILTER_CFG msg=audit(1575020568.840:2986): table=nat family=10 entries=135
type=ANOM_PROMISCUOUS msg=audit(1575020568.854:2987): dev=vethd7d77f80 prom=0 old_prom=256 auid=0 uid=0 gid=0 ses=10 AUID="root" UID="root" GID="root"
type=NETFILTER_CFG msg=audit(1575020568.865:2988): table=nat family=2 entries=143
type=NETFILTER_CFG msg=audit(1575020568.873:2989): table=nat family=2 entries=142
type=NETFILTER_CFG msg=audit(1575020568.875:2990): table=nat family=2 entries=140
Looks like kata has to be rebuilt with SELinux support. Or at least to ignore the label when built without SELinux support.
@rhatdan - as noted earlier in the thread, the kata limitations document says kata does not currently support the selinux option. It's not quite as simple as rebuilding Kata to turn it on or ignore it - it's not coded up in kata....
And then, would you want kata to silently ignore an selinux option if it was passed in? I'm not sure. I'd not call that the 'path of least surprise'.
Now, if somebody wants to undertake coding up selinux support in kata - that'd be great, and I'm sure there are folks who would help discuss and review :-)
I would just warn in Kata that this is not currently supported. There is no mechanism for Podman to know, and forcing users to understand the difference is difficult, and perhaps impossible. Currently if you tell --security-opt label=disable, does kata work?
@rhatdan
Currently if you tell --security-opt label=disable, does kata work?
Yes, it does. See also related Bugzilla
Ok, then I guess podman does not send down a label in that case.
The difficult thing is that this is hard for users to understand, i.e. some containers run fine with SELinux enabled, but kata fails.
One question I would have is: what is the label of the processes running the VM?
If it is running qemu, what label does ps -eZ | grep qemu show?
We really should get this labeled correctly, so we could take advantage of SELinux separation on VMs.
@rhatdan,
Providing you the info asked in December:
# ps -eZ | grep qemu
unconfined_u:system_r:container_runtime_t:s0 5892 ? 00:00:03 qemu-kvm
So this means that podman executed qemu-kvm directly. What AVCs are you seeing when you run this? We could transition qemu-kvm to a better domain, like svirt_t.
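For anyone repeating the check above on their own host, here is a minimal sketch of how the ps -eZ line can be pulled apart to see which domain qemu-kvm ends up in. The helper names (SELinuxContext, parse_context, parse_ps_z_line) are made up for illustration and are not part of podman, kata or libselinux. The only assumption is the usual user:role:type:level layout of a context, where the level itself may contain further colons.

```python
from typing import NamedTuple


class SELinuxContext(NamedTuple):
    """The four parts of an SELinux context (hypothetical helper type)."""
    user: str
    role: str
    type: str   # for a process this is the domain, e.g. container_runtime_t
    level: str  # MLS/MCS part, e.g. "s0" or "s0:c1,c2"; may be empty


def parse_context(context: str) -> SELinuxContext:
    # Split at most 3 times: the level can itself contain ':' (s0:c1,c2).
    parts = context.split(":", 3)
    if len(parts) < 3:
        raise ValueError(f"not an SELinux context: {context!r}")
    level = parts[3] if len(parts) == 4 else ""
    return SELinuxContext(parts[0], parts[1], parts[2], level)


def parse_ps_z_line(line: str):
    """Split one `ps -eZ` line (LABEL PID TTY TIME CMD) into its useful parts."""
    fields = line.split()
    return parse_context(fields[0]), int(fields[1]), fields[-1]


if __name__ == "__main__":
    line = "unconfined_u:system_r:container_runtime_t:s0 5892 ? 00:00:03 qemu-kvm"
    ctx, pid, cmd = parse_ps_z_line(line)
    print(f"{cmd} (pid {pid}) runs in domain {ctx.type} at level {ctx.level}")
```

With the line posted above, the domain comes out as container_runtime_t, which is what suggests qemu-kvm inherited the label of the container runtime rather than transitioning to a VM-specific domain.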
@rhatdan, I don't see any particular AVC, but basically the same log as pointed out by @c3d in this comment https://github.com/kata-containers/runtime/issues/784#issuecomment-559726072
Can we get Kata to just warn rather than throw an error? I don't want to put something into Podman to identify which container runtimes support which features. I would figure there are other parts of the OCI that kata ignores, since it does not currently implement this.
If we really want to examine this, what I believe kata should be doing with SELinux is to launch qemu (or whatever process launches the VM) with an SELinux label. In the best scenario it could launch the process as svirt_t:MCS, which it could figure out by calling virtual_domain_context(), and then launching with the MCS label in the OCI Spec. This might not work, though, since the image label might not be correct.
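To make the svirt_t:MCS idea a bit more concrete, here is a rough sketch of what composing such a label could look like. Everything in it is an assumption made for illustration: the function names are hypothetical, the system_u:system_r prefix and the c0..c1023 category range are the values commonly seen with sVirt under the targeted policy, and a real implementation would ask libselinux or the policy for the domain instead of hard-coding it. In the flow described above the MCS part would come from the label in the OCI spec; the sketch picks a fresh pair only to show the shape of the string.

```python
import random

# Assumption: MCS categories run from c0 to c1023, as in the targeted policy.
MCS_CATEGORIES = 1024


def pick_mcs_level(rng: random.Random, in_use: set) -> str:
    """Pick an unused pair of distinct categories, e.g. 's0:c12,c345'.

    `in_use` is updated in place so that two VMs never share a level.
    The loop only fails to terminate if every possible pair is taken.
    """
    while True:
        low, high = sorted(rng.sample(range(MCS_CATEGORIES), 2))
        level = f"s0:c{low},c{high}"
        if level not in in_use:
            in_use.add(level)
            return level


def vm_process_label(level: str, domain: str = "svirt_t") -> str:
    """Build the full process label for the VM process (hypothetical helper)."""
    return f"system_u:system_r:{domain}:{level}"


if __name__ == "__main__":
    rng = random.Random()
    in_use = set()
    for _ in range(2):
        print(vm_process_label(pick_mcs_level(rng, in_use)))
```

Each VM getting its own category pair is what would give the separation between VMs mentioned earlier in the thread; whether the image files carry a matching label is the open question raised above.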
@rhatdan I have opened a PR (https://github.com/kata-containers/runtime/pull/2443) to disable selinux while support for it is added. This should unblock running Kata on systems with selinux enforced.
Just for reference @rhatdan, afaik kata is not missing many OCI features, and afaik it fails on those it does not support. Limitations documented here. I'm not a fan of fairly-silently ignoring a security feature request. I'm more a fan of the 'path of least surprise'.
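The two behaviours being debated here (fail hard versus warn and carry on) can be sketched as follows. This is not how kata-runtime is implemented; the function and exception names are hypothetical. The only things taken from the OCI runtime spec are the two config.json fields that carry SELinux labels, process.selinuxLabel and linux.mountLabel.

```python
class SELinuxNotSupported(Exception):
    """Raised when a config asks for SELinux labels the runtime cannot apply."""


def requested_selinux_labels(oci_config: dict) -> dict:
    """Return the non-empty SELinux labels found in an OCI config dict."""
    labels = {}
    process_label = (oci_config.get("process") or {}).get("selinuxLabel", "")
    mount_label = (oci_config.get("linux") or {}).get("mountLabel", "")
    if process_label:
        labels["process.selinuxLabel"] = process_label
    if mount_label:
        labels["linux.mountLabel"] = mount_label
    return labels


def check_selinux_request(oci_config: dict, strict: bool = True) -> list:
    """Apply one of the two policies discussed in the thread.

    strict=True  -> refuse to run (the 'path of least surprise' position)
    strict=False -> return warnings and let the caller continue
    """
    labels = requested_selinux_labels(oci_config)
    if not labels:
        return []
    message = ("selinux label is specified in config, but not supported: "
               + ", ".join(sorted(labels)))
    if strict:
        raise SELinuxNotSupported(message)
    return [message]


if __name__ == "__main__":
    config = {
        "process": {"selinuxLabel": "system_u:system_r:container_t:s0:c1,c2"},
        "linux": {"mountLabel": "system_u:object_r:container_file_t:s0:c1,c2"},
    }
    for warning in check_selinux_request(config, strict=False):
        print("WARN:", warning)
```

A config produced with --security-opt label=disable would presumably carry no labels at all, so both policies would let it through, which matches the workaround reported earlier.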
Sure, we have to work on getting SELinux implemented. But we are going to need a different strategy to get this going, mainly because the container_t label will only work for Namespace based containers, and will not work for KVM based containers.
I hope to have some more time to try to play with Kata and SELinux.
Description of problem
running docker with selinux enabled (/etc/docker/daemon.json of "selinux-enabled": true,) and on a centos7 host with selinux enabled.
Expected result
a shell
Actual result
Meta details
Running kata-collect-data.sh version 1.3.0-rc1 (commit 22aedc4) at 2018-09-25.04:51:31.258811160-0400.
Runtime is /bin/kata-runtime.
kata-env
Output of "/bin/kata-runtime kata-env":
Runtime config files
Runtime default config files
Runtime config file contents
Config file /etc/kata-containers/configuration.toml not found
Output of "cat "/usr/share/defaults/kata-containers/configuration.toml"":
Image details
Initrd details
No initrd
Logfiles
Runtime logs
No recent runtime problems found in system journal.
Proxy logs
No recent proxy problems found in system journal.
Shim logs
No recent shim problems found in system journal.
Container manager details
Have docker
Docker
Output of "docker version":
Output of "docker info":
Output of "systemctl show docker":
No kubectl
Packages
No dpkg
Have rpm
Output of "rpm -qa|egrep "(cc-oci-runtime|cc-runtime|runv|kata-proxy|kata-runtime|kata-shim|kata-containers-image|linux-container|qemu-)"":