Closed GabyCT closed 3 years ago
Error unpacking rpm package shadow-utils-2:4.6-17.fc31.x86_64
error: unpacking of archive failed on file /usr/bin/newgidmap;5e864da7: cpio: cap_set_file
error: shadow-utils-2:4.6-17.fc31.x86_64: install failed
So, same root cause as https://github.com/kata-containers/tests/issues/2358. Maybe it's better to re-open that one as we already have some debug info there.
As @alicefr mentioned, this may be a guest kernel issue. We need to investigate.
Here are the logs
# Meta details
Running `kata-collect-data.sh` version `1.11.0-alpha1 (commit 705713b4f9bc4d1e754871d5ef1ab5e99ea71aff)` at `2020-04-02.21:18:51.379913370+0000`.
---
Runtime is `/usr/local/bin/kata-runtime`.
# `kata-env`
Output of "`/usr/local/bin/kata-runtime kata-env`":
```toml
[Meta]
Version = "1.0.24"
[Runtime]
Debug = true
Trace = false
DisableGuestSeccomp = true
DisableNewNetNs = false
SandboxCgroupOnly = false
Path = "/usr/local/bin/kata-runtime"
[Runtime.Version]
OCI = "1.0.1-dev"
[Runtime.Version.Version]
Semver = "1.11.0-alpha1"
Major = 1
Minor = 11
Patch = 0
Commit = "705713b4f9bc4d1e754871d5ef1ab5e99ea71aff"
[Runtime.Config]
Path = "/usr/share/defaults/kata-containers/configuration.toml"
[Hypervisor]
MachineType = "pc"
Version = "QEMU emulator version 4.1.1 (kata-static)\nCopyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers"
Path = "/usr/bin/qemu-system-x86_64"
BlockDeviceDriver = "virtio-scsi"
EntropySource = "/dev/urandom"
SharedFS = "virtio-9p"
VirtioFSDaemon = "/usr/bin/virtiofsd"
Msize9p = 8192
MemorySlots = 10
PCIeRootPort = 0
HotplugVFIOOnRootBus = false
Debug = true
UseVSock = false
[Image]
Path = "/usr/share/kata-containers/kata-containers-clearlinux-32740-osbuilder-891b61c-agent-73afd1a.img"
[Kernel]
Path = "/usr/share/kata-containers/vmlinuz-5.4.15-71"
Parameters = "systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket agent.log=debug agent.log=debug"
[Initrd]
Path = ""
[Proxy]
Type = "kataProxy"
Path = "/usr/libexec/kata-containers/kata-proxy"
Debug = true
[Proxy.Version]
Semver = "1.11.0-alpha1-e7d2214f303fe9dfc433f9045659218e75f4d779"
Major = 1
Minor = 11
Patch = 0
Commit = "e7d2214f303fe9dfc433f9045659218e75f4d779"
[Shim]
Type = "kataShim"
Path = "/usr/libexec/kata-containers/kata-shim"
Debug = true
[Shim.Version]
Semver = "1.11.0-alpha1-6a828a430c3d35e6ee22ee50a4fd2ed61280ad42"
Major = 1
Minor = 11
Patch = 0
Commit = "6a828a430c3d35e6ee22ee50a4fd2ed61280ad42"
[Agent]
Type = "kata"
Debug = true
Trace = false
TraceMode = ""
TraceType = ""
[Host]
Kernel = "5.0.0-1035-azure"
Architecture = "amd64"
VMContainerCapable = true
SupportVSocks = true
[Host.Distro]
Name = "Ubuntu"
Version = "18.04"
[Host.CPU]
Vendor = "GenuineIntel"
Model = "Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz"
[Netmon]
Path = "/usr/libexec/kata-containers/kata-netmon"
Debug = true
Enable = false
[Netmon.Version]
Semver = "1.11.0-alpha1"
Major = 1
Minor = 11
Patch = 0
Commit = "<
```
I checked with various versions all the way back to 1.9.2, it's not a recent regression.
This looks like the container is not allowed to install file capabilities. These are valid and should be installable. I would ask if the same issue happens in Podman on these platforms.
This could be an issue with the underlying file system under docker not supporting file caps.
@rhatdan The tests were done with `podman`. What do you mean by "the same issue happens in Podman on these platforms"?
The command above was done with Docker. Just wanted to see if Podman worked better.
What is the underlying file system, and does it support file capabilities?
I just realized that by returning to an older build, I ended up testing with 9p. I am redoing the series of tests with virtiofs each time. Will post here.
Would there be a simple `capsh` test that could allow me to check the capabilities? Asking because the dnf update takes a very long time, my network being currently extremely slow.
@rhatdan, I'm hitting this on Fedora32 using podman + virtiofsd.
fidencio@dahmer ~ $ mount | grep kata
overlay on /run/kata-containers/shared/sandboxes/4d17dfb9815b913e38e6f134137c84de62b8dc9b1d126092756deadf56a8f55d/4d17dfb9815b913e38e6f134137c84de62b8dc9b1d126092756deadf56a8f55d/rootfs type overlay (rw,nodev,relatime,context="system_u:object_r:container_file_t:s0:c601,c652",lowerdir=/var/lib/containers/storage/overlay/l/UVYA5K2GF4LJTEYYCMNE43CTUM,upperdir=/var/lib/containers/storage/overlay/c8f4f7fdcab20e7752498bd31461b017d4b7cc07d986fa232002c6764e63f361/diff,workdir=/var/lib/containers/storage/overlay/c8f4f7fdcab20e7752498bd31461b017d4b7cc07d986fa232002c6764e63f361/work,metacopy=on)
tmpfs on /run/kata-containers/shared/sandboxes/4d17dfb9815b913e38e6f134137c84de62b8dc9b1d126092756deadf56a8f55d/4d17dfb9815b913e38e6f134137c84de62b8dc9b1d126092756deadf56a8f55d-b434a77dad6a3407-secrets type tmpfs (rw,nosuid,nodev,seclabel,mode=755)
tmpfs on /run/kata-containers/shared/sandboxes/4d17dfb9815b913e38e6f134137c84de62b8dc9b1d126092756deadf56a8f55d/4d17dfb9815b913e38e6f134137c84de62b8dc9b1d126092756deadf56a8f55d-c320f8d020e02bc3-resolv.conf type tmpfs (rw,nosuid,nodev,seclabel,mode=755)
tmpfs on /run/kata-containers/shared/sandboxes/4d17dfb9815b913e38e6f134137c84de62b8dc9b1d126092756deadf56a8f55d/4d17dfb9815b913e38e6f134137c84de62b8dc9b1d126092756deadf56a8f55d-94f0ae4cfe5884f5-hosts type tmpfs (rw,nosuid,nodev,seclabel,mode=755)
tmpfs on /run/kata-containers/shared/sandboxes/4d17dfb9815b913e38e6f134137c84de62b8dc9b1d126092756deadf56a8f55d/4d17dfb9815b913e38e6f134137c84de62b8dc9b1d126092756deadf56a8f55d-5307ad7128540020-hostname type tmpfs (rw,nosuid,nodev,seclabel,mode=755)
tmpfs on /run/kata-containers/shared/sandboxes/4d17dfb9815b913e38e6f134137c84de62b8dc9b1d126092756deadf56a8f55d/4d17dfb9815b913e38e6f134137c84de62b8dc9b1d126092756deadf56a8f55d-936e39ad8099de96-.containerenv type tmpfs (rw,nosuid,nodev,seclabel,mode=755)
I'm using EXT4 as the host filesystem (default Fedora).
Switching from `kata-runtime` to `crun` or `runc`, it just works.
Running with -o xattr on virtiofsd has a better chance of making this work - it will then enable the getxattr/setxattr/listxattr fuse calls. I think the 'getcap' command is a simple test. It sounds like we just need to enable xattr; there's then a separate question about whether the selinux context we run in should allow us to set certain attributes and/or whether we should translate the names of those attributes (e.g. to stop a container setting an selinux context on a file)
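For anyone who wants a quicker check than a full `dnf update`, here is a minimal sketch of an xattr probe, assuming Python 3 on Linux is available inside the container. It only tests a `user.*` attribute as a proxy: file capabilities live in `security.capability` and also need the right privileges, so a passing probe shows the xattr calls reach the file system, not that `setcap` will succeed. The function name and the attribute name are made up for illustration.

```python
import errno
import os
import sys
import tempfile


def xattr_supported(directory: str) -> bool:
    """Return True if a user.* xattr can be set and read back in `directory`."""
    # os.setxattr/os.getxattr only exist on Linux builds of Python.
    if not hasattr(os, "setxattr"):
        return False
    # Hypothetical attribute name; any name in the user.* namespace would do.
    name = "user.kata_xattr_probe"
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        os.setxattr(path, name, b"1")
        return os.getxattr(path, name) == b"1"
    except OSError as err:
        # Treat "not supported" and "not permitted" as a negative result.
        if err.errno in (errno.ENOTSUP, errno.ENOSYS, errno.EPERM, errno.EACCES):
            return False
        raise
    finally:
        os.unlink(path)


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    print(f"xattr support in {target}: {xattr_supported(target)}")
```

Running it against a directory on the container rootfs, once with and once without `-o xattr` on virtiofsd, should show whether the setting makes a difference on a given setup.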
I can confirm that adding `virtio_fs_extra_args = ["-o", "xattr"]` makes the problem go away.
I believe that @fidencio saw the same thing.
That is with `kata-runtime-1.10.0-3.fc32.x86_64`. Testing now with 1.11.
I did see the same thing using 1.11.0-alpha1.
Apparently works with `kata-runtime-1.11.0-0.alpha1.fc33.x86_64`. So not a regression.
(Edit: Ah, apparently @fidencio beat me to it)
Hi, the error you're hitting is due to a lack of xattr support in 9p. If you use devicemapper, or virtiofs with xattr support, it works. Please see my comment https://github.com/kata-containers/tests/issues/2358#issuecomment-606565991 . It's not a problem in either docker or podman. It is a limitation in qemu.
The ability to change this kind of file attribute has security implications. In this case, it is useful because you want to set a file capability that allows the corresponding executable to perform a `setgid` operation. It's normal for `dnf` to do that, but allowing it in general is neither necessary nor harmless.
In theory, for example, this could be used by malicious code to grant special capabilities to a file that, ultimately, resides on the host. This could be used as a vector for an attack on the host.
In addition, I have a gut feeling that the operation being performed is not typical of normal container use. Many containers only need to execute whatever binaries are in their image, not to update or modify them. There are exceptions, of course, like builds, but they are IMO just that, exceptions.
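This means that:
- We should probably not make a change in the default virtiofs options to add `-o xattr`
- We should offer an easy way to add a "semantic" `install` option associated to specific volume mounts, that says this is intended to allow me to install stuff on this volume, and behaves correctly whether the underlying FS is virtio-fs or 9p.
See also #2595 for a similar issue with locking.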
@c3d thanks for explaining your concerns. Trying to extend your idea of a "semantic install", do you think something like the following Kata configuration would be possible?
[hypervisor]
shared_fs_privileged_ops = <true|false>
This is still global, but for users who want to maximize file system security, this should be the way to go.
This would of course be a global config that could then be converted into a docker-runtime option, or a runtime class in the case of k8s.
That said, do we have the list of virtiofsd options that maximize file operation functionality but increase the security risk?
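As a rough sketch of how such a switch could work, the snippet below maps a boolean to extra virtiofsd arguments. `shared_fs_privileged_ops` is only the name proposed above, not an existing Kata option, and `virtiofsd_extra_args` is a hypothetical helper. The only flag it emits is `-o xattr`, the one reported in this thread to make the `dnf` failure go away.

```python
from typing import List


def virtiofsd_extra_args(shared_fs_privileged_ops: bool) -> List[str]:
    """Translate the proposed (hypothetical) option into virtiofsd arguments."""
    if shared_fs_privileged_ops:
        # "-o xattr" is the setting confirmed in this thread to fix the failure.
        return ["-o", "xattr"]
    # Otherwise add nothing and keep whatever virtiofsd does by default.
    return []


if __name__ == "__main__":
    print(virtiofsd_extra_args(True))
    print(virtiofsd_extra_args(False))
```

Any further options that turn out to belong to the same "privileged" group could be appended in the first branch, which is one reason to keep the mapping in a single place.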
The ability to change this kind of file attribute has security implications. In this case, it is useful because you want to set a file capability that allows the corresponding executable to perform a `setgid` operation. It's normal for `dnf` to do that, but allowing it in general is neither necessary nor harmless.
@c3d Thanks for pointing this out. But would this not be true in the case of runc as well?
- We should offer an easy way to add a "semantic" `install` option associated to specific volume mounts,
We would probably need to do this for all container rootfs, since the updates are typically performed there.
@jcvenegas Having a global option seems like a good idea, though I see that more as an addition than a replacement for a per-volume option. I agree with the benefits you listed.
If you consider a workflow where your "build" step requires the privilege but your "test" step should not have it, you cannot do that with a global setting.
@amshinde Still trying to evaluate what is true with runc. There are definitely issues that are introduced by the shared fs.
@jcvenegas Sorry, I forgot to answer the second question. The one option being considered at the moment is `-o xattr`. I have not investigated to list all options that could have a similar effect. @dagrh do you have a list somewhere?
For RHEL 8, I see a failure trying to perform a `dnf -y update` in a Fedora 32 container:
12:45:24 [Serial Test] package manager update test check dnf update
12:45:24 should not fail
12:45:24 /tmp/jenkins/workspace/kata-containers-tests-rhel-8-q35-PR/go/src/github.com/kata-containers/tests/integration/docker/package_manager_test.go:72
12:45:24 Running command '/usr/bin/docker [docker run --cidfile /tmp/cid092564149/kFaclLSQDChOYVSzCMhwV32fOl8SC1 --runtime kata-runtime -td --name kFaclLSQDChOYVSzCMhwV32fOl8SC1 fedora:32 sh]'
12:45:34 [docker run --cidfile /tmp/cid092564149/kFaclLSQDChOYVSzCMhwV32fOl8SC1 --runtime kata-runtime -td --name kFaclLSQDChOYVSzCMhwV32fOl8SC1 fedora:32 sh]
12:45:34 Timeout: 120 seconds
12:45:34 Exit Code: 0
12:45:34 Stdout: 4d5028249d6ace57e161276c780eff41bb76a6a67ac5490144cd93deb0297cb0
12:45:34
12:45:34 Stderr: Unable to find image 'fedora:32' locally
12:45:34 32: Pulling from library/fedora
12:45:34 0169c1449c16: Pulling fs layer
12:45:34 0169c1449c16: Verifying Checksum
12:45:34 0169c1449c16: Download complete
12:45:34 0169c1449c16: Pull complete
12:45:34 Digest: sha256:e69b5a62ce23c673885bddc94e6679c9b2af683059637ceddb9cff458537a326
12:45:34 Status: Downloaded newer image for fedora:32
12:45:34
12:45:34 Running command '/usr/bin/docker [docker exec kFaclLSQDChOYVSzCMhwV32fOl8SC1 dnf -y update]'
12:46:06 command failed error 'exit status 1'
12:46:06 [docker exec kFaclLSQDChOYVSzCMhwV32fOl8SC1 dnf -y update]
12:46:06 Timeout: 900 seconds
12:46:06 Exit Code: 1
12:46:06 Stdout: Fedora 32 openh264 (From Cisco) - x86_64 0.0 B/s | 0 B 00:15
12:46:06 Fedora Modular 32 - x86_64 0.0 B/s | 0 B 00:15
12:46:06
12:46:06 Stderr: Errors during downloading metadata for repository 'fedora-cisco-openh264':
12:46:06 - Curl error (6): Couldn't resolve host name for https://mirrors.fedoraproject.org/metalink?repo=fedora-cisco-openh264-32&arch=x86_64 [Could not resolve host: mirrors.fedoraproject.org]
12:46:06 Error: Failed to download metadata for repo 'fedora-cisco-openh264': Cannot prepare internal mirrorlist: ftruncate() failed: No such file or directory
12:46:06 Errors during downloading metadata for repository 'fedora-modular':
The ability to change this kind of file attribute has security implications. In this case, it is useful because you want to set a file capability that allows the corresponding executable to perform a `setgid` operation. It's normal for `dnf` to do that, but allowing it in general is neither necessary nor harmless.
In theory, for example, this could be used by malicious code to grant special capabilities to a file that, ultimately, resides on the host. This could be used as a vector for an attack on the host.
In addition, I have a gut feeling that the operation being performed is not typical of normal container use. Many containers only need to execute whatever binaries are in their image, not to update or modify them. There are exceptions, of course, like builds, but they are IMO just that, exceptions.
This means that:
- We should probably not make a change in the default virtiofs options to add `-o xattr`
- We should offer an easy way to add a "semantic" `install` option associated to specific volume mounts, that says this is intended to allow me to install stuff on this volume, and behaves correctly whether the underlying FS is virtio-fs or 9p.
See also #2595 for a similar issue with locking.
An unprivileged user exploiting capabilities set up by a guest user is just one example. What about a setuid root binary that guest root can drop? If an unprivileged user can get access to it, that user can become root on the host.
So to me, we need to make sure shared directories are hidden from unprivileged users and that only root has access to them. If we can do that, there is no good reason to use xattrmap, in my opinion.
It might have some performance cost, so it would be a good idea to quantify the cost before using it. I would use xattrmap only when there is a need, and tighten the rest of the code to make sure non-root users can't access the shared directory on the host.
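The "hidden from unprivileged users" condition can be sketched as a simple ownership and mode check. This is only an illustration under assumptions: the helper names are hypothetical, the check looks at one directory and not its parents, and it ignores ACLs and SELinux labels. A path such as the `/run/kata-containers/shared/sandboxes` directory from the mount output earlier in the thread would be a natural thing to pass in.

```python
import os
import stat


def hidden_from_unprivileged(mode: int, uid: int) -> bool:
    """True if the entry is owned by root and grants nothing to group or others."""
    group_other_bits = stat.S_IMODE(mode) & (stat.S_IRWXG | stat.S_IRWXO)
    return uid == 0 and group_other_bits == 0


def check_shared_dir(path: str) -> bool:
    """Apply the check to a directory on the host (hypothetical helper)."""
    st = os.stat(path)
    return stat.S_ISDIR(st.st_mode) and hidden_from_unprivileged(st.st_mode, st.st_uid)
```

A root-owned directory with mode `0700` passes, while `0755` or any non-root owner fails, which matches the intent of keeping guest-created setuid or capability-bearing binaries out of reach of host users.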
This issue is being automatically closed as Kata Containers 1.x has now reached EOL (End of Life). This means it is no longer being maintained.
Important:
All users should switch to the latest Kata Containers 2.x release to ensure they are using a maintained release that contains the latest security fixes, performance improvements and new features.
This decision was discussed by the @kata-containers/architecture-committee and has been announced via the Kata Containers mailing list:
If you believe this issue still applies to Kata Containers 2.x, please open an issue against the Kata Containers 2.x repository, pointing to this one, providing details to allow us to migrate it.
While using kata-runtime, we have seen failures in trying to perform an update in Fedora 31
The same error happens if we choose Fedora 30. This is independent of the host where we run the container, as it is reproducible on Ubuntu, openSUSE, Debian, etc.
I believe this is a kata-runtime-related error, as I ran the same test with runc and no issues were present (the update completed successfully).