kata-containers / runtime

Kata Containers version 1.x runtime (for version 2.x see https://github.com/kata-containers/kata-containers).
https://katacontainers.io/
Apache License 2.0

vc: incorrect handling of mounts to /dev in containers #471

Closed greg-jablonski closed 3 years ago

greg-jablonski commented 6 years ago

Description of problem

When creating a container with a mount whose target is under /dev in the container, creation fails if the target path also exists on the host (at least under some circumstances).

E.g.

Create a pod:

bash-4.2# crictl runp ./example_pod_config.json 
e07d3ee486c3fe62d63fe99f8779923813880e689db2fdf70794200ec8c19cd9

Note the mount to /dev/donkeys in the container definition, example_container_config.json:

bash-4.2# cat example_container_config.json 
{
...<snip>...
        "mounts": [ 
          { 
             "container_path": "/dev/donkeys",
             "host_path": "/var/donkeys"
          }
        ],
...<snip>...
}

Make sure the host-side source path exists.

bash-4.2# mkdir /var/donkeys

Create and run the container and use the mount successfully.

bash-4.2# crictl create e07d3ee486c3fe62d63fe99f8779923813880e689db2fdf70794200ec8c19cd9 example_container_config.json example_pod_config.json 
7338edd3349ab223ec805b9d8f2e6a0c0d904972e7f99653d487c241280d87e4
bash-4.2# crictl start 7338edd3349ab223ec805b9d8f2e6a0c0d904972e7f99653d487c241280d87e4
7338edd3349ab223ec805b9d8f2e6a0c0d904972e7f99653d487c241280d87e4
sh-4.2# crictl exec -i -t 7338edd3349ab223ec805b9d8f2e6a0c0d904972e7f99653d487c241280d87e4 /bin/bash
root@ci-test:/# echo "one" > /dev/donkeys/one
echo "one" > /dev/donkeys/one
root@ci-test:/# exit
exit
exit
bash-4.2# cat /var/donkeys/one 
one

Clean that container up

bash-4.2# crictl stop 7338edd3349ab223ec805b9d8f2e6a0c0d904972e7f99653d487c241280d87e4
7338edd3349ab223ec805b9d8f2e6a0c0d904972e7f99653d487c241280d87e4
bash-4.2# crictl rm 7338edd3349ab223ec805b9d8f2e6a0c0d904972e7f99653d487c241280d87e4
7338edd3349ab223ec805b9d8f2e6a0c0d904972e7f99653d487c241280d87e4

Now create /dev/donkeys on the host and try again.

bash-4.2# mkdir /dev/donkeys
bash-4.2# crictl create e07d3ee486c3fe62d63fe99f8779923813880e689db2fdf70794200ec8c19cd9 example_container_config.json example_pod_config.json 
FATA[0000] Creating container failed: rpc error: code = Unknown desc = container create failed: rpc error: code = Internal desc = Could not run process: container_linux.go:348: starting container process caused "process_linux.go:402: container init caused \"rootfs_linux.go:58: mounting \\\"/var/donkeys\\\" to rootfs \\\"/run/kata-containers/shared/containers/cb67a5b06424a5776b9cb264db64ea98af3b0358f356a1bc70f8fd08dff35371/rootfs\\\" at \\\"/dev/donkeys\\\" caused \\\"stat /var/donkeys: no such file or directory\\\"\""

Expected result

The container with the mount from /var/donkeys on the host to /dev/donkeys in the container should be created successfully regardless of the existence of /dev/donkeys on the host.

Actual result

The container was not created, with this error:

FATA[0000] Creating container failed: rpc error: code = Unknown desc = container create failed: rpc error: code = Internal desc = Could not run process: container_linux.go:348: starting container process caused "process_linux.go:402: container init caused \"rootfs_linux.go:58: mounting \\\"/var/donkeys\\\" to rootfs \\\"/run/kata-containers/shared/containers/cb67a5b06424a5776b9cb264db64ea98af3b0358f356a1bc70f8fd08dff35371/rootfs\\\" at \\\"/dev/donkeys\\\" caused \\\"stat /var/donkeys: no such file or directory\\\"\""

Additional info

I think the runtime (virtcontainers) is failing to rewrite the mount source from the host path to the shared path. It appears to decide whether the source is a device by looking at the destination path inside the container (m.Destination) rather than at the source path on the host (m.Source), roughly here: https://github.com/kata-containers/runtime/blob/master/virtcontainers/container.go#L463
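To illustrate the suspected mix-up, here is a minimal sketch. It is not the virtcontainers code. The `Mount` struct, the `hostFS` map, and the functions `looksLikeHostDevice`, `deviceByDestination`, and `deviceBySource` are all hypothetical names made up for this example, and the "exists on the host and lives under /dev" test is only an assumption that happens to match the symptoms described above.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// Mount is a hypothetical, simplified stand-in for a container mount entry.
type Mount struct {
	Source      string // path on the host
	Destination string // path inside the container
}

// hostFS is a hypothetical stand-in for the host filesystem:
// the set of paths that currently exist on the host.
type hostFS map[string]bool

// looksLikeHostDevice is an assumed check: the path exists on the host
// and lives under /dev.
func looksLikeHostDevice(fs hostFS, path string) bool {
	clean := filepath.Clean(path)
	return fs[clean] && strings.HasPrefix(clean, "/dev/")
}

// deviceByDestination models the suspected behaviour: the decision is
// made from the container-side path, so it changes when an unrelated
// host path of the same name appears.
func deviceByDestination(fs hostFS, m Mount) bool {
	return looksLikeHostDevice(fs, m.Destination)
}

// deviceBySource models the proposed behaviour: only the host-side
// source path is inspected.
func deviceBySource(fs hostFS, m Mount) bool {
	return looksLikeHostDevice(fs, m.Source)
}

func main() {
	m := Mount{Source: "/var/donkeys", Destination: "/dev/donkeys"}

	before := hostFS{"/var/donkeys": true}                      // before "mkdir /dev/donkeys"
	after := hostFS{"/var/donkeys": true, "/dev/donkeys": true} // after "mkdir /dev/donkeys"

	fmt.Println("by destination:", deviceByDestination(before, m), deviceByDestination(after, m))
	fmt.Println("by source:     ", deviceBySource(before, m), deviceBySource(after, m))
}
```

In this sketch, the destination-based check flips once /dev/donkeys exists on the host, while the source-based check treats /var/donkeys as an ordinary directory in both cases. If the real check behaves this way, the source would be treated as a device and never rewritten to the shared path, which would be consistent with the "stat /var/donkeys: no such file or directory" error.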

I think this is an easy fix. For now, I am waiting to hear back from my employer about approval for open-source contributions for work done on the job before opening a PR.


Meta details

Running kata-collect-data.sh version 1.0.0 (commit 329f70463a0a6a5c02da61801987ca4247094424-dirty) at 2018-07-06.01:58:15.373282134+0000.


Runtime is /usr/local/bin/kata-runtime.

kata-env

Output of "/usr/local/bin/kata-runtime kata-env":

[Meta]
  Version = "1.0.12"

[Runtime]
  Debug = false
  [Runtime.Version]
    Semver = "1.0.0"
    Commit = "329f70463a0a6a5c02da61801987ca4247094424-dirty"
    OCI = "1.0.1"
  [Runtime.Config]
    Path = "/etc/kata-containers/configuration.toml"

[Hypervisor]
  MachineType = "pc"
  Version = "<<unknown>>"
  Path = "/usr/local/bin/hack-qemu.sh"
  BlockDeviceDriver = "virtio-scsi"
  Msize9p = 8192
  Debug = false

[Image]
  Path = "/usr/share/clear-containers/kata-containers-1.0.0+git.e7d934d.201806040308.img"

[Kernel]
  Path = "/usr/share/clear-containers/vmlinuz-4.14.14-4.container"
  Parameters = ""

[Initrd]
  Path = ""

[Proxy]
  Type = "kataProxy"
  Version = "kata-proxy version 1.0.0-a69326b63802952b14203ea9c1533d4edb8c1d64"
  Path = "/usr/libexec/kata-containers/kata-proxy"
  Debug = false

[Shim]
  Type = "kataShim"
  Version = "kata-shim version 1.0.0-74cbc1ee7645916a994b767790da4c6116d28270"
  Path = "/usr/libexec/kata-containers/kata-shim"
  Debug = false

[Agent]
  Type = "kata"

[Host]
  Kernel = "4.1.12-124.14.5.el7uek.x86_64"
  Architecture = "amd64"
  VMContainerCapable = true
  [Host.Distro]
    Name = "Oracle Linux Server"
    Version = "7.5"
  [Host.CPU]
    Vendor = "GenuineIntel"
    Model = "Intel(R) Xeon(R) Platinum 8167M CPU @ 2.00GHz"

Runtime config files

Runtime default config files

/etc/kata-containers/configuration.toml
/usr/share/defaults/kata-containers/configuration.toml

Runtime config file contents

Output of "cat "/etc/kata-containers/configuration.toml"":

# XXX: WARNING: this file is auto-generated.
# XXX:
# XXX: Source file: "config/configuration.toml.in"
# XXX: Project:
# XXX:   Name: Intel® Clear Containers
# XXX:   Type: cc

[hypervisor.qemu]
path = "/usr/local/bin/hack-qemu.sh"
kernel = "/usr/share/kata-containers/vmlinuz.container"
image = "/usr/share/kata-containers/kata-containers.img"
machine_type = "pc"

# Optional space-separated list of options to pass to the guest kernel.
# For example, use `kernel_params = "vsyscall=emulate"` if you are having
# trouble running pre-2.15 glibc.
#
# WARNING: - any parameter specified here will take priority over the default
# parameter value of the same name used to start the virtual machine.
# Do not set values here unless you understand the impact of doing so as you
# may stop the virtual machine from booting.
# To see the list of default parameters, enable hypervisor debug, create a
# container and look for 'default-kernel-parameters' log entries.
kernel_params = ""

# Path to the firmware.
# If you want that qemu uses the default firmware leave this option empty
firmware = ""

# Machine accelerators
# comma-separated list of machine accelerators to pass to the hypervisor.
# For example, `machine_accelerators = "nosmm,nosmbus,nosata,nopit,static-prt,nofw"`
machine_accelerators=""

# Default number of vCPUs per POD/VM:
# unspecified or 0                --> will be set to 1
# < 0                             --> will be set to the actual number of physical cores
# > 0 <= number of physical cores --> will be set to the specified number
# > number of physical cores      --> will be set to the actual number of physical cores
default_vcpus = 1

# Bridges can be used to hot plug devices.
# Limitations:
# * Currently only pci bridges are supported
# * Until 30 devices per bridge can be hot plugged.
# * Until 5 PCI bridges can be cold plugged per VM.
#   This limitation could be a bug in qemu or in the kernel
# Default number of bridges per POD/VM:
# unspecified or 0   --> will be set to 1
# > 1 <= 5           --> will be set to the specified number
# > 5                --> will be set to 5
default_bridges = 1

# Default memory size in MiB for POD/VM.
# If unspecified then it will be set 2048 MiB.
#default_memory = 2048

# Disable block device from being used for a container's rootfs.
# In case of a storage driver like devicemapper where a container's 
# root file system is backed by a block device, the block device is passed
# directly to the hypervisor for performance reasons. 
# This flag prevents the block device from being passed to the hypervisor, 
# 9pfs is used instead to pass the rootfs.
disable_block_device_use = false

# Block storage driver to be used for the hypervisor in case the container
# rootfs is backed by a block device. This is either virtio-scsi or 
# virtio-blk.
block_device_driver = "virtio-scsi"

# Enable pre allocation of VM RAM, default false
# Enabling this will result in lower container density
# as all of the memory will be allocated and locked
# This is useful when you want to reserve all the memory
# upfront or in the cases where you want memory latencies
# to be very predictable
# Default false
#enable_mem_prealloc = true

# Enable huge pages for VM RAM, default false
# Enabling this will result in the VM memory
# being allocated using huge pages.
# This is useful when you want to use vhost-user network
# stacks within the container. This will automatically 
# result in memory pre allocation
#enable_hugepages = true

# Enable swap of vm memory. Default false.
# The behaviour is undefined if mem_prealloc is also set to true
#enable_swap = true

# This option changes the default hypervisor and kernel parameters
# to enable debug output where available. This extra output is added
# to the proxy logs, but only when proxy debug is also enabled.
# 
# Default false
#enable_debug = true

# Disable the customizations done in the runtime when it detects
# that it is running on top a VMM. This will result in the runtime
# behaving as it would when running on bare metal.
# 
#disable_nesting_checks = true

[proxy.kata]
path = "/usr/libexec/kata-containers/kata-proxy"

# If enabled, proxy messages will be sent to the system log
# (default: disabled)
#enable_debug = true

[shim.kata]
path = "/usr/libexec/kata-containers/kata-shim"

# If enabled, shim messages will be sent to the system log
# (default: disabled)
#enable_debug = true

[agent.kata]
# There is no field for this section. The goal is only to be able to
# specify which type of agent the user wants to use.

[runtime]
# If enabled, the runtime will log additional debug messages to the
# system log
# (default: disabled)
#enable_debug = true
#
# Internetworking model
# Determines how the VM should be connected to the
# the container network interface
# Options:
#
#   - bridged
#     Uses a linux bridge to interconnect the container interface to
#     the VM. Works for most cases except macvlan and ipvlan.
#
#   - macvtap
#     Used when the Container network interface can be bridged using
#     macvtap.
internetworking_model="macvtap"

Config file /usr/share/defaults/kata-containers/configuration.toml not found


Image details

---
osbuilder:
  url: "https://github.com/kata-containers/osbuilder"
  version: "0.0.1-2569cfa34c84e32600b93dde796f08e26006b1b8-dirty"
rootfs-creation-time: "2018-06-04T03:11:17.303372801-0700Z"
description: "osbuilder rootfs"
file-format-version: "0.0.2"
architecture: "x86_64"
base-distro:
  name: "OLTST"
  version: "7"
  packages:
    default:
      - "iptables"
      - "systemd"
    extra:

agent:
  url: "https://github.com/kata-containers/agent"
  name: "kata-agent"
  version: "1.0.0-e7d934d02f1201e57dc36b00934a011f22127e66"
  agent-is-init-daemon: "no"

Initrd details

No initrd


Logfiles

Runtime logs

No recent runtime problems found in system journal.

Proxy logs

No recent proxy problems found in system journal.

Shim logs

No recent shim problems found in system journal.


Container manager details

Have docker

Docker

Output of "docker version":

Client:
 Version:      17.06.2-ol
 API version:  1.30
 Go version:   go1.8.3
 Git commit:   d02b7ab
 Built:        Fri Oct  6 00:02:23 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.06.2-ol
 API version:  1.30 (minimum version 1.12)
 Go version:   go1.8.3
 Git commit:   d02b7ab
 Built:        Fri Oct  6 00:03:48 2017
 OS/Arch:      linux/amd64
 Experimental: false

Output of "docker info":

Containers: 2
 Running: 2
 Paused: 0
 Stopped: 0
Images: 141
Server Version: 17.06.2-ol
Storage Driver: overlay2
 Backing Filesystem: xfs
 Supports d_type: true
 Native Overlay Diff: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 6e23458c129b551d5c9871e5174f6b1b7f6d1170
runc version: 810190ceaa507aa2727d7ae6f4790c76ec150bd2
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
 selinux
Kernel Version: 4.1.12-124.14.5.el7uek.x86_64
Operating System: Oracle Linux Server 7.5
OSType: linux
Architecture: x86_64
CPUs: 104
Total Memory: 754.3GiB
Name: cix7
ID: MPGZ:ES5C:AIEA:JLQS:W42V:RZCB:QP4O:ZHJO:A262:CWU2:25W4:VI5J
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

Output of "systemctl show docker":

Failed to get D-Bus connection: Operation not permitted

No kubectl


Packages

No dpkg

Have rpm

Output of "rpm -qa|egrep "(cc-oci-runtimecc-runtimerunv|kata-proxy|kata-runtime|kata-shim|kata-containers-image|linux-container|qemu-)"":


sboeuf commented 6 years ago

@greg-jablonski thanks for raising this issue. It looks like a legitimate bug, and I am looking forward to your PR fixing it :)