kata-containers / runtime

Kata Containers version 1.x runtime (for version 2.x see https://github.com/kata-containers/kata-containers).
https://katacontainers.io/
Apache License 2.0

vm templating doesn't work #1135

Closed mehrdadrad closed 5 years ago

mehrdadrad commented 5 years ago

Description of problem

I use Kata with Docker on Ubuntu 18.04.1 LTS. After I enabled VM templating by setting enable_template = true, Docker could no longer run a container and reported the following error:

docker: Error response from daemon: OCI runtime create failed: Failed to check if grpc server is working: rpc error: code = Unavailable desc = transport is closing: unknown.

It works again once templating is disabled.

Expected result

The container starts and runs.

Actual result

docker run -d redis
5ceac679871425e22543ab4ac1e3310ca4ffcd9f12fe2874eb4185031413bcaa
docker: Error response from daemon: OCI runtime create failed: Failed to check if grpc server is working: rpc error: code = Unavailable desc = transport is closing: unknown.

Meta details

Running kata-collect-data.sh version 1.4.2 (commit a129b48) at 2019-01-16.20:16:37.563939341+0000.


Runtime is /usr/bin/kata-runtime.

kata-env

Output of "/usr/bin/kata-runtime kata-env":

[Meta]
  Version = "1.0.19"

[Runtime]
  Debug = false
  DisableNewNetNs = false
  Path = "/usr/bin/kata-runtime"
  [Runtime.Version]
    Semver = "1.4.2"
    Commit = "a129b48"
    OCI = "1.0.1-dev"
  [Runtime.Config]
    Path = "/usr/share/defaults/kata-containers/configuration.toml"

[Hypervisor]
  MachineType = "pc"
  Version = "QEMU emulator version 2.11.0\nCopyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers"
  Path = "/usr/bin/qemu-lite-system-x86_64"
  BlockDeviceDriver = "virtio-scsi"
  EntropySource = "/dev/urandom"
  Msize9p = 8192
  MemorySlots = 10
  Debug = false
  UseVSock = false

[Image]
  Path = "/usr/share/kata-containers/kata-containers-image_clearlinux_1.4.2_agent_b5efce24832.img"

[Kernel]
  Path = "/usr/share/kata-containers/vmlinuz-4.14.67.22-6.container"
  Parameters = ""

[Initrd]
  Path = ""

[Proxy]
  Type = "kataProxy"
  Version = "kata-proxy version 1.4.2-a80aac2"
  Path = "/usr/libexec/kata-containers/kata-proxy"
  Debug = false

[Shim]
  Type = "kataShim"
  Version = "kata-shim version 1.4.2-8830bc4"
  Path = "/usr/libexec/kata-containers/kata-shim"
  Debug = false

[Agent]
  Type = "kata"

[Host]
  Kernel = "4.15.0-1021-aws"
  Architecture = "amd64"
  VMContainerCapable = true
  SupportVSocks = false
  [Host.Distro]
    Name = "Ubuntu"
    Version = "18.04"
  [Host.CPU]
    Vendor = "GenuineIntel"
    Model = "Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz"

[Netmon]
  Version = "kata-netmon version 1.4.2"
  Path = "/usr/libexec/kata-containers/kata-netmon"
  Debug = false
  Enable = false

Runtime config files

Runtime default config files

/etc/kata-containers/configuration.toml
/usr/share/defaults/kata-containers/configuration.toml

Runtime config file contents

Config file /etc/kata-containers/configuration.toml not found

Output of "cat "/usr/share/defaults/kata-containers/configuration.toml"":

# Copyright (c) 2017-2018 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
#

# XXX: WARNING: this file is auto-generated.
# XXX:
# XXX: Source file: "cli/config/configuration.toml.in"
# XXX: Project:
# XXX:   Name: Kata Containers
# XXX:   Type: kata

[hypervisor.qemu]
path = "/usr/bin/qemu-lite-system-x86_64"
kernel = "/usr/share/kata-containers/vmlinuz.container"
image = "/usr/share/kata-containers/kata-containers.img"
machine_type = "pc"

# Optional space-separated list of options to pass to the guest kernel.
# For example, use `kernel_params = "vsyscall=emulate"` if you are having
# trouble running pre-2.15 glibc.
#
# WARNING: - any parameter specified here will take priority over the default
# parameter value of the same name used to start the virtual machine.
# Do not set values here unless you understand the impact of doing so as you
# may stop the virtual machine from booting.
# To see the list of default parameters, enable hypervisor debug, create a
# container and look for 'default-kernel-parameters' log entries.
kernel_params = ""

# Path to the firmware.
# If you want that qemu uses the default firmware leave this option empty
firmware = ""

# Machine accelerators
# comma-separated list of machine accelerators to pass to the hypervisor.
# For example, `machine_accelerators = "nosmm,nosmbus,nosata,nopit,static-prt,nofw"`
machine_accelerators=""

# Default number of vCPUs per SB/VM:
# unspecified or 0                --> will be set to 1
# < 0                             --> will be set to the actual number of physical cores
# > 0 <= number of physical cores --> will be set to the specified number
# > number of physical cores      --> will be set to the actual number of physical cores
default_vcpus = 1

# Default maximum number of vCPUs per SB/VM:
# unspecified or == 0             --> will be set to the actual number of physical cores or to the maximum number
#                                     of vCPUs supported by KVM if that number is exceeded
# > 0 <= number of physical cores --> will be set to the specified number
# > number of physical cores      --> will be set to the actual number of physical cores or to the maximum number
#                                     of vCPUs supported by KVM if that number is exceeded
# WARNING: Depending of the architecture, the maximum number of vCPUs supported by KVM is used when
# the actual number of physical cores is greater than it.
# WARNING: Be aware that this value impacts the virtual machine's memory footprint and CPU
# the hotplug functionality. For example, `default_maxvcpus = 240` specifies that until 240 vCPUs
# can be added to a SB/VM, but the memory footprint will be big. Another example, with
# `default_maxvcpus = 8` the memory footprint will be small, but 8 will be the maximum number of
# vCPUs supported by the SB/VM. In general, we recommend that you do not edit this variable,
# unless you know what are you doing.
default_maxvcpus = 0

# Bridges can be used to hot plug devices.
# Limitations:
# * Currently only pci bridges are supported
# * Until 30 devices per bridge can be hot plugged.
# * Until 5 PCI bridges can be cold plugged per VM.
#   This limitation could be a bug in qemu or in the kernel
# Default number of bridges per SB/VM:
# unspecified or 0   --> will be set to 1
# > 1 <= 5           --> will be set to the specified number
# > 5                --> will be set to 5
default_bridges = 1

# Default memory size in MiB for SB/VM.
# If unspecified then it will be set 2048 MiB.
default_memory = 2048
#
# Default memory slots per SB/VM.
# If unspecified then it will be set 10.
# This is will determine the times that memory will be hotadded to sandbox/VM.
#memory_slots = 10

# Disable block device from being used for a container's rootfs.
# In case of a storage driver like devicemapper where a container's 
# root file system is backed by a block device, the block device is passed
# directly to the hypervisor for performance reasons. 
# This flag prevents the block device from being passed to the hypervisor, 
# 9pfs is used instead to pass the rootfs.
disable_block_device_use = false

# Block storage driver to be used for the hypervisor in case the container
# rootfs is backed by a block device. This is either virtio-scsi or 
# virtio-blk.
block_device_driver = "virtio-scsi"

# Enable iothreads (data-plane) to be used. This causes IO to be
# handled in a separate IO thread. This is currently only implemented
# for SCSI.
#
enable_iothreads = false

# Enable pre allocation of VM RAM, default false
# Enabling this will result in lower container density
# as all of the memory will be allocated and locked
# This is useful when you want to reserve all the memory
# upfront or in the cases where you want memory latencies
# to be very predictable
# Default false
#enable_mem_prealloc = true

# Enable huge pages for VM RAM, default false
# Enabling this will result in the VM memory
# being allocated using huge pages.
# This is useful when you want to use vhost-user network
# stacks within the container. This will automatically 
# result in memory pre allocation
#enable_hugepages = true

# Enable swap of vm memory. Default false.
# The behaviour is undefined if mem_prealloc is also set to true
#enable_swap = true

# This option changes the default hypervisor and kernel parameters
# to enable debug output where available. This extra output is added
# to the proxy logs, but only when proxy debug is also enabled.
# 
# Default false
#enable_debug = true

# Disable the customizations done in the runtime when it detects
# that it is running on top a VMM. This will result in the runtime
# behaving as it would when running on bare metal.
# 
#disable_nesting_checks = true

# This is the msize used for 9p shares. It is the number of bytes 
# used for 9p packet payload.
#msize_9p = 8192

# If true and vsocks are supported, use vsocks to communicate directly
# with the agent and no proxy is started, otherwise use unix
# sockets and start a proxy to communicate with the agent.
# Default false
#use_vsock = true

# VFIO devices are hotplugged on a bridge by default. 
# Enable hotplugging on root bus. This may be required for devices with
# a large PCI bar, as this is a current limitation with hotplugging on 
# a bridge. This value is valid for "pc" machine type.
# Default false
#hotplug_vfio_on_root_bus = true

# If host doesn't support vhost_net, set to true. Thus we won't create vhost fds for nics.
# Default false
#disable_vhost_net = true
#
# Default entropy source.
# The path to a host source of entropy (including a real hardware RNG)
# /dev/urandom and /dev/random are two main options.
# Be aware that /dev/random is a blocking source of entropy.  If the host
# runs out of entropy, the VMs boot time will increase leading to get startup
# timeouts.
# The source of entropy /dev/urandom is non-blocking and provides a
# generally acceptable source of entropy. It should work well for pretty much
# all practical purposes.
#entropy_source= "/dev/urandom"

# Path to OCI hook binaries in the *guest rootfs*.
# This does not affect host-side hooks which must instead be added to
# the OCI spec passed to the runtime.
#
# You can create a rootfs with hooks by customizing the osbuilder scripts:
# https://github.com/kata-containers/osbuilder
#
# Hooks must be stored in a subdirectory of guest_hook_path according to their
# hook type, i.e. "guest_hook_path/{prestart,postart,poststop}".
# The agent will scan these directories for executable files and add them, in
# lexicographical order, to the lifecycle of the guest container.
# Hooks are executed in the runtime namespace of the guest. See the official documentation:
# https://github.com/opencontainers/runtime-spec/blob/v1.0.1/config.md#posix-platform-hooks
# Warnings will be logged if any error is encountered will scanning for hooks,
# but it will not abort container execution.
#guest_hook_path = "/usr/share/oci/hooks"

[factory]
# VM templating support. Once enabled, new VMs are created from template
# using vm cloning. They will share the same initial kernel, initramfs and
# agent memory by mapping it readonly. It helps speeding up new container
# creation and saves a lot of memory if there are many kata containers running
# on the same host.
#
# When disabled, new VMs are created from scratch.
#
# Default false
enable_template = true

[proxy.kata]
path = "/usr/libexec/kata-containers/kata-proxy"

# If enabled, proxy messages will be sent to the system log
# (default: disabled)
#enable_debug = true

[shim.kata]
path = "/usr/libexec/kata-containers/kata-shim"

# If enabled, shim messages will be sent to the system log
# (default: disabled)
#enable_debug = true

[agent.kata]
# There is no field for this section. The goal is only to be able to
# specify which type of agent the user wants to use.

[netmon]
# If enabled, the network monitoring process gets started when the
# sandbox is created. This allows for the detection of some additional
# network being added to the existing network namespace, after the
# sandbox has been created.
# (default: disabled)
#enable_netmon = true

# Specify the path to the netmon binary.
path = "/usr/libexec/kata-containers/kata-netmon"

# If enabled, netmon messages will be sent to the system log
# (default: disabled)
#enable_debug = true

[runtime]
# If enabled, the runtime will log additional debug messages to the
# system log
# (default: disabled)
#enable_debug = true
#
# Internetworking model
# Determines how the VM should be connected to the
# the container network interface
# Options:
#
#   - bridged
#     Uses a linux bridge to interconnect the container interface to
#     the VM. Works for most cases except macvlan and ipvlan.
#
#   - macvtap
#     Used when the Container network interface can be bridged using
#     macvtap.
#
#   - none
#     Used when customize network. Only creates a tap device. No veth pair.
#
#   - tcfilter
#     Uses tc filter rules to redirect traffic from the network interface
#     provided by plugin to a tap interface connected to the VM.
#
internetworking_model="macvtap"

# If enabled, the runtime will create opentracing.io traces and spans.
# (See https://www.jaegertracing.io/docs/getting-started).
# (default: disabled)
#enable_tracing = true

# If enabled, the runtime will not create a network namespace for shim and hypervisor processes.
# This option may have some potential impacts to your host. It should only be used when you know what you're doing.
# `disable_new_netns` conflicts with `enable_netmon`
# `disable_new_netns` conflicts with `internetworking_model=bridged` and `internetworking_model=macvtap`. It works only
# with `internetworking_model=none`. The tap device will be in the host network namespace and can connect to a bridge
# (like OVS) directly.
# If you are using docker, `disable_new_netns` only works with `docker run --net=none`
# (default: false)
#disable_new_netns = true
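
The configuration above enables enable_template under [factory] while [hypervisor.qemu] points at a rootfs image and sets no initrd. A minimal Python sketch of a sanity check for that combination follows. It rests on an assumption that should be verified for runtime 1.4.2: the Kata VM templating how-to lists an initrd-based guest (initrd set, image removed) as a prerequisite. The functions read_flat_toml and templating_warnings are hypothetical helpers, not part of Kata, and the reader handles only the flat key = value layout of this file, not general TOML.

```python
import re


def read_flat_toml(text):
    """Tiny reader for the flat `key = value` layout used by this file.

    Not a general TOML parser: it only handles [section] headers, full-line
    comments, and scalar values (quoted strings, booleans, integers).
    """
    config, section = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        header = re.fullmatch(r"\[([^\]]+)\]", line)
        if header:
            section = header.group(1)
            config.setdefault(section, {})
            continue
        key, sep, value = line.partition("=")
        if not sep or section is None:
            continue
        value = value.strip()
        if value in ("true", "false"):
            parsed = value == "true"
        elif re.fullmatch(r"-?\d+", value):
            parsed = int(value)
        else:
            parsed = value.strip('"')
        config[section][key.strip()] = parsed
    return config


def templating_warnings(config):
    """Return warnings for a templating setup.

    ASSUMPTION: templating expects an initrd guest and no rootfs image.
    """
    factory = config.get("factory", {})
    qemu = config.get("hypervisor.qemu", {})
    if not factory.get("enable_template", False):
        return []
    warnings = []
    if not qemu.get("initrd"):
        warnings.append("enable_template is true but no initrd is configured")
    if qemu.get("image"):
        warnings.append("enable_template is true but a rootfs image is configured")
    return warnings


# Abridged from the configuration shown in this report.
SAMPLE = """
[hypervisor.qemu]
path = "/usr/bin/qemu-lite-system-x86_64"
kernel = "/usr/share/kata-containers/vmlinuz.container"
image = "/usr/share/kata-containers/kata-containers.img"

[factory]
enable_template = true
"""

if __name__ == "__main__":
    for warning in templating_warnings(read_flat_toml(SAMPLE)):
        print("WARNING:", warning)
```

If the assumption holds, both warnings apply to this report, which would be consistent with the "Initrd details: No initrd" section of the collected data.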

KSM throttler

version

Output of "/usr/libexec/kata-ksm-throttler/kata-ksm-throttler --version":

kata-ksm-throttler version 1.5.0-rc2-d70af5c

systemd service

Image details

---
osbuilder:
  url: "https://github.com/kata-containers/osbuilder"
  version: "unknown"
rootfs-creation-time: "2019-01-10T08:34:25.003041843+0000Z"
description: "osbuilder rootfs"
file-format-version: "0.0.2"
architecture: "x86_64"
base-distro:
  name: "Clear"
  version: "27150"
  packages:
    default:
      - "iptables-bin"
      - "libudev0-shim"
      - "systemd"
    extra:

agent:
  url: "https://github.com/kata-containers/agent"
  name: "kata-agent"
  version: "1.4.2-b5efce24832ef9c36f41bc64856b76c004525005"
  agent-is-init-daemon: "no"

Initrd details

No initrd


Logfiles

Runtime logs

Recent runtime problems found in system journal:

time="2019-01-16T19:04:33.721235502Z" level=warning msg="fetch sandbox device failed" arch=amd64 command=create container=5cf83d880b699660fbbc4e2c65bda5cecbf98f65b198ba7aff003a943f356aff error="open /run/vc/sbs/5cf83d880b699660fbbc4e2c65bda5cecbf98f65b198ba7aff003a943f356aff/devices.json: no such file or directory" name=kata-runtime pid=8019 sandbox=5cf83d880b699660fbbc4e2c65bda5cecbf98f65b198ba7aff003a943f356aff sandboxid=5cf83d880b699660fbbc4e2c65bda5cecbf98f65b198ba7aff003a943f356aff source=virtcontainers subsystem=sandbox
time="2019-01-16T19:04:48.319567418Z" level=warning msg="fetch sandbox device failed" arch=amd64 command=create container=92e9c2192e359f4920b737f1c858b60d7a01158cab538d2c2b01c92917c8827a error="open /run/vc/sbs/92e9c2192e359f4920b737f1c858b60d7a01158cab538d2c2b01c92917c8827a/devices.json: no such file or directory" name=kata-runtime pid=8212 sandbox=92e9c2192e359f4920b737f1c858b60d7a01158cab538d2c2b01c92917c8827a sandboxid=92e9c2192e359f4920b737f1c858b60d7a01158cab538d2c2b01c92917c8827a source=virtcontainers subsystem=sandbox
time="2019-01-16T19:23:29.70512081Z" level=warning msg="load vm factory failed, about to create new one" arch=amd64 command=create container=7ad435159b3a406e7484226a0650e1889efc91eb8191f0dbd3b9d9e37868dc6f error="stat /run/vc/vm/template/memory: no such file or directory" name=kata-runtime pid=8490 source=runtime
time="2019-01-16T19:23:29.861347204Z" level=error msg="trace called before context set" arch=amd64 command=create container=7ad435159b3a406e7484226a0650e1889efc91eb8191f0dbd3b9d9e37868dc6f name=kata-runtime pid=8490 source=virtcontainers subsystem=kata_agent type=bug
time="2019-01-16T19:23:49.862445999Z" level=info msg="clean up proxy" arch=amd64 command=create container=7ad435159b3a406e7484226a0650e1889efc91eb8191f0dbd3b9d9e37868dc6f error="Failed to check if grpc server is working: rpc error: code = Unavailable desc = transport is closing" name=kata-runtime pid=8490 source=virtcontainers vm=983c3447-7709-4dde-a97e-453fc0799b40
time="2019-01-16T19:23:49.862744415Z" level=error msg="Failed to read agent logs" arch=amd64 command=create console-protocol=unix console-socket=/run/vc/vm/983c3447-7709-4dde-a97e-453fc0799b40/console.sock container=7ad435159b3a406e7484226a0650e1889efc91eb8191f0dbd3b9d9e37868dc6f error="read unix @->/run/vc/vm/983c3447-7709-4dde-a97e-453fc0799b40/console.sock: use of closed network connection" name=kata-runtime pid=8490 source=virtcontainers vm=983c3447-7709-4dde-a97e-453fc0799b40
time="2019-01-16T19:23:49.862969325Z" level=info msg="clean up vm" arch=amd64 command=create container=7ad435159b3a406e7484226a0650e1889efc91eb8191f0dbd3b9d9e37868dc6f error="Failed to check if grpc server is working: rpc error: code = Unavailable desc = transport is closing" name=kata-runtime pid=8490 source=virtcontainers vm=983c3447-7709-4dde-a97e-453fc0799b40
time="2019-01-16T19:23:49.866550617Z" level=error msg="failed to create new vm" arch=amd64 command=create container=7ad435159b3a406e7484226a0650e1889efc91eb8191f0dbd3b9d9e37868dc6f error="Failed to check if grpc server is working: rpc error: code = Unavailable desc = transport is closing" name=kata-runtime pid=8490 source=virtcontainers vm=983c3447-7709-4dde-a97e-453fc0799b40
time="2019-01-16T19:23:50.087362496Z" level=warning msg="fetch sandbox device failed" arch=amd64 command=create container=7ad435159b3a406e7484226a0650e1889efc91eb8191f0dbd3b9d9e37868dc6f error="open /run/vc/sbs/7ad435159b3a406e7484226a0650e1889efc91eb8191f0dbd3b9d9e37868dc6f/devices.json: no such file or directory" name=kata-runtime pid=8490 sandbox=7ad435159b3a406e7484226a0650e1889efc91eb8191f0dbd3b9d9e37868dc6f sandboxid=7ad435159b3a406e7484226a0650e1889efc91eb8191f0dbd3b9d9e37868dc6f source=virtcontainers subsystem=sandbox
time="2019-01-16T19:23:50.088091304Z" level=info msg="fallback to direct factory vm" arch=amd64 command=create container=7ad435159b3a406e7484226a0650e1889efc91eb8191f0dbd3b9d9e37868dc6f error="hypervisor config does not match, base: {HypervisorType:qemu HypervisorConfig:{NumVCPUs:0 DefaultMaxVCPUs:72 MemorySize:0 DefaultBridges:1 Msize9p:8192 MemSlots:10 KernelParams:[{Key:init Value:/usr/lib/systemd/systemd} {Key:systemd.unit Value:kata-containers.target} {Key:systemd.mask Value:systemd-networkd.service} {Key:systemd.mask Value:systemd-networkd.socket}] HypervisorParams:[] KernelPath:/usr/share/kata-containers/vmlinuz-4.14.67.22-6.container ImagePath:/usr/share/kata-containers/kata-containers-image_clearlinux_1.4.2_agent_b5efce24832.img InitrdPath: FirmwarePath: MachineAccelerators: HypervisorPath:/usr/bin/qemu-lite-system-x86_64 BlockDeviceDriver:virtio-scsi HypervisorMachineType:pc MemoryPath: DevicesStatePath: EntropySource:/dev/urandom customAssets:map[] DisableBlockDeviceUse:false EnableIOThreads:false Debug:false MemPrealloc:false HugePages:false Realtime:false Mlock:true DisableNestingChecks:false UseVSock:false HotplugVFIOOnRootBus:false BootToBeTemplate:false BootFromTemplate:false DisableVhostNet:false GuestHookPath:} AgentType:kata AgentConfig:{LongLiveConn:false UseVSock:false} ProxyType:noopProxy ProxyConfig:{Path: Debug:false}}. 
new: {HypervisorType:qemu HypervisorConfig:{NumVCPUs:0 DefaultMaxVCPUs:72 MemorySize:0 DefaultBridges:1 Msize9p:8192 MemSlots:10 KernelParams:[] HypervisorParams:[] KernelPath:/usr/share/kata-containers/vmlinuz-4.14.67.22-6.container ImagePath:/usr/share/kata-containers/kata-containers-image_clearlinux_1.4.2_agent_b5efce24832.img InitrdPath: FirmwarePath: MachineAccelerators: HypervisorPath:/usr/bin/qemu-lite-system-x86_64 BlockDeviceDriver:virtio-scsi HypervisorMachineType:pc MemoryPath: DevicesStatePath: EntropySource:/dev/urandom customAssets:map[] DisableBlockDeviceUse:false EnableIOThreads:false Debug:false MemPrealloc:false HugePages:false Realtime:false Mlock:true DisableNestingChecks:false UseVSock:false HotplugVFIOOnRootBus:false BootToBeTemplate:false BootFromTemplate:false DisableVhostNet:false GuestHookPath:} AgentType:kata AgentConfig:{LongLiveConn:false UseVSock:false} ProxyType:noopProxy ProxyConfig:{Path: Debug:false}}" name=kata-runtime pid=8490 source=virtcontainers subsystem=factory
time="2019-01-16T19:23:50.228966846Z" level=error msg="trace called before context set" arch=amd64 command=create container=7ad435159b3a406e7484226a0650e1889efc91eb8191f0dbd3b9d9e37868dc6f name=kata-runtime pid=8490 source=virtcontainers subsystem=kata_agent type=bug
time="2019-01-16T19:24:11.112499563Z" level=error msg="Error detach virtual ep" arch=amd64 command=create container=7ad435159b3a406e7484226a0650e1889efc91eb8191f0dbd3b9d9e37868dc6f error="exitting QMP loop, command cancelled" name=kata-runtime pid=8490 source=virtcontainers subsystem=network
time="2019-01-16T19:24:11.112623517Z" level=error msg="Failed to check if grpc server is working: rpc error: code = Unavailable desc = transport is closing" arch=amd64 command=create container=7ad435159b3a406e7484226a0650e1889efc91eb8191f0dbd3b9d9e37868dc6f name=kata-runtime pid=8490 source=runtime
time="2019-01-16T19:24:39.26965874Z" level=warning msg="load vm factory failed, about to create new one" arch=amd64 command=create container=5ceac679871425e22543ab4ac1e3310ca4ffcd9f12fe2874eb4185031413bcaa error="stat /run/vc/vm/template/memory: no such file or directory" name=kata-runtime pid=8652 source=runtime
time="2019-01-16T19:24:39.433902127Z" level=error msg="trace called before context set" arch=amd64 command=create container=5ceac679871425e22543ab4ac1e3310ca4ffcd9f12fe2874eb4185031413bcaa name=kata-runtime pid=8652 source=virtcontainers subsystem=kata_agent type=bug
time="2019-01-16T19:24:59.434772815Z" level=info msg="clean up proxy" arch=amd64 command=create container=5ceac679871425e22543ab4ac1e3310ca4ffcd9f12fe2874eb4185031413bcaa error="Failed to check if grpc server is working: rpc error: code = Unavailable desc = transport is closing" name=kata-runtime pid=8652 source=virtcontainers vm=a45abb2e-3937-49c1-987d-00fe5687be17
time="2019-01-16T19:24:59.43509943Z" level=error msg="Failed to read agent logs" arch=amd64 command=create console-protocol=unix console-socket=/run/vc/vm/a45abb2e-3937-49c1-987d-00fe5687be17/console.sock container=5ceac679871425e22543ab4ac1e3310ca4ffcd9f12fe2874eb4185031413bcaa error="read unix @->/run/vc/vm/a45abb2e-3937-49c1-987d-00fe5687be17/console.sock: use of closed network connection" name=kata-runtime pid=8652 source=virtcontainers vm=a45abb2e-3937-49c1-987d-00fe5687be17
time="2019-01-16T19:24:59.435198141Z" level=info msg="clean up vm" arch=amd64 command=create container=5ceac679871425e22543ab4ac1e3310ca4ffcd9f12fe2874eb4185031413bcaa error="Failed to check if grpc server is working: rpc error: code = Unavailable desc = transport is closing" name=kata-runtime pid=8652 source=virtcontainers vm=a45abb2e-3937-49c1-987d-00fe5687be17
time="2019-01-16T19:24:59.438533206Z" level=error msg="failed to create new vm" arch=amd64 command=create container=5ceac679871425e22543ab4ac1e3310ca4ffcd9f12fe2874eb4185031413bcaa error="Failed to check if grpc server is working: rpc error: code = Unavailable desc = transport is closing" name=kata-runtime pid=8652 source=virtcontainers vm=a45abb2e-3937-49c1-987d-00fe5687be17
time="2019-01-16T19:24:59.651408882Z" level=warning msg="fetch sandbox device failed" arch=amd64 command=create container=5ceac679871425e22543ab4ac1e3310ca4ffcd9f12fe2874eb4185031413bcaa error="open /run/vc/sbs/5ceac679871425e22543ab4ac1e3310ca4ffcd9f12fe2874eb4185031413bcaa/devices.json: no such file or directory" name=kata-runtime pid=8652 sandbox=5ceac679871425e22543ab4ac1e3310ca4ffcd9f12fe2874eb4185031413bcaa sandboxid=5ceac679871425e22543ab4ac1e3310ca4ffcd9f12fe2874eb4185031413bcaa source=virtcontainers subsystem=sandbox
time="2019-01-16T19:24:59.652106324Z" level=info msg="fallback to direct factory vm" arch=amd64 command=create container=5ceac679871425e22543ab4ac1e3310ca4ffcd9f12fe2874eb4185031413bcaa error="hypervisor config does not match, base: {HypervisorType:qemu HypervisorConfig:{NumVCPUs:0 DefaultMaxVCPUs:72 MemorySize:0 DefaultBridges:1 Msize9p:8192 MemSlots:10 KernelParams:[{Key:init Value:/usr/lib/systemd/systemd} {Key:systemd.unit Value:kata-containers.target} {Key:systemd.mask Value:systemd-networkd.service} {Key:systemd.mask Value:systemd-networkd.socket}] HypervisorParams:[] KernelPath:/usr/share/kata-containers/vmlinuz-4.14.67.22-6.container ImagePath:/usr/share/kata-containers/kata-containers-image_clearlinux_1.4.2_agent_b5efce24832.img InitrdPath: FirmwarePath: MachineAccelerators: HypervisorPath:/usr/bin/qemu-lite-system-x86_64 BlockDeviceDriver:virtio-scsi HypervisorMachineType:pc MemoryPath: DevicesStatePath: EntropySource:/dev/urandom customAssets:map[] DisableBlockDeviceUse:false EnableIOThreads:false Debug:false MemPrealloc:false HugePages:false Realtime:false Mlock:true DisableNestingChecks:false UseVSock:false HotplugVFIOOnRootBus:false BootToBeTemplate:false BootFromTemplate:false DisableVhostNet:false GuestHookPath:} AgentType:kata AgentConfig:{LongLiveConn:false UseVSock:false} ProxyType:noopProxy ProxyConfig:{Path: Debug:false}}. 
new: {HypervisorType:qemu HypervisorConfig:{NumVCPUs:0 DefaultMaxVCPUs:72 MemorySize:0 DefaultBridges:1 Msize9p:8192 MemSlots:10 KernelParams:[] HypervisorParams:[] KernelPath:/usr/share/kata-containers/vmlinuz-4.14.67.22-6.container ImagePath:/usr/share/kata-containers/kata-containers-image_clearlinux_1.4.2_agent_b5efce24832.img InitrdPath: FirmwarePath: MachineAccelerators: HypervisorPath:/usr/bin/qemu-lite-system-x86_64 BlockDeviceDriver:virtio-scsi HypervisorMachineType:pc MemoryPath: DevicesStatePath: EntropySource:/dev/urandom customAssets:map[] DisableBlockDeviceUse:false EnableIOThreads:false Debug:false MemPrealloc:false HugePages:false Realtime:false Mlock:true DisableNestingChecks:false UseVSock:false HotplugVFIOOnRootBus:false BootToBeTemplate:false BootFromTemplate:false DisableVhostNet:false GuestHookPath:} AgentType:kata AgentConfig:{LongLiveConn:false UseVSock:false} ProxyType:noopProxy ProxyConfig:{Path: Debug:false}}" name=kata-runtime pid=8652 source=virtcontainers subsystem=factory
time="2019-01-16T19:24:59.793594907Z" level=error msg="trace called before context set" arch=amd64 command=create container=5ceac679871425e22543ab4ac1e3310ca4ffcd9f12fe2874eb4185031413bcaa name=kata-runtime pid=8652 source=virtcontainers subsystem=kata_agent type=bug
time="2019-01-16T19:25:20.680464622Z" level=error msg="Error detach virtual ep" arch=amd64 command=create container=5ceac679871425e22543ab4ac1e3310ca4ffcd9f12fe2874eb4185031413bcaa error="exitting QMP loop, command cancelled" name=kata-runtime pid=8652 source=virtcontainers subsystem=network
time="2019-01-16T19:25:20.681170771Z" level=error msg="Failed to check if grpc server is working: rpc error: code = Unavailable desc = transport is closing" arch=amd64 command=create container=5ceac679871425e22543ab4ac1e3310ca4ffcd9f12fe2874eb4185031413bcaa name=kata-runtime pid=8652 source=runtime
time="2019-01-16T19:29:23.759401259Z" level=warning msg="fetch sandbox device failed" arch=amd64 command=create container=6fa610e33b612eaed0b53b38c4006958f2c7ad2b69eb84739e9a8e639e15d0b2 error="open /run/vc/sbs/6fa610e33b612eaed0b53b38c4006958f2c7ad2b69eb84739e9a8e639e15d0b2/devices.json: no such file or directory" name=kata-runtime pid=8789 sandbox=6fa610e33b612eaed0b53b38c4006958f2c7ad2b69eb84739e9a8e639e15d0b2 sandboxid=6fa610e33b612eaed0b53b38c4006958f2c7ad2b69eb84739e9a8e639e15d0b2 source=virtcontainers subsystem=sandbox
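
The two "fallback to direct factory vm" entries above print a base and a new configuration that are hard to compare by eye. As far as can be read from the dumps, they differ only in KernelParams: the base carries the systemd init parameters and the new one is empty. The Python sketch below shows one way to locate such a difference by diffing the whitespace-separated tokens of the two dumps with the standard difflib module. The fragments are abridged copies of the log text, and config_diff is a hypothetical helper, not something the runtime provides.

```python
import difflib


def config_diff(base, new):
    """Return (base_fragment, new_fragment) pairs where the two dumps differ."""
    base_tokens, new_tokens = base.split(), new.split()
    matcher = difflib.SequenceMatcher(None, base_tokens, new_tokens, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append(
                (" ".join(base_tokens[i1:i2]), " ".join(new_tokens[j1:j2]))
            )
    return changes


# Fragments copied from the "hypervisor config does not match" entries,
# shortened to the fields surrounding the difference.
BASE = (
    "MemSlots:10 KernelParams:[{Key:init Value:/usr/lib/systemd/systemd} "
    "{Key:systemd.unit Value:kata-containers.target} "
    "{Key:systemd.mask Value:systemd-networkd.service} "
    "{Key:systemd.mask Value:systemd-networkd.socket}] HypervisorParams:[]"
)
NEW = "MemSlots:10 KernelParams:[] HypervisorParams:[]"

if __name__ == "__main__":
    for old, cur in config_diff(BASE, NEW):
        print("base:", old)
        print("new: ", cur)
```

Token-level diffing is deliberately crude: it does not understand the nested Go struct syntax, but it is enough to point at the field where the two dumps diverge.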

Proxy logs

Recent proxy problems found in system journal:

time="2019-01-16T19:24:11.079585945Z" level=fatal msg="failed to handle exit signal" error="close unix @->/run/vc/vm/5e4cf715-104f-4866-ace5-39d7d46e9c29/kata.sock: use of closed network connection" name=kata-proxy pid=8562 sandbox=5e4cf715-104f-4866-ace5-39d7d46e9c29 source=proxy
time="2019-01-16T19:25:20.646019074Z" level=fatal msg="failed to handle exit signal" error="close unix @->/run/vc/vm/f7f6ce5d-13c9-402f-8341-a19510b4e935/kata.sock: use of closed network connection" name=kata-proxy pid=8722 sandbox=f7f6ce5d-13c9-402f-8341-a19510b4e935 source=proxy

Shim logs

No recent shim problems found in system journal.

Throttler logs

No recent throttler problems found in system journal.
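
The journal entries in this Logfiles section use key=value pairs with optionally quoted values, so they can be summarized mechanically. The following Python sketch counts error and fatal messages per component. The sample lines are abridged from the entries above, and parse_logfmt and summarize are hypothetical helpers. Entries whose quoted value spans two lines, such as the "hypervisor config does not match" errors, are only partially parsed by this line-by-line approach.

```python
import re
from collections import Counter

# key=value pairs where the value is either a double-quoted string
# (backslash escapes allowed) or a run of non-space characters.
PAIR = re.compile(r'([\w.-]+)=("(?:[^"\\]|\\.)*"|\S*)')


def parse_logfmt(line):
    """Parse one log line into a dict of fields, stripping outer quotes."""
    fields = {}
    for key, value in PAIR.findall(line):
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        fields[key] = value
    return fields


def summarize(lines, levels=("error", "fatal")):
    """Count (component name, message) pairs for the given log levels."""
    counts = Counter()
    for line in lines:
        fields = parse_logfmt(line)
        if fields.get("level") in levels:
            counts[(fields.get("name", "?"), fields.get("msg", "?"))] += 1
    return counts


# Abridged copies of entries from the runtime and proxy logs above.
SAMPLE = [
    'time="2019-01-16T19:23:29.70512081Z" level=warning msg="load vm factory failed, about to create new one" name=kata-runtime pid=8490 source=runtime',
    'time="2019-01-16T19:23:49.866550617Z" level=error msg="failed to create new vm" command=create name=kata-runtime pid=8490 source=virtcontainers',
    'time="2019-01-16T19:24:11.079585945Z" level=fatal msg="failed to handle exit signal" name=kata-proxy pid=8562 source=proxy',
    'time="2019-01-16T19:24:59.438533206Z" level=error msg="failed to create new vm" command=create name=kata-runtime pid=8652 source=virtcontainers',
]

if __name__ == "__main__":
    for (name, msg), count in summarize(SAMPLE).most_common():
        print(f"{count}x {name}: {msg}")
```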


Container manager details

Have docker

Docker

Output of "docker version":

Client:
 Version:           18.09.1
 API version:       1.39
 Go version:        go1.10.6
 Git commit:        4c52b90
 Built:             Wed Jan  9 19:35:31 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          18.09.1
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.6
  Git commit:       4c52b90
  Built:            Wed Jan  9 19:02:44 2019
  OS/Arch:          linux/amd64
  Experimental:     false

Output of "docker info":

Containers: 5
 Running: 3
 Paused: 0
 Stopped: 2
Images: 1
Server Version: 18.09.1
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: kata-runtime runc
Default Runtime: kata-runtime
Init Binary: docker-init
containerd version: 9754871865f7fe2f4e74d43e2fc7ccd237edcbce
runc version: a129b48
init version: fec3683
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.15.0-1021-aws
Operating System: Ubuntu 18.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 72
Total Memory: 503.8GiB
Name: ip-10-0-0-97
ID: 5KH4:UAFW:W44R:OAI7:IU6P:2SHW:X2MS:4XXA:XNWR:3MZZ:OVEU:UFER
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine

WARNING: No swap limit support

Output of "systemctl show docker":

Type=notify
Restart=always
NotifyAccess=main
RestartUSec=2s
TimeoutStartUSec=infinity
TimeoutStopUSec=infinity
RuntimeMaxUSec=infinity
WatchdogUSec=0
WatchdogTimestamp=Wed 2019-01-16 18:58:43 UTC
WatchdogTimestampMonotonic=617366324
PermissionsStartOnly=no
RootDirectoryStartOnly=no
RemainAfterExit=no
GuessMainPID=yes
MainPID=7340
ControlPID=0
FileDescriptorStoreMax=0
NFileDescriptorStore=0
StatusErrno=0
Result=success
UID=[not set]
GID=[not set]
NRestarts=0
ExecMainStartTimestamp=Wed 2019-01-16 18:58:43 UTC
ExecMainStartTimestampMonotonic=616840270
ExecMainExitTimestampMonotonic=0
ExecMainPID=7340
ExecMainCode=0
ExecMainStatus=0
ExecStart={ path=/usr/bin/dockerd ; argv[]=/usr/bin/dockerd -H fd:// ; ignore_errors=no ; start_time=[Wed 2019-01-16 18:58:43 UTC] ; stop_time=[n/a] ; pid=7340 ; code=(null) ; status=0/0 }
ExecReload={ path=/bin/kill ; argv[]=/bin/kill -s HUP $MAINPID ; ignore_errors=no ; start_time=[n/a] ; stop_time=[n/a] ; pid=0 ; code=(null) ; status=0/0 }
Slice=system.slice
ControlGroup=/system.slice/docker.service
MemoryCurrent=[not set]
CPUUsageNSec=[not set]
TasksCurrent=82
IPIngressBytes=18446744073709551615
IPIngressPackets=18446744073709551615
IPEgressBytes=18446744073709551615
IPEgressPackets=18446744073709551615
Delegate=yes
DelegateControllers=cpu cpuacct io blkio memory devices pids
CPUAccounting=no
CPUWeight=[not set]
StartupCPUWeight=[not set]
CPUShares=[not set]
StartupCPUShares=[not set]
CPUQuotaPerSecUSec=infinity
IOAccounting=no
IOWeight=[not set]
StartupIOWeight=[not set]
BlockIOAccounting=no
BlockIOWeight=[not set]
StartupBlockIOWeight=[not set]
MemoryAccounting=no
MemoryLow=0
MemoryHigh=infinity
MemoryMax=infinity
MemorySwapMax=infinity
MemoryLimit=infinity
DevicePolicy=auto
TasksAccounting=yes
TasksMax=infinity
IPAccounting=no
UMask=0022
LimitCPU=infinity
LimitCPUSoft=infinity
LimitFSIZE=infinity
LimitFSIZESoft=infinity
LimitDATA=infinity
LimitDATASoft=infinity
LimitSTACK=infinity
LimitSTACKSoft=8388608
LimitCORE=infinity
LimitCORESoft=infinity
LimitRSS=infinity
LimitRSSSoft=infinity
LimitNOFILE=infinity
LimitNOFILESoft=infinity
LimitAS=infinity
LimitASSoft=infinity
LimitNPROC=infinity
LimitNPROCSoft=infinity
LimitMEMLOCK=16777216
LimitMEMLOCKSoft=16777216
LimitLOCKS=infinity
LimitLOCKSSoft=infinity
LimitSIGPENDING=2063376
LimitSIGPENDINGSoft=2063376
LimitMSGQUEUE=819200
LimitMSGQUEUESoft=819200
LimitNICE=0
LimitNICESoft=0
LimitRTPRIO=0
LimitRTPRIOSoft=0
LimitRTTIME=infinity
LimitRTTIMESoft=infinity
OOMScoreAdjust=0
Nice=0
IOSchedulingClass=0
IOSchedulingPriority=0
CPUSchedulingPolicy=0
CPUSchedulingPriority=0
TimerSlackNSec=50000
CPUSchedulingResetOnFork=no
NonBlocking=no
StandardInput=null
StandardInputData=
StandardOutput=journal
StandardError=inherit
TTYReset=no
TTYVHangup=no
TTYVTDisallocate=no
SyslogPriority=30
SyslogLevelPrefix=yes
SyslogLevel=6
SyslogFacility=3
LogLevelMax=-1
SecureBits=0
CapabilityBoundingSet=cap_chown cap_dac_override cap_dac_read_search cap_fowner cap_fsetid cap_kill cap_setgid cap_setuid cap_setpcap cap_linux_immutable cap_net_bind_service cap_net_broadcast cap_net_admin cap_net_raw cap_ipc_lock cap_ipc_owner cap_sys_module cap_sys_rawio cap_sys_chroot cap_sys_ptrace cap_sys_pacct cap_sys_admin cap_sys_boot cap_sys_nice cap_sys_resource cap_sys_time cap_sys_tty_config cap_mknod cap_lease cap_audit_write cap_audit_control cap_setfcap cap_mac_override cap_mac_admin cap_syslog cap_wake_alarm cap_block_suspend
AmbientCapabilities=
DynamicUser=no
RemoveIPC=no
MountFlags=
PrivateTmp=no
PrivateDevices=no
ProtectKernelTunables=no
ProtectKernelModules=no
ProtectControlGroups=no
PrivateNetwork=no
PrivateUsers=no
ProtectHome=no
ProtectSystem=no
SameProcessGroup=no
UtmpMode=init
IgnoreSIGPIPE=yes
NoNewPrivileges=no
SystemCallErrorNumber=0
LockPersonality=no
RuntimeDirectoryPreserve=no
RuntimeDirectoryMode=0755
StateDirectoryMode=0755
CacheDirectoryMode=0755
LogsDirectoryMode=0755
ConfigurationDirectoryMode=0755
MemoryDenyWriteExecute=no
RestrictRealtime=no
RestrictNamespaces=no
MountAPIVFS=no
KeyringMode=private
KillMode=process
KillSignal=15
SendSIGKILL=yes
SendSIGHUP=no
Id=docker.service
Names=docker.service
Requires=docker.socket system.slice sysinit.target
Wants=network-online.target
BindsTo=containerd.service
WantedBy=multi-user.target
ConsistsOf=docker.socket
Conflicts=shutdown.target
Before=shutdown.target multi-user.target
After=network-online.target sysinit.target systemd-journald.socket docker.socket firewalld.service system.slice basic.target
TriggeredBy=docker.socket
Documentation=https://docs.docker.com
Description=Docker Application Container Engine
LoadState=loaded
ActiveState=active
SubState=running
FragmentPath=/lib/systemd/system/docker.service
UnitFileState=enabled
UnitFilePreset=enabled
StateChangeTimestamp=Wed 2019-01-16 18:58:43 UTC
StateChangeTimestampMonotonic=617366329
InactiveExitTimestamp=Wed 2019-01-16 18:58:43 UTC
InactiveExitTimestampMonotonic=616840337
ActiveEnterTimestamp=Wed 2019-01-16 18:58:43 UTC
ActiveEnterTimestampMonotonic=617366329
ActiveExitTimestamp=Wed 2019-01-16 18:58:43 UTC
ActiveExitTimestampMonotonic=616821889
InactiveEnterTimestamp=Wed 2019-01-16 18:58:43 UTC
InactiveEnterTimestampMonotonic=616830421
CanStart=yes
CanStop=yes
CanReload=yes
CanIsolate=no
StopWhenUnneeded=no
RefuseManualStart=no
RefuseManualStop=no
AllowIsolate=no
DefaultDependencies=yes
OnFailureJobMode=replace
IgnoreOnIsolate=no
NeedDaemonReload=no
JobTimeoutUSec=infinity
JobRunningTimeoutUSec=infinity
JobTimeoutAction=none
ConditionResult=yes
AssertResult=yes
ConditionTimestamp=Wed 2019-01-16 18:58:43 UTC
ConditionTimestampMonotonic=616838411
AssertTimestamp=Wed 2019-01-16 18:58:43 UTC
AssertTimestampMonotonic=616838412
Transient=no
Perpetual=no
StartLimitIntervalUSec=1min
StartLimitBurst=3
StartLimitAction=none
FailureAction=none
SuccessAction=none
InvocationID=0a6c59c63cee4d3a8f7521c7e5eb5f86
CollectMode=inactive

No kubectl


Packages

Have dpkg

Output of "dpkg -l|egrep "(cc-oci-runtime|cc-runtime|runv|kata-proxy|kata-runtime|kata-shim|kata-ksm-throttler|kata-containers-image|linux-container|qemu-)"":

ii  kata-containers-image          1.4.2-5                           amd64        Kata containers image
ii  kata-ksm-throttler             1.4.2.git+d70af5c-5               amd64        
ii  kata-linux-container           4.14.67.22-6                      amd64        linux kernel optimised for container-like workloads.
ii  kata-proxy                     1.4.2+git.a80aac2-5               amd64        
ii  kata-runtime                   1.4.2+git.a129b48-6               amd64        
ii  kata-shim                      1.4.2+git.8830bc4-5               amd64        
ii  qemu-lite                      2.11.0+git.87517afd72-5           amd64        linux kernel optimised for container-like workloads.
ii  qemu-vanilla                   2.11.2+git.0982a56a55-5           amd64        linux kernel optimised for container-like workloads.

No rpm



mehrdadrad commented 5 years ago

I just found the issue. When Kata is installed from packages (automatic or manual install, not compiled from source), the configuration.toml has no initrd entry. VM templating needs initrd set in place of image = "...":

initrd = "/usr/share/kata-containers/kata-containers-initrd.img"

then it works properly.
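A rough way to check a configuration file for the settings described above is sketched here. The function name `check_template_config` is hypothetical, and the check is a plain line match rather than a real TOML parser, so treat it as a quick sanity test only. It assumes that `enable_template = true` and an uncommented `initrd = ...` must be present and that no uncommented `image = ...` line remains.

```shell
# Hypothetical helper (not part of Kata): line-based check that a
# configuration.toml has the templating settings described in this thread.
check_template_config() {
    conf="$1"

    # enable_template must be set to true on an uncommented line.
    if ! grep -Eq '^[[:space:]]*enable_template[[:space:]]*=[[:space:]]*true' "$conf"; then
        echo "enable_template is not true"
        return 1
    fi

    # An uncommented initrd entry must exist.
    if ! grep -Eq '^[[:space:]]*initrd[[:space:]]*=' "$conf"; then
        echo "initrd is not set"
        return 1
    fi

    # initrd replaces image, so an uncommented image entry is flagged.
    if grep -Eq '^[[:space:]]*image[[:space:]]*=' "$conf"; then
        echo "image is still set"
        return 1
    fi

    echo "ok"
}
```

For example, `check_template_config /usr/share/defaults/kata-containers/configuration.toml` would check the configuration path reported by kata-env in this issue.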

mehrdadrad commented 5 years ago

two questions related to template:

bergwolf commented 5 years ago

@mehrdadrad Yes, VM templating requires an initrd to work. To answer your two questions: you can run kata-runtime factory init to create the VM template ahead of time if you do not want it to interfere with your first container creation, and kata-runtime factory destroy purges the template image from /run/vc/vm/template.
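As a minimal sketch, the two factory subcommands named above could be wrapped as follows. The function names and the `KATA_RUNTIME` variable are assumptions added for illustration (the variable only lets the wrapper be tried on a machine without Kata installed). On a real host the commands will likely need root privileges.

```shell
# Assumed override for illustration; defaults to the real binary name.
KATA_RUNTIME="${KATA_RUNTIME:-kata-runtime}"

# Create the VM template up front, so that the first container start
# does not have to create it.
template_init() {
    "$KATA_RUNTIME" factory init
}

# Purge the template (stored under /run/vc/vm/template per the comment above).
template_destroy() {
    "$KATA_RUNTIME" factory destroy
}
```

Running `template_init` once after enabling templating, and `template_destroy` before changing the template-related configuration, matches the workflow suggested in this thread.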

mehrdadrad commented 5 years ago

Thanks @bergwolf, init and destroy work for me!

bergwolf commented 5 years ago

Cool. @mehrdadrad, do you have any more questions or issues w.r.t. VM templating? If not, please close the issue. Thanks!

mehrdadrad commented 5 years ago

Thanks @bergwolf