clearcontainers / runtime

OCI (Open Containers Initiative) compatible runtime using Virtual Machines
Apache License 2.0
589 stars 70 forks source link

docker run fails with cmd id received 8 not matching expected 9 #825

Open bryteise opened 6 years ago

bryteise commented 6 years ago

Description of problem

docker fails to start container the first time it is attempted to be run

Expected result

docker run succeeds the first time

Actual result

root@clr-0552865481254b448d6c45c2a317a0c7 ~ # uname -r
4.14.1-446.native
proxyclr-0552865481254b448d6c45c2a317a0c7 ~ # systemctl enable --now docker cc3-
Created symlink /etc/systemd/system/multi-user.target.wants/docker.service → /usr/lib/systemd/system/docker.service.
Created symlink /etc/systemd/system/multi-user.target.wants/cc3-proxy.service → /usr/lib/systemd/system/cc3-proxy.service.
root@clr-0552865481254b448d6c45c2a317a0c7 ~ # docker run -it clearlinux
Unable to find image 'clearlinux:latest' locally
latest: Pulling from library/clearlinux
a59415a85b4b: Pull complete
Digest: sha256:2cf6333893ae5e28ab1645d9f448430f06167c791a9f9aee5976f81c573e3791
Status: Downloaded newer image for clearlinux:latest
docker: Error response from daemon: exit status 1: "CMD ID received 8 not matching expected 9\n".

(the next docker run -it clearlinux works)


Meta details

Running cc-collect-data.sh version 3.0.8 (commit ) at 2017-11-22.00:56:52.901084570.


Runtime is /usr/bin/cc-runtime.

cc-env

Output of "/usr/bin/cc-runtime cc-env":

[Meta]                                                                                                                              
  Version = "1.0.6"                                                                                                                  

[Runtime]
  [Runtime.Version]
    Semver = "3.0.8"
    Commit = ""
    OCI = "1.0.0-dev"
  [Runtime.Config]
    Path = "/usr/share/defaults/clear-containers/configuration.toml"

[Hypervisor]
  MachineType = "pc"
  Version = "QEMU emulator version 2.7.0, Copyright (c) 2003-2016 Fabrice Bellard and the QEMU Project developers"
  Path = "/usr/bin/qemu-lite-system-x86_64"

[Image]
  Path = "/usr/share/clear-containers/clear-19070-containers.img"

[Kernel]
  Path = "/usr/share/clear-containers/vmlinuz-4.9.60-80.container"
  Parameters = ""

[Proxy]
  Type = "ccProxy"
  Version = "Version: 3.0.8+"
  URL = "unix:///var/run/clear-containers/proxy.sock"

[Shim]
  Type = "ccShim"
  Version = "shim version: 3.0.8 (commit: 0)"
  Path = "/usr/libexec/clear-containers/cc-shim"
[Agent]
  Type = "hyperstart"
  Version = "<<unknown>>"

[Host]
  Kernel = "4.14.1-446.native"
  CCCapable = true
  [Host.Distro]
    Name = "Clear Linux OS for Intel Architecture"
    Version = "19210"
  [Host.CPU]
    Vendor = "GenuineIntel"
    Model = "Intel(R) Core(TM) i5-6260U CPU @ 1.80GHz"

Runtime config files

Runtime default config files

/etc/clear-containers/configuration.toml
/usr/share/defaults/clear-containers/configuration.toml

Runtime config file contents

Config file /etc/clear-containers/configuration.toml not found Output of "cat "/usr/share/defaults/clear-containers/configuration.toml"":

# XXX: Warning: this file is auto-generated from file "config/configuration.toml.in".

[hypervisor.qemu]
path = "/usr/bin/qemu-lite-system-x86_64"
kernel = "/usr/share/clear-containers/vmlinuz.container"
image = "/usr/share/clear-containers/clear-containers.img"
machine_type = "pc"
# Optional space-separated list of options to pass to the guest kernel.
# For example, use `kernel_params = "vsyscall=emulate"` if you are having
# trouble running pre-2.15 glibc
kernel_params = ""

# Default number of vCPUs per POD/VM:
# unspecified or 0 --> will be set to 1
# < 0              --> will be set to the actual number of physical cores
# > 0 <= 255       --> will be set to the specified number
# > 255            --> will be set to 255
default_vcpus = -1

# Default memory size in MiB for POD/VM.
# If unspecified then it will be set 2048 MiB.
#default_memory = 2048
disable_block_device_use = false

# Enable pre allocation of VM RAM, default false
# Enabling this will result in lower container density
# as all of the memory will be allocated and locked
# This is useful when you want to reserve all the memory
# upfront or in the cases where you want memory latencies
# to be very predictable
# Default false
#enable_mem_prealloc = true

# Enable huge pages for VM RAM, default false
# Enabling this will result in the VM memory
# being allocated using huge pages.
# This is useful when you want to use vhost-user network
# stacks within the container. This will automatically
# result in memory pre allocation
#enable_hugepages = true

# Enable swap of vm memory. Default false.
# The behaviour is undefined if mem_prealloc is also set to true
#enable_swap = true

# Debug changes the default hypervisor and kernel parameters to
# enable debug output where available.
# Default false
# these logs can be obtained in the cc-proxy logs  when the
# proxy is set to run in debug mode
# /usr/libexec/clear-containers/cc-proxy -log debug
# or by stopping the cc-proxy service and running the cc-proxy
# explicitly using the same command line
#
#enable_debug = true
                                                                                                                            [167/598]
# Disable the customizations done in the runtime when it detects
# that it is running on top a VMM. This will result in the runtime
# behaving as it would when running on bare metal.
#
#disable_nesting_checks = true

[proxy.cc]
url = "unix:///var/run/clear-containers/proxy.sock"

[shim.cc]
path = "/usr/libexec/clear-containers/cc-shim"

# If enabled, shim messages will be sent to the system log
# (default: disabled)
#enable_debug = true

[runtime]
# If enabled, the runtime will log additional debug messages to the
# system log
# (default: disabled)
#enable_debug = true

Logfiles

Runtime logs

Recent runtime problems found in system journal:

time="2017-11-22T00:50:44Z" level=info msg="launching qemu with: [-name pod-06ba91a1489bbe62d59de308dfa46706cf18adae2f98342c8448227fd
20b3b17 -uuid 30366261-3931-6131-3438-396262653632 -machine pc,accel=kvm,kernel_irqchip,nvdimm -cpu host,pmu=off -qmp unix:/run/virtc
ontainers/pods/06ba91a1489bbe62d59de308dfa46706cf18adae2f98342c8448227fd20b3b17/monitor.sock,server,nowait -qmp unix:/run/virtcontain
ers/pods/06ba91a1489bbe62d59de308dfa46706cf18adae2f98342c8448227fd20b3b17/ctrl.sock,server,nowait -m 2048M,slots=2,maxmem=4969M -smp 
1,cores=1,threads=1,sockets=1 -device virtio-9p-pci,disable-modern=true,fsdev=ctr-9p-0,mount_tag=ctr-rootfs-0 -fsdev local,id=ctr-9p-
0,path=/var/lib/docker/overlay2/e5ba8ae69c5e2c92cc686bc20105d6ec08d4069c637d126f106d1ef81e3a9f2b/merged,security_model=none -device v
irtio-serial-pci,disable-modern=true,id=serial0 -device virtconsole,chardev=charconsole0,id=console0 -chardev socket,id=charconsole0,
path=/run/virtcontainers/pods/06ba91a1489bbe62d59de308dfa46706cf18adae2f98342c8448227fd20b3b17/console.sock,server,nowait -device nvd
imm,id=nv0,memdev=mem0 -object memory-backend-file,id=mem0,mem-path=/usr/share/clear-containers/clear-19070-containers.img,size=23592
9600 -device virtserialport,chardev=charch0,id=channel0,name=sh.hyper.channel.0 -chardev socket,id=charch0,path=/run/virtcontainers/p
ods/06ba91a1489bbe62d59de308dfa46706cf18adae2f98342c8448227fd20b3b17/hyper.sock,server,nowait -device virtserialport,chardev=charch1,
id=channel1,name=sh.hyper.channel.1 -chardev socket,id=charch1,path=/run/virtcontainers/pods/06ba91a1489bbe62d59de308dfa46706cf18adae
2f98342c8448227fd20b3b17/tty.sock,server,nowait -device virtio-9p-pci,disable-modern=true,fsdev=extra-9p-hyperShared,mount_tag=hyperS
hared -fsdev local,id=extra-9p-hyperShared,path=/tmp/hyper/shared/pods/06ba91a1489bbe62d59de308dfa46706cf18adae2f98342c8448227fd20b3b
17,security_model=none -netdev tap,id=network-0,vhost=on,fds=3:4:5:6:7:8:9:10 -device driver=virtio-net-pci,netdev=network-0,mac=02:4
2:ac:11:00:02,disable-modern=true,mq=on,vectors=18 -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -vga none -no
-user-config -nodefaults -nographic -daemonize -kernel /usr/share/clear-containers/vmlinuz-4.9.60-80.container -append root=/dev/pmem
0p1 rootflags=dax,data=ordered,errors=remount-ro rw rootfstype=ext4 tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct
=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k panic=1 console=hvc0 console=hvc1 initcall_debug iommu=off cryp
tomgr.notests net.ifnames=0 quiet systemd.show_status=false init=/usr/lib/systemd/systemd systemd.unit=clear-containers.target system
d.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket ip=::::::06ba91a1489bbe62d59de308dfa46706cf18adae2f98342c8448227
fd20b3b17::off::]" source=virtcontainers subsystem=qmp                                                                               
time="2017-11-22T00:51:11Z" level=error msg="Failed to start container" container-id=06ba91a1489bbe62d59de308dfa46706cf18adae2f98342c
8448227fd20b3b17 error="CMD ID received 8 not matching expected 9" pod-id=06ba91a1489bbe62d59de308dfa46706cf18adae2f98342c8448227fd20
b3b17 source=virtcontainers subsystem=container                                                                                      
time="2017-11-22T00:51:11Z" level=warning msg="Failed to stop container" container-id=06ba91a1489bbe62d59de308dfa46706cf18adae2f98342
c8448227fd20b3b17 error="Container not running, impossible to stop" pod-id=06ba91a1489bbe62d59de308dfa46706cf18adae2f98342c8448227fd2
0b3b17 source=virtcontainers subsystem=container                                                                                     
time="2017-11-22T00:51:11Z" level=error msg="CMD ID received 8 not matching expected 9" source=runtime                               
time="2017-11-22T00:51:11Z" level=warning msg="Could not get veth interface %s: %seth0LinkByName() failed for veth name eth0: Link no
t found" source=virtcontainers                                                                                                       
time="2017-11-22T00:51:11Z" level=error msg="Container 06ba91a1489bbe62d59de308dfa46706cf18adae2f98342c8448227fd20b3b17 not ready or 
running, cannot send a signal" source=runtime                                                                                        
time="2017-11-22T00:51:11Z" level=error msg="Can not move from stopped to stopped" source=runtime                                    
time="2017-11-22T00:52:14Z" level=info msg="launching qemu with: [-name pod-8be794c754db6b65997dd886b3fa5d3e8c51a4896c900f5eed492812a
fd3dc36 -uuid 38626537-3934-6337-3534-646236623635 -machine pc,accel=kvm,kernel_irqchip,nvdimm -cpu host,pmu=off -qmp unix:/run/virtc
ontainers/pods/8be794c754db6b65997dd886b3fa5d3e8c51a4896c900f5eed492812afd3dc36/monitor.sock,server,nowait -qmp unix:/run/virtcontain
ers/pods/8be794c754db6b65997dd886b3fa5d3e8c51a4896c900f5eed492812afd3dc36/ctrl.sock,server,nowait -m 2048M,slots=2,maxmem=4969M -smp 
1,cores=1,threads=1,sockets=1 -device virtio-9p-pci,disable-modern=true,fsdev=ctr-9p-0,mount_tag=ctr-rootfs-0 -fsdev local,id=ctr-9p-
0,path=/var/lib/docker/overlay2/ff7e5bedb5148914147e643454f7f7e2e38b73e82d27473a2ac1ce7586b98c17/merged,security_model=none -device v
irtio-serial-pci,disable-modern=true,id=serial0 -device virtconsole,chardev=charconsole0,id=console0 -chardev socket,id=charconsole0,
path=/run/virtcontainers/pods/8be794c754db6b65997dd886b3fa5d3e8c51a4896c900f5eed492812afd3dc36/console.sock,server,nowait -device nvd
imm,id=nv0,memdev=mem0 -object memory-backend-file,id=mem0,mem-path=/usr/share/clear-containers/clear-19070-containers.img,size=23592
9600 -device virtserialport,chardev=charch0,id=channel0,name=sh.hyper.channel.0 -chardev socket,id=charch0,path=/run/virtcontainers/p
ods/8be794c754db6b65997dd886b3fa5d3e8c51a4896c900f5eed492812afd3dc36/hyper.sock,server,nowait -device virtserialport,chardev=charch1,
id=channel1,name=sh.hyper.channel.1 -chardev socket,id=charch1,path=/run/virtcontainers/pods/8be794c754db6b65997dd886b3fa5d3e8c51a48$
6c900f5eed492812afd3dc36/tty.sock,server,nowait -device virtio-9p-pci,disable-modern=true,fsdev=extra-9p-hyperShared,mount_tag=hyper$
hared -fsdev local,id=extra-9p-hyperShared,path=/tmp/hyper/shared/pods/8be794c754db6b65997dd886b3fa5d3e8c51a4896c900f5eed492812afd3d$
36,security_model=none -netdev tap,id=network-0,vhost=on,fds=3:4:5:6:7:8:9:10 -device driver=virtio-net-pci,netdev=network-0,mac=02:$
2:ac:11:00:02,disable-modern=true,mq=on,vectors=18 -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -vga none -n$
-user-config -nodefaults -nographic -daemonize -kernel /usr/share/clear-containers/vmlinuz-4.9.60-80.container -append root=/dev/pme$
0p1 rootflags=dax,data=ordered,errors=remount-ro rw rootfstype=ext4 tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direc$
=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k panic=1 console=hvc0 console=hvc1 initcall_debug iommu=off cry$
tomgr.notests net.ifnames=0 quiet systemd.show_status=false init=/usr/lib/systemd/systemd systemd.unit=clear-containers.target syste$
d.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket ip=::::::8be794c754db6b65997dd886b3fa5d3e8c51a4896c900f5eed4928$
2afd3dc36::off::]" source=virtcontainers subsystem=qmp
time="2017-11-22T00:52:24Z" level=error msg="Container not running, impossible to signal the container" source=runtime
time="2017-11-22T00:52:24Z" level=error msg="Container ID (8be794c754db6b65997dd886b3fa5d3e8c51a4896c900f5eed492812afd3dc36) does no$
 exist" source=runtime             
time="2017-11-22T00:52:45Z" level=info msg="launching qemu with: [-name pod-da9333fbfe74b92b064a5f15877e32376d7e5b67dc1bb81558220228$
8c6e569 -uuid 64613933-3333-6662-6665-373462393262 -machine pc,accel=kvm,kernel_irqchip,nvdimm -cpu host,pmu=off -qmp unix:/run/virt$
ontainers/pods/da9333fbfe74b92b064a5f15877e32376d7e5b67dc1bb81558220228e8c6e569/monitor.sock,server,nowait -qmp unix:/run/virtcontai$
ers/pods/da9333fbfe74b92b064a5f15877e32376d7e5b67dc1bb81558220228e8c6e569/ctrl.sock,server,nowait -m 2048M,slots=2,maxmem=4969M -smp
1,cores=1,threads=1,sockets=1 -device virtio-9p-pci,disable-modern=true,fsdev=ctr-9p-0,mount_tag=ctr-rootfs-0 -fsdev local,id=ctr-9p$
0,path=/var/lib/docker/overlay2/0cd343d3c5337101876ed7e5525247e61f11d94588c3f336a8674eac1fdbff47/merged,security_model=none -device $
irtio-serial-pci,disable-modern=true,id=serial0 -device virtconsole,chardev=charconsole0,id=console0 -chardev socket,id=charconsole0$
path=/run/virtcontainers/pods/da9333fbfe74b92b064a5f15877e32376d7e5b67dc1bb81558220228e8c6e569/console.sock,server,nowait -device nv$
imm,id=nv0,memdev=mem0 -object memory-backend-file,id=mem0,mem-path=/usr/share/clear-containers/clear-19070-containers.img,size=2359$
9600 -device virtserialport,chardev=charch0,id=channel0,name=sh.hyper.channel.0 -chardev socket,id=charch0,path=/run/virtcontainers/$
ods/da9333fbfe74b92b064a5f15877e32376d7e5b67dc1bb81558220228e8c6e569/hyper.sock,server,nowait -device virtserialport,chardev=charch1$
id=channel1,name=sh.hyper.channel.1 -chardev socket,id=charch1,path=/run/virtcontainers/pods/da9333fbfe74b92b064a5f15877e32376d7e5b6$
dc1bb81558220228e8c6e569/tty.sock,server,nowait -device virtio-9p-pci,disable-modern=true,fsdev=extra-9p-hyperShared,mount_tag=hyper$
hared -fsdev local,id=extra-9p-hyperShared,path=/tmp/hyper/shared/pods/da9333fbfe74b92b064a5f15877e32376d7e5b67dc1bb81558220228e8c6e$
69,security_model=none -netdev tap,id=network-0,vhost=on,fds=3:4:5:6:7:8:9:10 -device driver=virtio-net-pci,netdev=network-0,mac=02:$
2:ac:11:00:02,disable-modern=true,mq=on,vectors=18 -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -vga none -n$
-user-config -nodefaults -nographic -daemonize -kernel /usr/share/clear-containers/vmlinuz-4.9.60-80.container -append root=/dev/pme$
0p1 rootflags=dax,data=ordered,errors=remount-ro rw rootfstype=ext4 tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direc$
=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k panic=1 console=hvc0 console=hvc1 initcall_debug iommu=off cry$
tomgr.notests net.ifnames=0 quiet systemd.show_status=false init=/usr/lib/systemd/systemd systemd.unit=clear-containers.target syste$
d.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket ip=::::::da9333fbfe74b92b064a5f15877e32376d7e5b67dc1bb815582202$
8e8c6e569::off::]" source=virtcontainers subsystem=qmp
time="2017-11-22T00:56:48Z" level=error msg="Container not running, impossible to signal the container" source=runtime
time="2017-11-22T00:56:48Z" level=error msg="Container ID (da9333fbfe74b92b064a5f15877e32376d7e5b67dc1bb81558220228e8c6e569) does no$
 exist" source=runtime

Proxy logs

No recent proxy problems found in system journal.

Shim logs

No recent shim problems found in system journal.


Container manager details

Have docker

Docker

Output of "docker info":

Containers: 3                                                                                                                        
 Running: 0                                                                                                                          
 Paused: 0                                                                                                                           
 Stopped: 3                                                                                                                          
Images: 2                                                                                                                            
Server Version: 17.05.0-ce                                                                                                           
Storage Driver: overlay2                                                                                                             
 Backing Filesystem: extfs                                                                                                           
 Supports d_type: true                                                                                                               
 Native Overlay Diff: true                                                                                                           
Logging Driver: json-file                                                                                                            
Cgroup Driver: cgroupfs                                                                                                              
Plugins:                                                                                                                             
 Volume: local                                                                                                                       
 Network: bridge host macvlan null overlay                                                                                           
Swarm: inactive                                                                                                                      
Runtimes: cc-runtime runc                                                                                                            
Default Runtime: cc-runtime                                                                                                          
Init Binary: docker-init                                                                                                             
containerd version:  (expected: 9048e5e50717ea4497b757314bad98ea3763c145)                                                            
runc version: N/A (expected: 9c2d8d184e5da67c95d601382adf14862e4f2228)                                                               
init version: N/A (expected: )                                                                                                       
Kernel Version: 4.14.1-446.native                                                                                                    
Operating System: Clear Linux OS for Intel Architecture                                                                              
OSType: linux                                                                                                                        
Architecture: x86_64                                                                                                                 
CPUs: 1                                                                                                                              
Total Memory: 3.853GiB                                                                                                               
Name: clr-0552865481254b448d6c45c2a317a0c7                                                                                           
ID: 3GDW:MSCJ:UEOV:P3NJ:TLIC:FRRV:LTHP:2NEP:PNNZ:RR43:PDHX:THOA                                                                      
Docker Root Dir: /var/lib/docker                                                                                                     
Debug Mode (client): false                                                                                                           
Debug Mode (server): false                                                                                                           
Registry: https://index.docker.io/v1/                                                                                                
Experimental: false                                                                                                                  
Insecure Registries:                                                                                                                 
 127.0.0.0/8                                                                                                                         
Live Restore Enabled: false                                                                                                          

No kubectl


Packages

No dpkg
No rpm


jodh-intel commented 6 years ago

I'm running F26 on the host then booting clr using start_qemu.sh clear-19530-kvm.img and starting a Clear Container. The only change I needed to make was to start_qemu.sh to increase the memory assigned to the hypervisor (from 1024 to 4096).

Is this error repeatable for you (for new installations)?

The error is coming from virtcontainers but points to an issue with the agent running inside the guest O/S.

@sboeuf - any ideas?

jodh-intel commented 6 years ago

If it is repeatable, please enable the proxy debug option and re-run to see if you get anything else in the journal: