clearcontainers / runtime

OCI (Open Containers Initiative) compatible runtime using Virtual Machines
Apache License 2.0

"message too long" crashes runtime #891

Open dhrp opened 6 years ago

dhrp commented 6 years ago

Description of problem

When I try to start a container with a (very) long docker run command, the runtime fails with an error like "message too long".

The real issue is that the runtime then immediately stops working: I can no longer run 'docker ps', or start or stop containers on this host, until I reboot.

Steps to reproduce:

I have traced the source of the "message too long" to here: https://github.com/containers/virtcontainers/blob/54b8c0cc68933561bc947c8b76aa491bb925f3ae/pkg/hyperstart/hyperstart.go#L345

journalctl throws some errors like:

Jan 04 08:28:07 c1.packet.nlze.nl cc-runtime[3104]: time="2018-01-04T08:28:07Z" level=error msg="Failed to start container" container-id=c3f94f1e50020520507db7c2f1ac305489bafe7357c5e9194ae9e3a8b3c9e61e error="message too long 29282" pod-id=c3f94f1e50020520507db7c2f1ac305489bafe7357c5e9194ae9e3a8b3c9e61e source=virtcontainers subsystem=container
Jan 04 08:28:07 c1.packet.nlze.nl cc-runtime[3104]: time="2018-01-04T08:28:07Z" level=warning msg="Failed to stop container" container-id=c3f94f1e50020520507db7c2f1ac305489bafe7357c5e9194ae9e3a8b3c9e61e error="Container not running, impossible to stop" pod-id=c3f94f1e50020520507db7c2f1ac305489bafe7357c5e9194ae9e3a8b3c9e61e source=virtcontainers subsystem=container
Jan 04 08:28:07 c1.packet.nlze.nl cc-runtime[3104]: time="2018-01-04T08:28:07Z" level=error msg="message too long 29282" source=runtime
Jan 04 08:28:07 c1.packet.nlze.nl dockerd[1069]: time="2018-01-04T08:28:07.246771101Z" level=error msg="containerd: start init process" error="exit status 1: \"message too long 29282\\n\""

Expected result

the container runs

Actual result

see above


root@c1:~# cc-collect-data.sh

Meta details

Running cc-collect-data.sh version 3.0.10 (commit 3d402d1) at 2018-01-04.08:59:52.838419177.


Runtime is /usr/bin/cc-runtime.

cc-env

Output of "/usr/bin/cc-runtime cc-env":

[Meta]
  Version = "1.0.6"

[Runtime]
  Debug = false
  [Runtime.Version]
    Semver = "3.0.10"
    Commit = "3d402d1"
    OCI = "1.0.0-dev"
  [Runtime.Config]
    Path = "/usr/share/defaults/clear-containers/configuration.toml"

[Hypervisor]
  MachineType = "pc"
  Version = "QEMU emulator version 2.7.1(2.7.1+git.d4a337fe91-9.cc), Copyright (c) 2003-2016 Fabrice Bellard and the QEMU Project developers"
  Path = "/usr/bin/qemu-lite-system-x86_64"
  Debug = false

[Image]
  Path = "/usr/share/clear-containers/clear-19490-containers.img"

[Kernel]
  Path = "/usr/share/clear-containers/vmlinuz-4.9.60-80.container"
  Parameters = ""

[Proxy]
  Type = "ccProxy"
  Version = "Version: 3.0.10+git.513b073"
  Path = "/usr/libexec/clear-containers/cc-proxy"
  Debug = false

[Shim]
  Type = "ccShim"
  Version = "shim version: 3.0.10 (commit: 0952966)"
  Path = "/usr/libexec/clear-containers/cc-shim"
  Debug = false

[Agent]
  Type = "hyperstart"
  Version = "<<unknown>>"

[Host]
  Kernel = "4.13.0-16-generic"
  CCCapable = true
  [Host.Distro]
    Name = "Ubuntu"
    Version = "16.04"
  [Host.CPU]
    Vendor = "GenuineIntel"
    Model = "Intel(R) Atom(TM) CPU  C2550  @ 2.40GHz"

Runtime config files

Runtime default config files

/usr/share/defaults/clear-containers/configuration.toml
/usr/share/defaults/clear-containers/configuration.toml

Runtime config file contents

Config file /etc/clear-containers/configuration.toml not found

Output of "cat "/usr/share/defaults/clear-containers/configuration.toml"":

# XXX: Warning: this file is auto-generated from file "config/configuration.toml.in".

[hypervisor.qemu]
path = "/usr/bin/qemu-lite-system-x86_64"
kernel = "/usr/share/clear-containers/vmlinuz.container"
image = "/usr/share/clear-containers/clear-containers.img"
machine_type = "pc"
# Optional space-separated list of options to pass to the guest kernel.
# For example, use `kernel_params = "vsyscall=emulate"` if you are having
# trouble running pre-2.15 glibc
kernel_params = ""

# Path to the firmware.
# If you want that qemu uses the default firmware leave this option empty
firmware = ""

# Machine accelerators
# comma-separated list of machine accelerators to pass to the hypervisor.
# For example, `machine_accelerators = "nosmm,nosmbus,nosata,nopit,static-prt,nofw"`
machine_accelerators=""

# Default number of vCPUs per POD/VM:
# unspecified or 0 --> will be set to 1
# < 0              --> will be set to the actual number of physical cores
# > 0 <= 255       --> will be set to the specified number
# > 255            --> will be set to 255
default_vcpus = -1

# Bridges can be used to hot plug devices.
# Limitations:
# * Currently only pci bridges are supported
# * Until 30 devices per bridge can be hot plugged.
# * Until 5 PCI bridges can be cold plugged per VM.
#   This limitation could be a bug in qemu or in the kernel
# Default number of bridges per POD/VM:
# unspecified or 0   --> will be set to 1
# > 1 <= 5           --> will be set to the specified number
# > 5                --> will be set to 5
default_bridges = 1

# Default memory size in MiB for POD/VM.
# If unspecified then it will be set 2048 MiB.
#default_memory = 2048
disable_block_device_use = false

# Enable pre allocation of VM RAM, default false
# Enabling this will result in lower container density
# as all of the memory will be allocated and locked
# This is useful when you want to reserve all the memory
# upfront or in the cases where you want memory latencies
# to be very predictable
# Default false
#enable_mem_prealloc = true

# Enable huge pages for VM RAM, default false
# Enabling this will result in the VM memory
# being allocated using huge pages.
# This is useful when you want to use vhost-user network
# stacks within the container. This will automatically 
# result in memory pre allocation
#enable_hugepages = true

# Enable swap of vm memory. Default false.
# The behaviour is undefined if mem_prealloc is also set to true
#enable_swap = true

# Debug changes the default hypervisor and kernel parameters to
# enable debug output where available.
# Default false
# these logs can be obtained in the cc-proxy logs  when the 
# proxy is set to run in debug mode
# /usr/libexec/clear-containers/cc-proxy -log debug
# or by stopping the cc-proxy service and running the cc-proxy 
# explicitly using the same command line
# 
#enable_debug = true

# Disable the customizations done in the runtime when it detects
# that it is running on top a VMM. This will result in the runtime
# behaving as it would when running on bare metal.
# 
#disable_nesting_checks = true

[proxy.cc]
path = "/usr/libexec/clear-containers/cc-proxy"

# If enabled, proxy messages will be sent to the system log
# (default: disabled)
#enable_debug = true

[shim.cc]
path = "/usr/libexec/clear-containers/cc-shim"

# If enabled, shim messages will be sent to the system log
# (default: disabled)
#enable_debug = true

[runtime]
# If enabled, the runtime will log additional debug messages to the
# system log
# (default: disabled)
#enable_debug = true

Logfiles

Runtime logs

Recent runtime problems found in system journal:

time="2018-01-04T08:22:04Z" level=info msg="launching qemu with: [-name pod-e66afb14c5a6002fb41ffe2b57f4c160f5120191a7078f3f2de021d45500022a -uuid 933ba436-8a25-4a2d-8f48-cbad6e7d4ef6 -machine pc,accel=kvm,kernel_irqchip,nvdimm -cpu host -qmp unix:/run/virtcontainers/pods/e66afb14c5a6002fb41ffe2b57f4c160f5120191a7078f3f2de021d45500022a/933ba436-8a25-4a2,server,nowait -qmp unix:/run/virtcontainers/pods/e66afb14c5a6002fb41ffe2b57f4c160f5120191a7078f3f2de021d45500022a/933ba436-8a25-4a2,server,nowait -m 2048M,slots=2,maxmem=8991M -smp 4,cores=4,threads=1,sockets=1 -device virtio-9p-pci,fsdev=ctr-9p-0,mount_tag=ctr-rootfs-0 -fsdev local,id=ctr-9p-0,path=/var/lib/docker/overlay2/f916ea6641641c3ffe6ade147083afd67253a030e49afd7f3ee015ef6eb80866/merged,security_model=none -device virtio-serial-pci,id=serial0 -device virtconsole,chardev=charconsole0,id=console0 -chardev socket,id=charconsole0,path=/run/virtcontainers/pods/e66afb14c5a6002fb41ffe2b57f4c160f5120191a7078f3f2de021d45500022a/console.sock,server,nowait -device nvdimm,id=nv0,memdev=mem0 -object memory-backend-file,id=mem0,mem-path=/usr/share/clear-containers/clear-19490-containers.img,size=235929600 -device pci-bridge,bus=pci.0,id=pci-bridge-0,chassis_nr=1,shpc=on -device virtserialport,chardev=charch0,id=channel0,name=sh.hyper.channel.0 -chardev socket,id=charch0,path=/run/virtcontainers/pods/e66afb14c5a6002fb41ffe2b57f4c160f5120191a7078f3f2de021d45500022a/hyper.sock,server,nowait -device virtserialport,chardev=charch1,id=channel1,name=sh.hyper.channel.1 -chardev socket,id=charch1,path=/run/virtcontainers/pods/e66afb14c5a6002fb41ffe2b57f4c160f5120191a7078f3f2de021d45500022a/tty.sock,server,nowait -device virtio-9p-pci,fsdev=extra-9p-hyperShared,mount_tag=hyperShared -fsdev local,id=extra-9p-hyperShared,path=/tmp/hyper/shared/pods/e66afb14c5a6002fb41ffe2b57f4c160f5120191a7078f3f2de021d45500022a,security_model=none -netdev tap,id=network-0,vhost=on,fds=3:4:5:6:7:8:9:10 -device 
driver=virtio-net-pci,netdev=network-0,mac=02:42:ac:11:00:02,mq=on,vectors=18 -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -vga none -no-user-config -nodefaults -nographic -daemonize -kernel /usr/share/clear-containers/vmlinuz-4.9.60-80.container -append root=/dev/pmem0p1 rootflags=dax,data=ordered,errors=remount-ro rw rootfstype=ext4 tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k panic=1 console=hvc0 console=hvc1 initcall_debug iommu=off cryptomgr.notests net.ifnames=0 quiet systemd.show_status=false init=/usr/lib/systemd/systemd systemd.unit=clear-containers.target systemd.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket ip=::::::e66afb14c5a6002fb41ffe2b57f4c160f5120191a7078f3f2de021d45500022a::off::]" source=virtcontainers subsystem=qmp
time="2018-01-04T08:22:06Z" level=warning msg="unsupported route" destination="fe80::/64" source=virtcontainers subsystem=hyper unsupported-route-type=ipv6
time="2018-01-04T08:22:06Z" level=error msg="Container not running, impossible to signal the container" source=runtime
time="2018-01-04T08:22:07Z" level=error msg="Container ID (e66afb14c5a6002fb41ffe2b57f4c160f5120191a7078f3f2de021d45500022a) does not exist" source=runtime
time="2018-01-04T08:22:12Z" level=info msg="launching qemu with: [-name pod-1460e238a1bb2010a5df0c9acc52d37f04673cb9bba0d65c517360cd23727a65 -uuid 26431fa5-aee4-4560-93a2-66588c60437f -machine pc,accel=kvm,kernel_irqchip,nvdimm -cpu host -qmp unix:/run/virtcontainers/pods/1460e238a1bb2010a5df0c9acc52d37f04673cb9bba0d65c517360cd23727a65/26431fa5-aee4-456,server,nowait -qmp unix:/run/virtcontainers/pods/1460e238a1bb2010a5df0c9acc52d37f04673cb9bba0d65c517360cd23727a65/26431fa5-aee4-456,server,nowait -m 2048M,slots=2,maxmem=8991M -smp 4,cores=4,threads=1,sockets=1 -device virtio-9p-pci,fsdev=ctr-9p-0,mount_tag=ctr-rootfs-0 -fsdev local,id=ctr-9p-0,path=/var/lib/docker/overlay2/c5eb50c23ac71bbaccd7165c3f5cc8c8be92c993c187b31781edb972bf2a14d1/merged,security_model=none -device virtio-serial-pci,id=serial0 -device virtconsole,chardev=charconsole0,id=console0 -chardev socket,id=charconsole0,path=/run/virtcontainers/pods/1460e238a1bb2010a5df0c9acc52d37f04673cb9bba0d65c517360cd23727a65/console.sock,server,nowait -device nvdimm,id=nv0,memdev=mem0 -object memory-backend-file,id=mem0,mem-path=/usr/share/clear-containers/clear-19490-containers.img,size=235929600 -device pci-bridge,bus=pci.0,id=pci-bridge-0,chassis_nr=1,shpc=on -device virtserialport,chardev=charch0,id=channel0,name=sh.hyper.channel.0 -chardev socket,id=charch0,path=/run/virtcontainers/pods/1460e238a1bb2010a5df0c9acc52d37f04673cb9bba0d65c517360cd23727a65/hyper.sock,server,nowait -device virtserialport,chardev=charch1,id=channel1,name=sh.hyper.channel.1 -chardev socket,id=charch1,path=/run/virtcontainers/pods/1460e238a1bb2010a5df0c9acc52d37f04673cb9bba0d65c517360cd23727a65/tty.sock,server,nowait -device virtio-9p-pci,fsdev=extra-9p-hyperShared,mount_tag=hyperShared -fsdev local,id=extra-9p-hyperShared,path=/tmp/hyper/shared/pods/1460e238a1bb2010a5df0c9acc52d37f04673cb9bba0d65c517360cd23727a65,security_model=none -netdev tap,id=network-0,vhost=on,fds=3:4:5:6:7:8:9:10 -device 
driver=virtio-net-pci,netdev=network-0,mac=02:42:ac:11:00:02,mq=on,vectors=18 -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -vga none -no-user-config -nodefaults -nographic -daemonize -kernel /usr/share/clear-containers/vmlinuz-4.9.60-80.container -append root=/dev/pmem0p1 rootflags=dax,data=ordered,errors=remount-ro rw rootfstype=ext4 tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k panic=1 console=hvc0 console=hvc1 initcall_debug iommu=off cryptomgr.notests net.ifnames=0 quiet systemd.show_status=false init=/usr/lib/systemd/systemd systemd.unit=clear-containers.target systemd.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket ip=::::::1460e238a1bb2010a5df0c9acc52d37f04673cb9bba0d65c517360cd23727a65::off::]" source=virtcontainers subsystem=qmp
time="2018-01-04T08:22:14Z" level=warning msg="unsupported route" destination="fe80::/64" source=virtcontainers subsystem=hyper unsupported-route-type=ipv6
time="2018-01-04T08:22:14Z" level=error msg="Container not running, impossible to signal the container" source=runtime
time="2018-01-04T08:22:14Z" level=error msg="Container ID (1460e238a1bb2010a5df0c9acc52d37f04673cb9bba0d65c517360cd23727a65) does not exist" source=runtime
time="2018-01-04T08:28:05Z" level=info msg="launching qemu with: [-name pod-c3f94f1e50020520507db7c2f1ac305489bafe7357c5e9194ae9e3a8b3c9e61e -uuid dd2e4284-d888-47e9-a675-254a89e0309c -machine pc,accel=kvm,kernel_irqchip,nvdimm -cpu host -qmp unix:/run/virtcontainers/pods/c3f94f1e50020520507db7c2f1ac305489bafe7357c5e9194ae9e3a8b3c9e61e/dd2e4284-d888-47e,server,nowait -qmp unix:/run/virtcontainers/pods/c3f94f1e50020520507db7c2f1ac305489bafe7357c5e9194ae9e3a8b3c9e61e/dd2e4284-d888-47e,server,nowait -m 2048M,slots=2,maxmem=8991M -smp 4,cores=4,threads=1,sockets=1 -device virtio-9p-pci,fsdev=ctr-9p-0,mount_tag=ctr-rootfs-0 -fsdev local,id=ctr-9p-0,path=/var/lib/docker/overlay2/bbdf37af4db6d656ff42d222ed760bed0c37baf31c54ec6d66ab878e8952457b/merged,security_model=none -device virtio-serial-pci,id=serial0 -device virtconsole,chardev=charconsole0,id=console0 -chardev socket,id=charconsole0,path=/run/virtcontainers/pods/c3f94f1e50020520507db7c2f1ac305489bafe7357c5e9194ae9e3a8b3c9e61e/console.sock,server,nowait -device nvdimm,id=nv0,memdev=mem0 -object memory-backend-file,id=mem0,mem-path=/usr/share/clear-containers/clear-19490-containers.img,size=235929600 -device pci-bridge,bus=pci.0,id=pci-bridge-0,chassis_nr=1,shpc=on -device virtserialport,chardev=charch0,id=channel0,name=sh.hyper.channel.0 -chardev socket,id=charch0,path=/run/virtcontainers/pods/c3f94f1e50020520507db7c2f1ac305489bafe7357c5e9194ae9e3a8b3c9e61e/hyper.sock,server,nowait -device virtserialport,chardev=charch1,id=channel1,name=sh.hyper.channel.1 -chardev socket,id=charch1,path=/run/virtcontainers/pods/c3f94f1e50020520507db7c2f1ac305489bafe7357c5e9194ae9e3a8b3c9e61e/tty.sock,server,nowait -device virtio-9p-pci,fsdev=extra-9p-hyperShared,mount_tag=hyperShared -fsdev local,id=extra-9p-hyperShared,path=/tmp/hyper/shared/pods/c3f94f1e50020520507db7c2f1ac305489bafe7357c5e9194ae9e3a8b3c9e61e,security_model=none -netdev tap,id=network-0,vhost=on,fds=3:4:5:6:7:8:9:10 -device 
driver=virtio-net-pci,netdev=network-0,mac=02:42:ac:11:00:02,mq=on,vectors=18 -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -vga none -no-user-config -nodefaults -nographic -daemonize -kernel /usr/share/clear-containers/vmlinuz-4.9.60-80.container -append root=/dev/pmem0p1 rootflags=dax,data=ordered,errors=remount-ro rw rootfstype=ext4 tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k panic=1 console=hvc0 console=hvc1 initcall_debug iommu=off cryptomgr.notests net.ifnames=0 quiet systemd.show_status=false init=/usr/lib/systemd/systemd systemd.unit=clear-containers.target systemd.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket ip=::::::c3f94f1e50020520507db7c2f1ac305489bafe7357c5e9194ae9e3a8b3c9e61e::off::]" source=virtcontainers subsystem=qmp
time="2018-01-04T08:28:07Z" level=warning msg="unsupported route" destination="fe80::/64" source=virtcontainers subsystem=hyper unsupported-route-type=ipv6
time="2018-01-04T08:28:07Z" level=error msg="Failed to start container" container-id=c3f94f1e50020520507db7c2f1ac305489bafe7357c5e9194ae9e3a8b3c9e61e error="message too long 29282" pod-id=c3f94f1e50020520507db7c2f1ac305489bafe7357c5e9194ae9e3a8b3c9e61e source=virtcontainers subsystem=container
time="2018-01-04T08:28:07Z" level=warning msg="Failed to stop container" container-id=c3f94f1e50020520507db7c2f1ac305489bafe7357c5e9194ae9e3a8b3c9e61e error="Container not running, impossible to stop" pod-id=c3f94f1e50020520507db7c2f1ac305489bafe7357c5e9194ae9e3a8b3c9e61e source=virtcontainers subsystem=container
time="2018-01-04T08:28:07Z" level=error msg="message too long 29282" source=runtime

Proxy logs

No recent proxy problems found in system journal.

Shim logs

No recent shim problems found in system journal.


Container manager details

Have docker

Docker

Output of "docker info":

Containers: 7
 Running: 0
 Paused: 0
 Stopped: 7
Images: 3
Server Version: 17.03.2-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: cc-runtime runc
Default Runtime: cc-runtime
Init Binary: docker-init
containerd version: 4ab9917febca54791c5f071a9d1f404867857fcc
runc version: 54296cf40ad8143b62dbcaa1d90e520a2136ddfe
init version: 949e6fa
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.13.0-16-generic
Operating System: Ubuntu 16.04.2 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.781 GiB
Name: c1.packet.nlze.nl
ID: LR3L:7FNI:AIJQ:OSJ2:435Y:WLKC:HMLT:R5D4:TWIW:V5KR:NTJI:AHMB
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 24
 Goroutines: 44
 System Time: 2018-01-04T08:59:52.941860501Z
 EventsListeners: 1
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

WARNING: No swap limit support

No kubectl


Packages

Have dpkg

Output of "dpkg -l|egrep "(cc-proxy|cc-runtime|cc-shim|clear-containers-image|linux-container|qemu-lite|qemu-system-x86|cc-oci-runtime)"":

ii  cc-proxy                       3.0.10+git.513b073-15                 amd64        
ii  cc-runtime                     3.0.10+git.3d402d1-15                 amd64        
ii  cc-runtime-bin                 3.0.10+git.3d402d1-15                 amd64        
ii  cc-runtime-config              3.0.10+git.3d402d1-15                 amd64        
ii  cc-shim                        3.0.10+git.0952966-15                 amd64        
ii  clear-containers-image         19490-41                              amd64        Clear containers image
ii  linux-container                4.9.60-80                             amd64        linux kernel optimised for container-like workloads.
ii  qemu-lite                      2.7.1+git.d4a337fe91-9                amd64        linux kernel optimised for container-like workloads.

No rpm


sboeuf commented 6 years ago

@dhrp nice catch! The culprit you're pointing to here: https://github.com/containers/virtcontainers/blob/54b8c0cc68933561bc947c8b76aa491bb925f3ae/pkg/hyperstart/hyperstart.go#L345 is clearly a legacy constraint. We don't rely on the same agent anymore and there is no reason to keep this check IMO. Could you first test that it works for you if you remove this part, and if that's the case, could you submit a proper PR to fix this issue?

dhrp commented 6 years ago

Perhaps you can give me some pointers on how to include / replace this and rebuild, as it is not obvious to me where the source comes from or how it gets into the pkg directory.

https://github.com/containers/virtcontainers/tree/54b8c0cc68933561bc947c8b76aa491bb925f3ae/pkg/hyperstart

I'm on #clearcontainers irc as thatcher

jodh-intel commented 6 years ago

Hi @dhrp - please don't feel you have to raise a PR, but if you'd like to contribute, that would be awesome.

What you'll need to do is:

If that all works, you can then raise a PR on virtcontainers (not the runtime) to remove the 10240 check at: https://github.com/containers/virtcontainers/pulls

Once that fix lands in virtcontainers, we can then raise a PR on the runtime to "re-vendor" (update) the version of virtcontainers used by the runtime. At that point the bug will have been fully fixed.

I've raised an issue on our tests repository so we don't forget to create a test to avoid this problem re-occurring: https://github.com/clearcontainers/tests/issues/817.

I'm also aware that the dev doc referenced above is missing the docker config instructions wrt the runtime I've outlined above, so I'll raise a PR to get that added...

jodh-intel commented 6 years ago

Dev guide update PR raised: #893.

dhrp commented 6 years ago

Thanks for the help @jodh-intel; it was confusing to see a pkg directory that is not a vendor dir, which is why I asked.

And I have good news and bad news:

The good news is that with the latest (master) the error still occurs but no longer completely crashes the runtime. It just says "message too long" (not a huge issue).

The bad news is that simply uncommenting the length check doesn't solve the problem. In fact, if I uncomment that line (and recompile), the system will crash and halt everything (again).

jodh-intel commented 6 years ago

Hi @dhrp - yes, there is more to this than we initially thought. In fact, the length check needs to be removed from the runtime, the agent and the proxy. But an env var > 2995 bytes kills the shim fwics:

$ sudo apt-get -y install utfout
$ export LC_ALL=C
$ docker run -e FOO=$(utfout a -r 2994) -ti busybox true
$ docker run -e FOO=$(utfout a -r 2995) -ti busybox true
handle_proxy_response:616:Error response received from proxy at /run/virtcontainers/pods/2b2136e05eff540ff854d76448b512132b8027132b3be77f19f0d4612c08985c/proxy.sock: {"msg":"vm: unknown token 4IgNPn14pYSZXjTQtgAnXtIQ2YHvuU5nU5bvZQkQq0s="}
/usr/libexec/clear-containers/cc-shim: Shim received an error in responseto ConnectShim command, exiting
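A rough Go sketch of how such a size threshold could be located without utfout follows. It is a sketch under stated assumptions: `envArg` and `firstFailing` are hypothetical helpers, the predicate in `main` is a stand-in based on the "> 2995 bytes" remark above rather than a real probe, and no assumption is made about how the utfout `-r` count maps to the final value length. A real probe would run `docker run -e <arg> -ti busybox true` and report whether it succeeded.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// envArg builds a NAME=value argument for "docker run -e", where the
// value is exactly n bytes of the letter 'a'.
func envArg(name string, n int) string {
	return name + "=" + strings.Repeat("a", n)
}

// firstFailing returns the smallest size in [lo, hi) for which ok reports
// false, or hi if every size passes. It assumes ok is monotonic: once a
// size fails, every larger size fails too.
func firstFailing(lo, hi int, ok func(size int) bool) int {
	return lo + sort.Search(hi-lo, func(i int) bool {
		return !ok(lo + i)
	})
}

func main() {
	// Stand-in predicate: pretend values of up to 2995 bytes work.
	// Replace with a function that actually starts a container.
	stub := func(size int) bool { return size <= 2995 }

	size := firstFailing(1, 10000, stub)
	fmt.Println("first failing value size:", size)
	fmt.Println("argument length:", len(envArg("FOO", size)))
}
```

Binary search keeps the number of container starts small (about 14 for a range of 10000 sizes), which matters when each failed start may leave the shim in a bad state.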

Could you take a look @amshinde?

jodh-intel commented 6 years ago

re-ping @amshinde :)