Closed GabyCT closed 5 years ago
Just to check: could this be the same vsock issue we have been debating in https://github.com/kata-containers/runtime/issues/1203? Has that Azure kernel maybe been updated?
@grahamwhaley , I am using Ubuntu 16.04 :S
I think the premise may still hold: if they've updated the kernel with the patch that busts vsock for us, then you will hit the problem. 16.04 is still in support I believe, so updates will keep happening.
uname -a
Linux firecracker 4.15.0-1037-azure #39~16.04.1-Ubuntu SMP Tue Jan 15 17:20:47 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
@GabyCT try updating to 4.18.0-1007
@devimc , it is not available
$ sudo apt-get install linux-image-4.18-azure
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package linux-image-4.18-azure
E: Couldn't find any package by glob 'linux-image-4.18-azure'
E: Couldn't find any package by regex 'linux-image-4.18-azure'
The highest is 4.15
@GabyCT I can confirm that the latest Azure Ubuntu 16.04 doesn't work when running with vsock.
The same error is happening on Fedora 28 with Linux gabyct-f28 4.19.13-200.fc28.x86_64
docker: Error response from daemon: OCI runtime create failed: Failed to check if grpc server is working: context deadline exceeded: unknown.
The logs on Fedora:
time="2019-02-08T17:53:36.459892062Z" level=info msg="No info could be fetched" arch=amd64 command=create container=c49f0f9f60262642f5efa1f38d9cac308a3910fc7162e34d26842ce25ce45641 error="open /run/vc/sbs/c49f0f9f60262642f5efa1f38d9cac308a3910fc7162e34d26842ce25ce45641/hypervisor.json: no such file or directory" function=init name=kata-runtime pid=122354 source=virtcontainers subsystem=firecracker
time="2019-02-08T17:53:36.464684481Z" level=warning msg="fetch sandbox device failed" arch=amd64 command=create container=c49f0f9f60262642f5efa1f38d9cac308a3910fc7162e34d26842ce25ce45641 error="open /run/vc/sbs/c49f0f9f60262642f5efa1f38d9cac308a3910fc7162e34d26842ce25ce45641/devices.json: no such file or directory" name=kata-runtime pid=122354 sandbox=c49f0f9f60262642f5efa1f38d9cac308a3910fc7162e34d26842ce25ce45641 sandboxid=c49f0f9f60262642f5efa1f38d9cac308a3910fc7162e34d26842ce25ce45641 source=virtcontainers subsystem=sandbox
time="2019-02-08T17:53:52.10947019Z" level=error msg="Failed to check if grpc server is working: context deadline exceeded" arch=amd64 command=create container=c49f0f9f60262642f5efa1f38d9cac308a3910fc7162e34d26842ce25ce45641 name=kata-runtime pid=122354 source=runtime
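To make entries like the ones above easier to spot, here is a minimal sketch of a filter that keeps only the warning and error lines and prints their level and message. The function name `kata_log_problems` is made up for illustration, and it assumes logfmt-style lines where `msg="..."` directly follows `level=...`, as in these logs. Messages containing escaped quotes would be cut short.

```shell
# Print "LEVEL: message" for every warning or error entry read from stdin.
# Assumes lines shaped like: time="..." level=error msg="..." key=value ...
kata_log_problems() {
    grep -E 'level=(warning|error)' |
        sed -E 's/.*level=([a-z]+) msg="([^"]*)".*/\1: \2/'
}
```

It could be fed from the journal, for example with `sudo journalctl -t kata-runtime --no-pager | kata_log_problems`.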
I also tried with 4.16 and the same error is present.
I tried with Firecracker version 0.14.0 on Fedora 28 with Linux gabyct-f28 4.19.13-200.fc28.x86_64 and the same error.
OK @GabyCT - if we are sure this is failing on a system that does not have the kernel problem, then we need to do two things:
1) Try to debug this a step further. If you enable all the kata debug options in the config file, do you find anything in the journalctl logs that might give us a clue?
2) Pull in the kata fc experts.... @mcastelino @sboeuf @egernst
Let's start to narrow in and focus on a single case (at a time) to either eliminate a setup/distro/kernel version or get closer to the core issue.
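For step 1, a minimal sketch of switching on the debug options, assuming the config file ships them as commented-out `#enable_debug = true` lines. The helper name `enable_kata_debug` is hypothetical. Note that in this report `/etc/kata-containers/configuration.toml` was not found, so the file in use is `/usr/share/defaults/kata-containers/configuration.toml`.

```shell
# Uncomment every "enable_debug" line in the given Kata configuration file.
# Edits the file in place (GNU sed), so run it from a root shell for system paths.
enable_kata_debug() {
    conf="$1"
    sed -i -e 's/^# *\(enable_debug\).*=.*$/\1 = true/g' "$conf"
}
```

After calling it on the config file and re-running the failing container, `sudo journalctl -t kata-runtime` should show the more verbose runtime entries.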
/cc @sameo
I did the same with QEMU + vsock on Fedora 28 with 4.19 and it is not working either, so maybe everything is related to the kernel.
@grahamwhaley and @chavafg, it looks like it is the kernel: I tried Ubuntu 18.04 on a VM in Azure with Linux ubuntunew 4.18.0-1008-azure #8~18.04.1-Ubuntu SMP Wed Jan 16 15:40:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux, this time with QEMU + vsock, and it fails with the same error.
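When comparing hosts like this, it may help to first rule out the trivial case where the host has no vhost-vsock device at all. Below is a rough sketch with a hypothetical helper, `check_vhost_vsock`. It only tests that the device node exists, so it cannot detect the kernel regression discussed in issue 1203, where the device is present but vsock still fails.

```shell
# Report whether the vhost-vsock device node exists on the host.
# The optional argument overrides the default path; it is there only
# so the check can be pointed at another location.
check_vhost_vsock() {
    dev="${1:-/dev/vhost-vsock}"
    if [ -e "$dev" ]; then
        echo "present: $dev"
    else
        echo "missing: $dev (try: sudo modprobe vhost_vsock)"
        return 1
    fi
}
```

Calling `check_vhost_vsock` with no argument checks `/dev/vhost-vsock`, which is provided by the `vhost_vsock` kernel module.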
I'll close this issue as it is related to https://github.com/kata-containers/runtime/issues/1203, and I can confirm that Firecracker is working on Ubuntu 18.04 with Linux testfire 4.18.0-1007-azure.
While doing
We are getting the following error
Looking at the logs from the runtime, I see
Here is the kata-collect info
Meta details
Running kata-collect-data.sh version 1.5.0 (commit d3c63e66e30fac72ff70190e9e8f166b3896880e) at 2019-02-08.15:23:31.406377910+0000.
Runtime is /usr/local/bin/kata-runtime.
kata-env
Output of "/usr/local/bin/kata-runtime kata-env":
Runtime config files
Runtime default config files
Runtime config file contents
Config file /etc/kata-containers/configuration.toml not found
Output of "cat "/usr/share/defaults/kata-containers/configuration.toml"":
KSM throttler
version
Output of " --version":
systemd service
Image details
Initrd details
No initrd
Logfiles
Runtime logs
Recent runtime problems found in system journal:
Proxy logs
No recent proxy problems found in system journal.
Shim logs
No recent shim problems found in system journal.
Throttler logs
No recent throttler problems found in system journal.
Container manager details
Have docker
Docker
Output of "docker version":
Output of "docker info":
Output of "systemctl show docker":
Have kubectl
Kubernetes
Output of "kubectl version":
Output of "kubectl config view":
Output of "systemctl show kubelet":
Have crio
Output of "crio --version":
Output of "systemctl show crio":
Packages
Have dpkg
Output of "dpkg -l|egrep "(cc-oci-runtime|cc-runtime|runv|kata-proxy|kata-runtime|kata-shim|kata-ksm-throttler|kata-containers-image|linux-container|qemu-)"":
Have rpm
Output of "rpm -qa|egrep "(cc-oci-runtime|cc-runtime|runv|kata-proxy|kata-runtime|kata-shim|kata-ksm-throttler|kata-containers-image|linux-container|qemu-)"":