crc-org / crc

CRC is a tool to help you run containers. It manages a local OpenShift 4.x cluster, MicroShift, or a Podman VM optimized for testing and development purposes.
https://crc.dev
Apache License 2.0

Failed to connect to the CRC VM with SSH on Centos7/KVM (nested virtualization) #1028

Closed cgoguyer closed 3 years ago

cgoguyer commented 4 years ago

General information

CRC version

crc version: 1.6.0+8ef676f
OpenShift version: 4.3.0 (embedded in binary)

CRC status

ERRO error: stat /home/crc/.crc/machines/crc/kubeconfig: no such file or directory
 - exit status 1

CRC config

None

Host Operating System

NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

Steps to reproduce

  1. crc setup
  2. crc start

Expected

Successful start of crc

Actual

Failed to connect to the CRC VM with SSH

Logs

start.log

crc.log

praveenkumar commented 4 years ago

@cgoguyer you hit the bug that we are also hitting in our CI [0]. Are you by any chance using a cloud provider to run crc, or is it a bare-metal machine? Can you also try the following and see if it is still reproducible?

$ crc delete 
$ crc start

[0] https://bugzilla.redhat.com/show_bug.cgi?id=1803130

cgoguyer commented 4 years ago

@praveenkumar : Yes, it's reproducible. crc delete followed by crc start does not help; crc never starts. I'm currently using a VM server.

praveenkumar commented 4 years ago

@cgoguyer Can you also try the following and post the logs here, just to make sure you are hitting the same issue that we are seeing?

<Terminal-1> $ crc start
<Terminal-2> $ watch sudo virsh list
---- As soon as you see the crc VM is starting ---- 
Exit from the watch [ Ctrl + c ]
<Terminal-2> $ sudo virsh console crc

Attach as much of the console output as possible.
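
The steps above can be sketched as a small shell helper, so the console is attached as soon as the VM shows up. This is only a sketch: the function name wait_for_domain and the VIRSH variable are hypothetical, and the domain is assumed to be named crc as in this thread.

```shell
#!/bin/sh
# Sketch only: wait_for_domain and VIRSH are illustrative names, not part of crc.
# VIRSH can be overridden, e.g. when sudo is not needed.
VIRSH="${VIRSH:-sudo virsh}"

# Poll "virsh list --name" once per second until the given domain is running.
# $1 = domain name, $2 = maximum number of attempts (default 60).
wait_for_domain() {
    domain="$1"
    tries="${2:-60}"
    while [ "$tries" -gt 0 ]; do
        # "virsh list --name" prints one running domain name per line
        if $VIRSH list --name | grep -qx "$domain"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

With crc start running in the first terminal, the second terminal could then run: wait_for_domain crc && $VIRSH console crc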

cgoguyer commented 4 years ago

@praveenkumar : I got no output from "virsh console crc"

Terminal 1:

[crc@srv-devtest02-d ~]$ crc start
INFO Checking if oc binary is cached
INFO Checking if running as non-root
INFO Checking if Virtualization is enabled
INFO Checking if KVM is enabled
INFO Checking if libvirt is installed
INFO Checking if user is part of libvirt group
INFO Checking if libvirt is enabled
INFO Checking if libvirt daemon is running
INFO Checking if a supported libvirt version is installed
INFO Checking if crc-driver-libvirt is installed
INFO Checking if libvirt 'crc' network is available
INFO Checking if libvirt 'crc' network is active
INFO Checking if NetworkManager is installed
INFO Checking if NetworkManager service is running
INFO Checking if /etc/NetworkManager/conf.d/crc-nm-dnsmasq.conf exists
INFO Checking if /etc/NetworkManager/dnsmasq.d/crc.conf exists
INFO Starting CodeReady Containers VM for OpenShift 4.3.0...
ERRO Failed to connect to the CRC VM with SSH

Terminal 2:

[root@srv-devtest02-d ~]# watch sudo virsh list
 Id    Name                           State
----------------------------------------------------
 9     crc                            running
[root@srv-devtest02-d ~]# sudo virsh console crc
Connected to domain crc
Escape character is ^]

praveenkumar commented 4 years ago

I got no output from "virsh console crc"

This shouldn't be the case. Did you start the console as soon as the crc VM showed up in the virsh list, or only after CRC had failed? How long did you wait? Also try pressing Enter; if you opened the console late, it should at least show the login prompt.

gbraad commented 4 years ago

Can you check what is on the console with the graphical interface?

cgoguyer commented 4 years ago

@gbraad : I'm using a Linux server without a GUI.

gbraad commented 4 years ago

We actually do not support headless setups, as this also means people want remote access. We do not expose the VM's endpoints for external use.

Have you tried CRC on a desktop system?

gbraad commented 4 years ago

I'm currently using a VM server.

You mean you use nested virtualization?

cgoguyer commented 4 years ago

Yes, I'm using nested virtualization. I've installed a GUI and run crc start. I saw some activity in Virtual Machine Manager, but the KVM console only shows "Guest has not initialized the display (yet)".

praveenkumar commented 4 years ago

@cgoguyer can you post a screenshot of what you see on the virt-manager side?

gbraad commented 4 years ago

Does any other VM start in this setup, like a stock CentOS or Ubuntu, alongside the CRC RHCOS image?

cgoguyer commented 4 years ago

@gbraad : No. I tried to create another VM with virt-manager (from another Linux server that has a GUI installed) and it crashed with the same error. There seems to be an issue with nested KVM on this server! I assume that KVM should work fine on a headless server... still searching for a solution...

cfergeau commented 4 years ago

The kernel log might give a hint, and the libvirt VM logs in /var/log/libvirt/qemu might have more details. For what it's worth, you can run virt-manager or virt-viewer and connect to a remote GUI-less server, and still have access to the graphical VM console through SPICE or VNC.
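
As a rough sketch of how the libvirt VM log could be pulled for attaching here: the helper name show_vm_log is hypothetical, and it assumes the per-domain log lives at /var/log/libvirt/qemu/<domain>.log, which may differ on other setups.

```shell
#!/bin/sh
# Sketch only: show_vm_log is an illustrative helper, not a crc or libvirt command.
# $1 = libvirt domain name (e.g. crc)
# $2 = log directory (default: /var/log/libvirt/qemu)
# $3 = number of trailing lines to print (default: 50)
show_vm_log() {
    domain="$1"
    logdir="${2:-/var/log/libvirt/qemu}"
    log="$logdir/$domain.log"
    if [ -r "$log" ]; then
        tail -n "${3:-50}" "$log"
    else
        # The log is usually only readable by root
        echo "cannot read $log (try running with sudo)" >&2
        return 1
    fi
}
```

For example, show_vm_log crc prints the last 50 lines of the crc domain log. Kernel messages related to KVM can be filtered separately with dmesg | grep -i kvm.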

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

anjannath commented 4 years ago

@cgoguyer Can you please provide the logs as suggested in https://github.com/code-ready/crc/issues/1028#issuecomment-601134662

dlmorais-13 commented 4 years ago

Any news on that issue? I'm having almost the same problem, same symptoms, only difference is that I'm using CentOS 8.

cfergeau commented 4 years ago

Also using nested virtualization? It would be useful to extract logs from the crc VM if there's nothing relevant in the host logs.

dlmorais-13 commented 4 years ago

@cfergeau it is also nested virtualization. I'm running CentOS 8 in a VM inside XenServer. The logs from the crc VM (/var/log/libvirtd/qemu/crc.log) show nothing relevant AFAIK.

2020-06-30 20:20:59.720+0000: starting up libvirt version: 4.5.0, package: 42.module_el8.2.0+320+13f867d7 (CentOS Buildsys <bugs@centos.org>, 2020-05-28-17:13:31, ), qemu version: 2.12.0qemu-kvm-2.12.0-99.module_el8.2.0+320+13f867d7, kernel: 4.18.0-193.el8.x86_64, hostname: vlcd-crcontainer
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name guest=crc,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-21-crc/master-key.aes -machine pc-i440fx-rhel7.6.0,accel=kvm,usb=off,dump-guest-core=off -cpu host,rdrand=off,kvm=off -m 8790 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid 711028d2-cdbc-41fe-a836-5a37f17c932c -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=29,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot menu=off,strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive file=/home/openshift/.crc/machines/crc/crc,format=qcow2,if=none,id=drive-virtio-disk0,aio=threads -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=31,id=hostnet0,vhost=on,vhostfd=32 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:fd:fc:07:21:82,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charchannel0 -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -vnc 127.0.0.1:0 -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -object rng-random,id=objrng0,filename=/dev/urandom -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x7 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
2020-06-30 20:20:59.720+0000: Domain id=21 is tainted: host-cpu
2020-06-30T20:20:59.782532Z qemu-kvm: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2020-06-30T20:20:59.782693Z qemu-kvm: -chardev pty,id=charchannel0: char device redirected to /dev/pts/1 (label charchannel0)
2020-06-30T20:20:59.815844Z qemu-kvm: -device cirrus-vga,id=video0,bus=pci.0,addr=0x2: warning: 'cirrus-vga' is deprecated, please use a different VGA card instead

I'm no expert in all this, so I'm not completely sure whether I'm checking all the relevant logs.

The things I can see are:

I also tried to create a CoreOS VM using instructions from this url: https://coreos.com/os/docs/latest/booting-with-libvirt.html The result is exactly the same as the one described above.

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

ligius- commented 3 years ago

Same issue, using a KVM environment with AMD Opteron (remote hosting)

https://gist.github.com/ligius-/3f681998d4da8cdf2ae63b395864b991

The VM hangs on booting, similar to how it hung when I used RHCOS instead of CentOS. It booted fine under Ubuntu with the same virtualized machine.

$ sudo virsh console crc
Connected to domain crc
Escape character is ^]
error: internal error: character device console0 is not using a PTY

EDIT: The CPU configuration is set to host-passthrough (host-model), but I've tried the other options in virt-manager as well. It does get stuck at ~9 seconds if only one core is enabled, instead of <3 seconds with the default 4 cores.

cfergeau commented 3 years ago

Nested virt on AMD machines has been problematic in the past. Make sure you are using the latest BIOS available. What is the L0 host?
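
If the L0 host is itself a Linux/KVM machine (this does not apply to other hypervisors such as XenServer), a minimal sketch for checking whether nested virtualization is switched on there could look like the following. The function name nested_kvm_status is hypothetical; the sysfs paths are the standard nested parameters of the kvm_intel and kvm_amd modules.

```shell
#!/bin/sh
# Sketch only: nested_kvm_status is an illustrative helper, meant to be run on the L0 host.
# $1 = sysfs module root (default: /sys/module); overridable for dry runs.
nested_kvm_status() {
    root="${1:-/sys/module}"
    for mod in kvm_intel kvm_amd; do
        f="$root/$mod/parameters/nested"
        if [ -r "$f" ]; then
            # Depending on the kernel, the value is reported as Y/N or 1/0
            case "$(cat "$f")" in
                Y|y|1) echo "$mod: nested enabled" ;;
                *)     echo "$mod: nested disabled" ;;
            esac
            return 0
        fi
    done
    echo "no kvm_intel or kvm_amd module loaded"
    return 1
}
```

Calling nested_kvm_status with no arguments reads the live values from /sys/module.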