hunter86bg closed this issue 3 years ago
@hunter86bg Can you also post the logs of what is in the <SNIP>, because we first use the key that is part of the bundle for authentication and then generate the new key, which is stored at .crc/machines/crc/id_rsa.
I've run:
[snikolov@openshift1 ~]$ crc delete -f
[snikolov@openshift1 ~]$ crc cleanup
[snikolov@openshift1 ~]$ crc setup
[snikolov@openshift1 ~]$ crc config set pull-secret-file /home/snikolov/secret
[snikolov@openshift1 ~]$ crc config view
- pull-secret-file : /home/snikolov/secret
[snikolov@openshift1 ~]$ crc start --log-level debug &> debuglog
For me there is a problem with the VM, as it couldn't detect the IP.
Any idea what is going on?
It looks like the VM is not starting for some reason. Can you check whether it's running or not with virsh -c qemu:///system list --all? If it's not running, I'd look at the logs in /var/log/libvirt/qemu/crc.log. If it's running, you should be able to connect to it with virt-viewer -c qemu:///system crc to see its display.
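For reference, the debugging flow above can be sketched as a small script. The domain name 'crc' and the log path come from this thread; the check_domain helper is a hypothetical wrapper, not part of crc itself:

```shell
#!/bin/sh
# Sketch of the debugging flow above; 'crc' is the libvirt domain name
# used by CodeReady Containers. check_domain is a hypothetical helper.

check_domain() {
    # Map a domain state (as printed by 'virsh domstate') to a next step.
    case "$1" in
        "running")  echo "connect: virt-viewer -c qemu:///system crc" ;;
        "shut off") echo "inspect: /var/log/libvirt/qemu/crc.log" ;;
        *)          echo "unexpected state: $1" ;;
    esac
}

# On a real host you would feed it live data:
#   check_domain "$(virsh -c qemu:///system domstate crc)"
check_domain "shut off"
```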
Actually you are right, the VM is frozen. I will try to debug it further.
Something in the VM definition is not OK with the libvirt/qemu on EL8 and it fails to properly boot. I have removed the VM's disks, created new ones, and reinstalled vanilla CentOS 7, and it seems to be working:
[snikolov@openshift1 ~]$ crc start
INFO Checking if running as non-root
INFO Checking if podman remote executable is cached
INFO Checking if admin-helper executable is cached
INFO Checking minimum RAM requirements
INFO Checking if Virtualization is enabled
INFO Checking if KVM is enabled
INFO Checking if libvirt is installed
INFO Checking if user is part of libvirt group
INFO Checking if libvirt daemon is running
INFO Checking if a supported libvirt version is installed
INFO Checking if crc-driver-libvirt is installed
INFO Checking if systemd-networkd is running
INFO Checking if NetworkManager is installed
INFO Checking if NetworkManager service is running
INFO Checking if /etc/NetworkManager/conf.d/crc-nm-dnsmasq.conf exists
INFO Checking if /etc/NetworkManager/dnsmasq.d/crc.conf exists
INFO Checking if libvirt 'crc' network is available
INFO Checking if libvirt 'crc' network is active
? Image pull secret [? for help] **********************************
INFO Loading bundle: crc_libvirt_4.6.9.crcbundle ...
INFO Verifying bundle crc_libvirt_4.6.9.crcbundle ...
INFO Creating CodeReady Containers VM for OpenShift 4.6.9...
INFO CodeReady Containers VM is running
INFO Generating new SSH Key pair ...
INFO Updating authorized keys ...
INFO Copying kubeconfig file to instance dir ...
INFO Starting network time synchronization in CodeReady Containers VM
INFO Check internal and public DNS query ...
INFO Check DNS query from host ...
INFO Adding user's pull secret to instance disk...
INFO Verifying validity of the kubelet certificates ...
INFO Starting OpenShift kubelet service
INFO Adding user's pull secret to the cluster ...
INFO Updating cluster ID ...
INFO Starting OpenShift cluster ... [waiting 3m]
INFO Updating kubeconfig
WARN Skipping the kubeconfig update. Cluster operator authentication still not ready after 2min.
WARN The cluster might report a degraded or error state. This is expected since several operators have been disabled to lower the resource usage. For more information, please consult the documentation
Started the OpenShift cluster
To access the cluster, first set up your environment by following the instructions returned by executing 'crc oc-env'.
Then you can access your cluster by running 'oc login -u developer -p developer https://api.crc.testing:6443'.
To login as a cluster admin, run 'oc login -u kubeadmin -p D4tTL-hJYdy-qe36Y-okTwh https://api.crc.testing:6443'.
You can also run 'crc console' and use the above credentials to access the OpenShift web console.
The console will open in your default browser.
Something in the VM definition is not OK with the libvirt/qemu on EL8 and it fails to properly boot.
The libvirtd logs/VM logs would have been useful to try to figure this out :) I have crc running fine on EL8 on my daily test machine.
My setup is quite specific:
As CentOS 7 works fine (same VM), the VM hosting 'crc' is OK. I checked that the 'crc' VM is using host-passthrough, so the CPU type is OK.
I will spin up a RHEL 8.3 VM to check if it's reproducible there.
It's definitely not a distro issue. Same behaviour on RHEL 8.3.
After I touched the CPU settings, the VM now stops at the 7th second of the boot process (previously it was at 0.3 s).
Would be useful to see what's going on in the VM, can you try https://github.com/code-ready/crc/issues/1853#issuecomment-764805370 or https://github.com/code-ready/crc/issues/1853#issuecomment-766761975 ?
It just gets stuck. With the default XML it hangs at 0.3 s of the boot, and with a reduced CPU count it hangs at the 7th second.
I will try to provide more details once I make some progress.
It just gets stuck. With the default XML it hangs at 0.3 s of the boot, and with a reduced CPU count it hangs at the 7th second.
This doesn't tell much about which step exactly it's stuck on ;) By any chance, is this happening with a Ryzen CPU? And were your latest tests on non-nested setups?
I can provide screenshots when I'm back. Still nested, as I don't have spare hardware and the hosts are too small for an oVirt/RHV-based deployment of OpenShift. The VM is using host-passthrough (FX-8350) and has all the necessary flags for virtualization.
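The flag check mentioned above can be sketched like this; svm is the AMD-V flag and vmx its Intel counterpart, and has_virt_flag is a hypothetical helper, not a command from this thread:

```shell
#!/bin/sh
# Sketch: verify a CPU virtualization flag is exposed to this system.
# svm = AMD-V, vmx = Intel VT-x; one of them must appear in /proc/cpuinfo
# for this machine to act as a (nested) hypervisor.

has_virt_flag() {
    # Report the first virtualization flag found in the given flag list.
    case " $1 " in
        *" svm "*) echo "svm" ;;
        *" vmx "*) echo "vmx" ;;
        *)         echo "none" ;;
    esac
}

# On a live system:
#   flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)
#   has_virt_flag "$flags"
has_virt_flag "fpu vme svm sse2"
```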
On AMD, I'd recommend making sure you are using the latest BIOS version; this has been causing issues on some systems.
On EL7 the following is generated and works without any issues:
<domain type='kvm' id='1'>
  <name>crc</name>
  <uuid>95b9eb62-8d28-4ef8-99c9-7160773e2eaf</uuid>
  <memory unit='KiB'>9000960</memory>
  <currentMemory unit='KiB'>9000960</currentMemory>
  <vcpu placement='static'>4</vcpu>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.0.0'>hvm</type>
    <boot dev='hd'/>
    <bootmenu enable='no'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <cpu mode='host-passthrough' check='none'>
    <feature policy='disable' name='rdrand'/>
  </cpu>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' io='threads'/>
      <source file='/home/snikolov/.crc/machines/crc/crc.qcow2'/>
      <backingStore type='file' index='1'>
        <format type='qcow2'/>
        <source file='/home/snikolov/.crc/cache/crc_libvirt_4.6.9/crc.qcow2'/>
        <backingStore/>
      </backingStore>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
    <controller type='usb' index='0' model='piix3-uhci'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'>
      <alias name='pci.0'/>
    </controller>
    <interface type='network'>
      <mac address='52:fd:fc:07:21:82'/>
      <source network='crc' bridge='crc'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/1'/>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
      <alias name='serial0'/>
    </serial>
    <console type='pty' tty='/dev/pts/1'>
      <source path='/dev/pts/1'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <input type='mouse' bus='ps2'>
      <alias name='input0'/>
    </input>
    <input type='keyboard' bus='ps2'>
      <alias name='input1'/>
    </input>
    <graphics type='vnc' port='5900' autoport='yes' listen='127.0.0.1'>
      <listen type='address' address='127.0.0.1'/>
    </graphics>
    <video>
      <model type='cirrus' vram='16384' heads='1' primary='yes'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='none'/>
    <rng model='virtio'>
      <backend model='random'>/dev/urandom</backend>
      <alias name='rng0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </rng>
  </devices>
  <seclabel type='dynamic' model='selinux' relabel='yes'>
    <label>system_u:system_r:svirt_t:s0:c348,c364</label>
    <imagelabel>system_u:object_r:svirt_image_t:s0:c348,c364</imagelabel>
  </seclabel>
  <seclabel type='dynamic' model='dac' relabel='yes'>
    <label>+107:+107</label>
    <imagelabel>+107:+107</imagelabel>
  </seclabel>
</domain>
Dropping the vCPU count of the 'crc' VM to 1 allows me to reach:
The original config gets stuck at:
As you can see, it can't reach the switch_root.
On EL7 the following is generated and works without any issues:
So EL7 + this xml + nested virt works fine, but EL8 + default xml + nested virt fails? Have you tried the el7 xml on the EL8 machine? kernel/qemu will be different between EL7 and EL8, this makes it hard to know if it works because of the XML, or because of the different kernel/qemu.
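Comparing the two definitions can be sketched like this; extract_cpu is a hypothetical sed helper and the file names are placeholders, not commands from this thread:

```shell
#!/bin/sh
# Sketch: pull the <cpu> element out of a libvirt domain XML so the
# EL7- and EL8-generated definitions can be diffed side by side.

extract_cpu() {
    # Naive extraction of the <cpu>...</cpu> stanza from a dumped XML file.
    sed -n '/<cpu[ >]/,/<\/cpu>/p' "$1"
}

# On each host (placeholder file names):
#   virsh -c qemu:///system dumpxml crc > crc-el7.xml   # resp. crc-el8.xml
#   extract_cpu crc-el7.xml > cpu-el7.txt
#   extract_cpu crc-el8.xml > cpu-el8.txt
#   diff cpu-el7.txt cpu-el8.txt

# Tiny self-contained demo with the stanza from this thread:
cat > /tmp/crc-demo.xml <<'EOF'
<domain>
  <cpu mode='host-passthrough' check='none'>
    <feature policy='disable' name='rdrand'/>
  </cpu>
</domain>
EOF
extract_cpu /tmp/crc-demo.xml
```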
I already tried the EL7 default config on nested EL8 -> it doesn't work.
This looks like a kernel/qemu/hypervisor/cpu related nested virtualization issue, I'm afraid we won't be able to do much for it in crc :-( Double check https://github.com/code-ready/crc/issues/1853#issuecomment-775252444 in case it magically solves it.
I will try to deploy another VM on that oVirt virtual hypervisor and double-check whether it's crc-related or a nested virt issue.
Sadly you are right, nested virtualization is not working. According to https://access.redhat.com/solutions/58204, nested virtualization for AMD CPUs is in tech preview and requires the bare-metal host to also be running EL8.
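Checking whether the kvm_amd module allows nested guests can be sketched like this; the sysfs path is the standard kvm_amd parameter location (not taken from this thread), and classify_nested is a hypothetical helper:

```shell
#!/bin/sh
# Sketch: inspect the kvm_amd 'nested' module parameter; a value of
# 1 or Y means the hypervisor will expose SVM to its guests.

classify_nested() {
    case "$1" in
        1|Y|y) echo "enabled" ;;
        *)     echo "disabled" ;;
    esac
}

# On a live AMD host:
#   classify_nested "$(cat /sys/module/kvm_amd/parameters/nested)"
# To toggle it non-persistently (all VMs must be shut down first):
#   modprobe -r kvm_amd && modprobe kvm_amd nested=1
classify_nested "0"
```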
I guess we can leave this issue open for other users.
For what it's worth, I've been having success with a RHEL 8 host + nested virt on a Ryzen 3600 CPU, so it's tech preview, but it can work well :) Check again for BIOS updates if you haven't done so yet; this has been helping some other people trying crc with nested virt.
Sadly the hardware is too old and has no BIOS updates. I believe it's due to the fact that the hypervisor is EL7 while the nested hypervisor is EL8 ...
Closing, as this is not an issue with CRC.
General information
- 'crc setup' before starting it: Yes
- CRC version
- CRC status
- CRC config
- Host Operating System

Steps to reproduce

Expected
crc starts properly

Actual
crc fails to start and connect

Logs
Please consider posting the output of 'crc start --log-level debug' on http://gist.github.com/ and post the link in the issue.