okd-project / okd

The self-managing, auto-upgrading, Kubernetes distribution for everyone
https://okd.io
Apache License 2.0

crc setup does Not download latest crcbundle 4.15.0-0.okd-2024-03-10-010116 #1950

Closed: melvinga closed this issue 3 months ago

melvinga commented 5 months ago

Describe the bug

I am trying to install OKD with crc on Fedora 40 by following this guide: https://fedoramagazine.org/okd-on-fedora-workstation-with-crc/

I see a few issues:

issue-1-of-5) As of 2024-06-17, the "crc setup" command uses crc_okd_libvirt_4.15.0-0.okd-2024-02-23-163410_amd64.crcbundle, although the latest bundle should be 4.15.0-0.okd-2024-03-10-010116. That newer bundle (2024-03-10-010116) is said to have been uploaded to quay, but the latest one actually available in quay is the older 2024-02-23-163410.
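The mismatch in issue-1-of-5 can be checked mechanically by comparing the build stamps. This is only a minimal sketch: it assumes bundle file names always look like the one in this report (`crc_okd_libvirt_<version>_amd64.crcbundle`, with the version ending in a `YYYY-MM-DD-HHMMSS` stamp), and the helper names `parse_stamp` and `bundle_build_time` are hypothetical, not part of crc.

```python
import re
from datetime import datetime

# Assumption: the stamp layout is inferred from the single bundle name in this report.
STAMP_FORMAT = "%Y-%m-%d-%H%M%S"
BUNDLE_RE = re.compile(
    r"crc_okd_libvirt_\d+\.\d+\.\d+-\d+\.okd-"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}-\d{6})_amd64\.crcbundle$"
)


def parse_stamp(stamp: str) -> datetime:
    """Turn a build stamp such as '2024-03-10-010116' into a datetime."""
    return datetime.strptime(stamp, STAMP_FORMAT)


def bundle_build_time(filename: str) -> datetime:
    """Extract the build time from a bundle file name (hypothetical helper)."""
    match = BUNDLE_RE.search(filename)
    if match is None:
        raise ValueError(f"unrecognised bundle name: {filename}")
    return parse_stamp(match.group("stamp"))


used = bundle_build_time(
    "crc_okd_libvirt_4.15.0-0.okd-2024-02-23-163410_amd64.crcbundle"
)
expected = parse_stamp("2024-03-10-010116")
print(used < expected)  # True: the bundle crc used predates the expected one
```

If the comparison prints `True` for the bundle in `~/.crc/cache`, crc is still on the older build.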

issue-2-of-5) The dnsmasq issue exists. I had to follow the suggestion from mlei in https://github.com/okd-project/okd/issues/1939 to get around it. Commands I used:

in fedora host cli:

    ssh core@192.168.130.11 -i /home/fedora/.crc/machines/crc/id_ecdsa -i /home/fedora/.crc/cache/crc_okd_libvirt_4.15.0-0.okd-2024-02-23-163410_amd64/id_ecdsa_crc

in container cli:

    sudo systemctl unmask dnsmasq
    sudo systemctl status dnsmasq
    logout

in fedora host cli:

    crc stop
    crc start --log-level debug

issue-3-of-5) After resolving the dnsmasq issue, I got this error:

    E0617 10:49:31.275307 7442 memcache.go:265] couldn't get current server API group list: Get "https://api.crc.testing:6443/api?timeout=32s": tls: failed to verify certificate: x509: certificate has expired or is not yet valid: current time 2024-06-17T10:49:31Z is after 2024-04-04T06:12:51Z
    Unable to connect to the server: tls: failed to verify certificate: x509: certificate has expired or is not yet valid: current time 2024-06-17T10:49:31Z is after 2024-04-04T06:12:51Z
    DEBU error: Temporary error: ssh command error:
    command : timeout 5s oc get csr --context admin --cluster crc --kubeconfig /opt/kubeconfig
    err : Process exited with status 1
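The x509 message in issue-3-of-5 carries both timestamps, so the age of the expired certificate can be worked out directly. A minimal sketch, assuming the message keeps the exact wording and UTC `Z` timestamps shown in this report; `expired_for` is a hypothetical helper name.

```python
import re
from datetime import datetime, timedelta

# Assumption: wording and timestamp layout are copied from the error text above.
X509_RE = re.compile(
    r"current time (?P<now>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z) "
    r"is after (?P<not_after>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z)"
)
TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def expired_for(message: str) -> timedelta:
    """Return how long ago the certificate expired, according to the message."""
    match = X509_RE.search(message)
    if match is None:
        raise ValueError("no x509 expiry timestamps found in message")
    now = datetime.strptime(match.group("now"), TIME_FORMAT)
    not_after = datetime.strptime(match.group("not_after"), TIME_FORMAT)
    return now - not_after


message = (
    "tls: failed to verify certificate: x509: certificate has expired or is "
    "not yet valid: current time 2024-06-17T10:49:31Z is after "
    "2024-04-04T06:12:51Z"
)
print(expired_for(message).days)  # 74
```

So at the time of the "crc start" attempt, the certificate had already been expired for about 74 days.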

issue-4-of-5) I waited until the error in point 3 above progressed to the following:

    INFO Waiting until the user's pull secret is written to the instance disk...
    DEBU retry loop: attempt 0
    DEBU Running SSH command:
    DEBU SSH command succeeded
    DEBU error: Temporary error: pull secret not updated to disk - sleeping 2s

issue-5-of-5) I suppose it doesn't work.

    $ crc status
    CRC VM: Running
    OpenShift: Unreachable (v4.15.0-0.okd-2024-02-23-163410)
    RAM Usage: 5.123GB of 10.95GB
    Disk Usage: 23.44GB of 32.68GB (Inside the CRC VM)
    Cache Usage: 27.69GB
    Cache Directory: /home/fedora/.crc/cache

Version

I'm using Fedora 40 Workstation in Proxmox. Processor type is 'host' (so virtualization is available to crc).

    RAM: 20GB
    Disk: 60GB (33 GB available)
    CPU: 4 cores

    $ crc config view

    $ crc version
    CRC version: 2.37.1+36d451
    OpenShift version: 4.15.14

How reproducible

100% reproducible at my end.

Log bundle

This is close to the last part of the terminal output for "crc start --log-level debug":

    DEBU error: Temporary error: pull secret not updated to disk - sleeping 2s
    DEBU retry loop: attempt 86
    DEBU Running SSH command:
    DEBU SSH command succeeded
    DEBU Waiting for availability of resource type 'secret'
    DEBU retry loop: attempt 0
    DEBU Running SSH command: timeout 5s oc get secret --context admin --cluster crc --kubeconfig /opt/kubeconfig
    DEBU SSH command results: err: , output:
    NAME                      TYPE                                  DATA   AGE
    builder-dockercfg-pxm72   kubernetes.io/dockercfg               1      3m
    builder-token-25pxd       kubernetes.io/service-account-token   4      3m6s
    default-dockercfg-fpm84   kubernetes.io/dockercfg               1      3m
    default-token-vdlwk       kubernetes.io/service-account-token   4      3m6s
    DEBU
    NAME                      TYPE                                  DATA   AGE
    builder-dockercfg-pxm72   kubernetes.io/dockercfg               1      3m
    builder-token-25pxd       kubernetes.io/service-account-token   4      3m6s
    default-dockercfg-fpm84   kubernetes.io/dockercfg               1      3m
    default-token-vdlwk       kubernetes.io/service-account-token   4      3m6s
    DEBU Running SSH command:
    DEBU SSH command succeeded
    INFO Changing the password for the kubeadmin user
    DEBU Running SSH command:
    DEBU SSH command succeeded
    DEBU Waiting for availability of resource type 'clusterversion'
    DEBU retry loop: attempt 0
    DEBU Running SSH command: timeout 5s oc get clusterversion --context admin --cluster crc --kubeconfig /opt/kubeconfig
    DEBU SSH command results: err: , output:
    NAME      VERSION                          AVAILABLE   PROGRESSING   SINCE   STATUS
    version   4.15.0-0.okd-2024-02-23-163410   True        False         104d    Cluster version is 4.15.0-0.okd-2024-02-23-163410
    DEBU
    NAME      VERSION                          AVAILABLE   PROGRESSING   SINCE   STATUS
    version   4.15.0-0.okd-2024-02-23-163410   True        False         104d    Cluster version is 4.15.0-0.okd-2024-02-23-163410
    DEBU Running SSH command: timeout 30s oc get clusterversion version -o jsonpath="{['spec']['clusterID']}" --context admin --cluster crc --kubeconfig /opt/kubeconfig
    DEBU SSH command results: err: , output:
    INFO Updating cluster ID...
    DEBU Running SSH command: timeout 30s oc patch clusterversion version -p '{"spec":{"clusterID":"22ebfad4-2a34-4bfe-989d-5f2a6e40a28b"}}' --type merge --context admin --cluster crc --kubeconfig /opt/kubeconfig
    DEBU SSH command results: err: , output: clusterversion.config.openshift.io/version patched
    DEBU Waiting for the renewal of the request header client ca...
    DEBU retry loop: attempt 0
    DEBU Running SSH command: date --date="$(sudo openssl x509 -in /etc/kubernetes/static-pod-resources/kube-apiserver-certs/configmaps/aggregator-client-ca/ca-bundle.crt -noout -enddate | cut -d= -f 2)" --iso-8601=seconds
    DEBU SSH command results: err: , output: 2024-07-17T11:57:12+00:00
    DEBU Waiting for availability of resource type 'pod'
    DEBU retry loop: attempt 0
    DEBU Running SSH command: timeout 5s oc get pod --context admin --cluster crc --kubeconfig /opt/kubeconfig
    DEBU SSH command results: err: , output:
    DEBU
    DEBU retry loop: attempt 0
    DEBU Running SSH command: timeout 5s oc delete pod --all --force -n openshift-apiserver --context admin --cluster crc --kubeconfig /opt/kubeconfig
    DEBU SSH command results: err: Process exited with status 124, output:
    DEBU error: Temporary error: Failed to delete pod from openshift-apiserver namespace ssh command error:
    command : timeout 5s oc delete pod --all --force -n openshift-apiserver --context admin --cluster crc --kubeconfig /opt/kubeconfig
    err : Process exited with status 124
    : Warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.

    Temporary error: Failed to delete pod from openshift-apiserver namespace ssh command error:
    command : timeout 5s oc delete pod --all --force -n openshift-apiserver --context admin --cluster crc --kubeconfig /opt/kubeconfig
    err : Process exited with status 1
    : Warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
    The connection to the server api.crc.testing:6443 was refused - did you specify the right host or port? (x33)
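The two exit statuses in the log above mean different things: GNU `timeout` exits with 124 when the wrapped command hits the time limit, while status 1 here comes with the "connection refused" message from `oc`. A minimal sketch for tallying them from saved debug output, assuming the "Process exited with status N" wording shown above; `tally_statuses` and `describe_status` are hypothetical helper names.

```python
import re
from collections import Counter

# Assumption: the phrase is copied from the debug output in this report.
STATUS_RE = re.compile(r"Process exited with status (\d+)")


def tally_statuses(log_text: str) -> Counter:
    """Count how often each exit status appears in the log text."""
    return Counter(int(code) for code in STATUS_RE.findall(log_text))


def describe_status(code: int) -> str:
    """Give a rough reading of an exit status from a `timeout ... oc ...` call."""
    if code == 0:
        return "success"
    if code == 124:
        return "timed out (GNU timeout exits with 124 when the limit is hit)"
    return "non-zero exit from the wrapped command or from timeout itself"


sample = (
    "DEBU SSH command results: err: Process exited with status 124, output:\n"
    "err : Process exited with status 1\n"
)
for code, count in sorted(tally_statuses(sample).items()):
    print(code, count, describe_status(code))
```

Read this way, the `oc delete pod` call first timed out after 5s, and the later attempts failed because the API server at api.crc.testing:6443 refused the connection.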

JaimeMagiera commented 3 months ago

Hi,

We are not working on FCOS builds of OKD any more. Please see these documents...

https://okd.io/blog/2024/06/01/okd-future-statement
https://okd.io/blog/2024/07/30/okd-pre-release-testing

We hope to have an OKD SCOS version of CRC soon. There is no definitive ETA, though.

Many thanks,

Jaime