crc-org / snc

Single Node Cluster creation scripts for OpenShift 4.x as used by CodeReady Containers
https://crc.dev
Apache License 2.0

[libvirt] Can't finish installing on ppc64le #339

Open ildar opened 3 years ago

ildar commented 3 years ago

The installation gets quite far but cannot finish. The bootstrap has already been destroyed, but the web console can't start.

  1. ~/snc/openshift-clients/linux/oc get pods --all-namespaces | grep console shows that some pods in the openshift-console namespace are in the CrashLoopBackOff state.
  2. Digging down the rabbit hole:
    oc project openshift-console
    oc describe pod console-556b65bdbf-krt5b

    shows this error:

    E0221 06:09:07.526828 1 auth.go:235] error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps-crc.testing/oauth/token failed: Head "https://oauth-openshift.apps-crc.testing": dial tcp: lookup oauth-openshift.apps-crc.testing on 10.217.4.10:53: no such host

  3. I finally found that it is documented: https://github.com/openshift/installer/issues/1382#issuecomment-470911798 and https://github.com/openshift/installer/tree/master/docs/dev/libvirt#console-doesnt-come-up

Shouldn't it be fixed in crc-xxxxx network configuration?
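
A minimal sketch for narrowing this down, assuming a POSIX shell with `sed` and `getent` on the host. The function names are hypothetical and not part of snc. The first helper pulls the hostname and the DNS server out of a resolver error like the one above; the second checks whether the host's own resolver knows the name.

```shell
#!/bin/sh
# Hypothetical helpers, not part of snc.

# Print "<hostname> <dns-server>" for an error of the form
# "lookup <hostname> on <ip>:<port>: no such host"; print nothing otherwise.
parse_lookup_failure() {
    printf '%s\n' "$1" |
        sed -n 's/.*lookup \([^ ]*\) on \([0-9.]*\):[0-9]*: no such host.*/\1 \2/p'
}

# Succeed if the name resolves through the host's resolver (NSS).
resolves_on_host() {
    getent hosts "$1" >/dev/null 2>&1
}

# Example (run manually):
#   resolves_on_host oauth-openshift.apps-crc.testing && echo "host can resolve it"
```

Applied to the log line above, the parser shows that the failing lookup was sent to 10.217.4.10, the resolver used inside the pod, so a name that resolves on the host may still be unknown to the cluster.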

cfergeau commented 3 years ago
> I finally found that it is documented: openshift/installer#1382 (comment) and https://github.com/openshift/installer/tree/master/docs/dev/libvirt#console-doesnt-come-up
>
> Shouldn't it be fixed in crc-xxxxx network configuration?

$ cat /etc/NetworkManager/dnsmasq.d/openshift.conf
server=/tt.testing/192.168.126.1
address=/.apps.tt.testing/192.168.126.51

configures the host system to:

  1. use the dnsmasq instance run by the crc-xxxx libvirt network to resolve *.tt.testing (there are only a handful of names to resolve in that subdomain, and all of them are present in the libvirt network definition)
  2. bypass the libvirt network and resolve *.apps.tt.testing directly to the single node VM, since we can't have wildcard resolution in a libvirt network and we can't predict which hostnames will need to be resolved in .apps.tt.testing
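
The two directives can be generated for any cluster domain. A sketch, assuming a POSIX shell; `dnsmasq_conf` and its parameters are hypothetical names, and the addresses are the ones from the example above rather than what snc itself uses.

```shell
#!/bin/sh
# Hypothetical helper: print a NetworkManager dnsmasq drop-in for one cluster.
#   $1: cluster domain, e.g. tt.testing
#   $2: address of the dnsmasq instance run by the libvirt network
#   $3: address of the single node VM
dnsmasq_conf() {
    # Forward queries for the cluster domain to the libvirt network's dnsmasq.
    printf 'server=/%s/%s\n' "$1" "$2"
    # Answer every name under .apps.<domain> with the VM address.
    printf 'address=/.apps.%s/%s\n' "$1" "$3"
}

dnsmasq_conf tt.testing 192.168.126.1 192.168.126.51

# To install it (assumes NetworkManager is configured with dns=dnsmasq):
#   dnsmasq_conf tt.testing 192.168.126.1 192.168.126.51 |
#       sudo tee /etc/NetworkManager/dnsmasq.d/openshift.conf
#   sudo systemctl reload NetworkManager
```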

Maybe 2. can be done differently now that it's possible to have dnsmasq passthrough options in libvirt network xml https://libvirt.org/formatnetwork.html#elementsNamespaces
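
A rough sketch of what that could look like, assuming the `dnsmasq` XML namespace described on the linked libvirt page. The helper name is hypothetical, the domain and address are taken from the example above, and nothing here is what snc currently does.

```shell
#!/bin/sh
# Hypothetical helper: print a <dnsmasq:options> element that makes the
# libvirt network's own dnsmasq answer every name under .apps.<domain>.
#   $1: cluster domain, e.g. tt.testing
#   $2: address of the single node VM
dnsmasq_options_xml() {
    cat <<EOF
<dnsmasq:options>
  <dnsmasq:option value='address=/.apps.$1/$2'/>
</dnsmasq:options>
EOF
}

dnsmasq_options_xml tt.testing 192.168.126.51

# The element goes inside <network>, and the root element must declare
#   xmlns:dnsmasq='http://libvirt.org/schemas/network/dnsmasq/1.0'
# It can be added with "virsh net-edit NETWORK", followed by
# "virsh net-destroy NETWORK" and "virsh net-start NETWORK" to apply it.
```

With this in place the wildcard would be served by the libvirt network itself, so the host-side `address=` line might no longer be needed.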

ildar commented 3 years ago
  1. Correct me if I'm wrong, but openshift.conf is created only in createdisk.sh, which isn't called before snc.sh
  2. This isn't an app's domain name, it is the OpenShift web console's name. There is only one per SNC installation.
  3. (WRONG, see below) --- It is quite strange, but its address doesn't fit the declared .crc.testing pattern (found in crc-snc.conf). It has a dash rather than a dot: https://oauth-openshift.apps-crc.testing --- What went wrong?

ildar commented 3 years ago

Emm, sorry, (3) was pointless. Adding:

  1. This entry (https://oauth-openshift.apps-crc.testing) is required for the installation to complete, so it must be added to snc.sh.

cfergeau commented 3 years ago
> 1. Correct me if I'm wrong, but openshift.conf is created only in createdisk.sh, which isn't called before snc.sh

It is done near the beginning of snc.sh https://github.com/code-ready/snc/blob/0c3b4aab068174aaee09b13c8ecfd85b14b11851/snc.sh#L96-L102

The networking configuration snc does is really basic, and follows the openshift-install documentation. It's probably possible to make it better, add checks for when it's not good enough, and so on. Initially snc.sh had no checks at all for network configuration. Some basic ones were added to make things more 'user-friendly'. Patches are welcome to improve this some more :)