Open · titou10 opened 3 months ago
"Must-gather" taken directly from okd5-master1 while the installation loops, before editing the empty /etc/resolv.conf file:
ssh core@okd5-master1 sudo /usr/local/bin/agent-gather -O > okd5-master1_agent-gather.tar.gz
okd5-master1_agent-gather.tar.gz
"Must-gather" before rebooting, when all nodes are present except the BS (bootstrap) node:
export KUBECONFIG=...
oc login ...
oc adm must-gather
OKD version: 4.15.0-0.okd-2024-03-10-010116
Summary
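I tried to install OKD on bare metal with the agent installer as described here. Overall I succeeded, but I encountered two problems:
1. On the BS node, /etc/resolv.conf is empty during the installation: the DNS configuration is not taken from the agent-config.yaml file and it has to be manually (re-)entered.
2. At the end of the installation, the BS node was missing from the list of nodes (oc get nodes). All other nodes were there but not the BS one. Manually rebooting the node forced it to finish its initialization and to appear in the list of nodes.
Topology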
Part of the agent-config.yaml:
The other nodes follow the same pattern.
Installation
First problem
After creating the ISO image etc., all 5 nodes are started at the same time and the installation starts. The progress is monitored with:
Then everything stops. The console of okd5-master1 shows that something is looping:
I then SSHed into the node:
So the BS node was not able to continue because it could not download the image from quay.io, because /etc/resolv.conf is empty at this stage! ("Image quay.io/openshift/okd-content@sha256:... not found")
I added the lines from agent-config.yaml to /etc/resolv.conf and the installation immediately stopped looping and went on...
The installation of the 4 other nodes then continued and succeeded.
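The manual workaround above can be sketched as a small shell helper. This is only an illustration of the edit I made by hand, not something the installer provides: the function name ensure_nameservers is hypothetical, and 192.0.2.53 is a placeholder for the DNS server address declared in agent-config.yaml.

```shell
#!/bin/sh
# Hypothetical helper (not part of the agent installer): make sure a
# resolv.conf-style file contains at least one "nameserver" entry.
# Usage: ensure_nameservers FILE DNS_IP [DNS_IP...]
ensure_nameservers() {
    file=$1
    shift
    # Leave the file alone if a nameserver line is already present.
    if grep -Eq '^[[:space:]]*nameserver[[:space:]]+[^[:space:]]+' "$file" 2>/dev/null; then
        return 0
    fi
    # Otherwise append one nameserver line per address given.
    for ip in "$@"; do
        printf 'nameserver %s\n' "$ip" >> "$file"
    done
}
```

On the node it would be run as root, for example: ensure_nameservers /etc/resolv.conf 192.0.2.53 (with the real DNS address instead of the placeholder). It does nothing if a nameserver is already configured.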
Second problem
Then the installation stopped again and never finished. After waiting a long time (with all nodes at about 5% CPU...), I managed to open an oc session to okd-master1.
oc get nodes returned the list of all the nodes as "ready" except the BS node (okd5-master1), which was not even in the list. And of course oc get co and oc get clusterversion indicated that many operators were broken because 1/3 of the masters was missing...
At this point the status is this:
So I SSHed into okd5-master1 again and forced a reboot with
shutdown -r now
and tada... the installation of the BS node finished, and the cluster installation finally ran to completion with all 5 nodes known to the cluster and "ready".
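For anyone hitting the second problem, the "is the BS node in the list?" check can be sketched as a shell function that reads the output of oc get nodes -o name (which prints one node/NAME line per node) on stdin and prints the expected nodes that are absent. The function name missing_nodes is hypothetical; only okd5-master1 is a real host name from this report.

```shell
#!/bin/sh
# Hypothetical helper: print each expected node name that does not appear
# in the "oc get nodes -o name" output supplied on stdin.
# Usage: oc get nodes -o name | missing_nodes NODE [NODE...]
missing_nodes() {
    # Strip the "node/" prefix that "-o name" puts in front of each name.
    present=$(sed 's|^node/||')
    for n in "$@"; do
        # -F: fixed string, -x: whole-line match, -q: quiet.
        printf '%s\n' "$present" | grep -Fxq -- "$n" || printf '%s\n' "$n"
    done
}
```

Example: oc get nodes -o name | missing_nodes okd5-master1 prints okd5-master1 as long as the BS node has not joined the cluster, and prints nothing once it has.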