gitsridhar closed this issue 3 years ago
@gitsridhar have you used the right RHCOS image with the right code base (e.g. RHCOS 4.5 with release-4.5)?
@bpradipt I need 4.6, so I used the release-4.6 branch of the repo with the newly uploaded CoreOS 4.6 image (rhcos-4.6.1-ppc64le).
What is the next step here?
We need the console messages (from the bootstrap node to start with) to find out why the nodes are not picking up the ign files. It could be a DHCP issue, the ignition version, a disk failure, etc.
Yussuf, in lon06 all nodes (bootstrap, master, and worker) are stuck in a reboot loop. On the bootstrap node as well as the master/worker nodes I see this message: 'A start job is running for Ignition (fetch-offline)'. It goes on for 5 minutes, followed by another 5 minutes of silence, and then the node reboots.
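For the ignition version possibility, one quick check is to read the spec version that the generated ign file declares and compare it with what the RHCOS image in use expects. A minimal sketch, assuming the ign file is plain JSON with a top-level `ignition.version` field; the function names and the sample string are hypothetical and only for illustration:

```python
import json


def ignition_spec_version(ign_text: str) -> str:
    """Return the spec version string declared in an Ignition config."""
    config = json.loads(ign_text)
    return config["ignition"]["version"]


def major_version(version: str) -> int:
    """Return the major component of a dotted version string."""
    return int(version.split(".")[0])


if __name__ == "__main__":
    # Hypothetical sample; in practice, read the text of the generated ign file.
    sample = '{"ignition": {"version": "3.1.0"}}'
    version = ignition_spec_version(sample)
    print(f"spec version: {version}, major: {major_version(version)}")
```

If the major version does not match what the image supports, that would point to an image/branch mismatch rather than DHCP or disk problems.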
This is the only cluster in lon06, can you access it and look?
I do not have access to your resource group. Please check the error messages to see why the node is going into an emergency reboot.
Yussuf, I have access to the console of these nodes from the web interface. How else can I check the error messages? Above, I copy-pasted what I could from the web console of the failing nodes.
Can you please paste a screenshot of the tail of the console messages, the part just before it says it is going into Emergency mode? If there is nothing, then we need to get on a Webex and check the errors.
The issue here was that the node was sitting at the GRUB prompt. A reboot usually fixes this; if the config drive is not being read properly, deleting the node and creating it again will help.
I could not deploy OCP 4.6 in the London06 zone. All nodes except the bastion are in a reboot loop, failing to download the ign file.