Open speedymiata opened 1 month ago
It's been a few days since I posted this help request. Is there another forum I should repost the request to?
Your config seems to be a little bit weird.
Do you really want to deploy the image via IPoIB?
ip=192.168.84.248
mac=ac:1f:6b:bc:db:ec
nicips.ib0=192.168.84.248
The MAC address appears to belong to the Ethernet device, but you specify an IPoIB IP for the node.
The ip attribute needs to be your eno1 IP and match the MAC address you specify, not ib0.
What does makedhcp -q cn01 show?
I do not want to deploy the image via IPoIB. That's the joy of an inherited system, right there - I want to use the Ethernet network for image deployment.
makedhcp -q cn01 shows:
[root@xcat_adm ~]# makedhcp -q cn01
cn01: ip-address = 192.168.84.248, hardware-address = ac:1f:6b:bc:db:ec
I also went ahead and executed chdef cn01 ip=192.168.36.248 to try to get the system to deploy the image over the Ethernet network, but this didn't have the desired effect. The node still hangs at the same point in the boot process. What else do I need to do to switch to Ethernet?
Did you run nodeset/rinstall cn01 osimage=rhels8.9.0-x86_64-install-compute afterwards?
Furthermore, you should make sure the node boots via Ethernet first, or disable the IB PXE ROM.
You may also want to disable DHCP for your IPoIB network by setting site.dhcpinterfaces to your management node's Ethernet interface.
I used rinstall, yes, and I followed it up with xcatprobe osdeploy -n cn01. I'm also using rcons to manually select the Eth interface as the boot device - it is most certainly starting with it first.
After running rinstall, the makedhcp command's output still hasn't changed. It still shows the IB interface's IP of .84.248.
Oh sorry, yes, you need to run makedhcp cn01 first.
Then makedhcp -q cn01 should show the correct IP.
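The order of operations suggested in the replies so far can be sketched as a small script. This is only a sketch under stated assumptions: the helper names `run` and `provision_steps` and the `DRY_RUN`, `NODE`, `ETH_IP` and `OSIMAGE` variables are hypothetical and introduced here for illustration; the xCAT commands and values are the ones quoted in this thread. By default it only prints the commands, so it is safe to run anywhere.

```shell
#!/bin/sh
# Dry-run by default: prints each command instead of executing it.
# Set DRY_RUN=0 on a real xCAT management node to actually run them.
NODE="${NODE:-cn01}"
ETH_IP="${ETH_IP:-192.168.36.248}"                        # eno1 address from this thread
OSIMAGE="${OSIMAGE:-rhels8.9.0-x86_64-install-compute}"   # osimage name from this thread

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: the steps in the order suggested above.
provision_steps() {
    run chdef "$NODE" ip="$ETH_IP"             # point the node's ip at the Ethernet address
    run makedhcp "$NODE"                       # regenerate the node's DHCP entry
    run makedhcp -q "$NODE"                    # query what the DHCP server now holds
    run rinstall "$NODE" osimage="$OSIMAGE"    # restart the OS deployment
}

provision_steps
```

Keeping the query step between makedhcp and rinstall makes it easy to stop before redeploying if the DHCP entry still shows the IB address.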
This seems odd. After running makedhcp cn01, re-running makedhcp -q cn01 does not indicate that a change was made. The IB address is still present.
But according to my lsdef for this node, I've set ip to the Ethernet interface's address. Are there any other items I should check?
[root@xcat_adm ~]# lsdef cn01
Object name: cn01
arch=x86_64
bmc=192.168.36.48
cons=ipmi
consoleenabled=1
currchain=boot
currstate=install rhels8.6.0-x86_64-compute
getmac=ipmi
hostnames=cn01
installnic=ac:1f:6b:bc:db:ec
ip=192.168.36.248
mac=ac:1f:6b:bc:db:ec
mgt=ipmi
netboot=xnba
nicips.ib0=192.168.84.248
nicips.ipmi=192.168.36.48
nicips.eno1=192.168.36.248
nicnetworks.eno1=ipmi-net
nicnetworks.ib0=ib-net
nictypes.eno1=Ethernet
nictypes.ib0=InfiniBand
os=rhels8.6.0
postbootscripts=otherpkgs
postscripts=syslog,remoteshell,syncfiles
profile=compute
provmethod=rhels8.6.0-x86_64-install-compute-
serialport=1
serialspeed=115200
status=powering-on
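The mismatch described here (the node definition says one address, the DHCP entry says another) can be checked mechanically. The following is a minimal sketch, not part of xCAT: the function names `defined_ip`, `leased_ip` and `check_match` are hypothetical, and the sample text is an abbreviated copy of the lsdef and makedhcp -q output posted in this thread.

```shell
#!/bin/sh
# Sample text copied (abbreviated) from the thread. On a live system these
# could instead be captured with: lsdef_out=$(lsdef cn01) and
# dhcp_out=$(makedhcp -q cn01)
lsdef_out='Object name: cn01
ip=192.168.36.248
mac=ac:1f:6b:bc:db:ec
nicips.ib0=192.168.84.248'
dhcp_out='cn01: ip-address = 192.168.84.248, hardware-address = ac:1f:6b:bc:db:ec'

# Print the node-level "ip" attribute (lines such as nicips.* do not match).
defined_ip() {
    printf '%s\n' "$1" | sed -n 's/^[[:space:]]*ip=//p' | head -n 1
}

# Print the address found in a "makedhcp -q" output line.
leased_ip() {
    printf '%s\n' "$1" | sed -n 's/.*ip-address = \([0-9.]*\),.*/\1/p' | head -n 1
}

# Compare the two addresses; return non-zero when they differ.
check_match() {
    d=$(defined_ip "$1")
    l=$(leased_ip "$2")
    if [ "$d" = "$l" ]; then
        echo "match: $d"
    else
        echo "mismatch: definition has $d, DHCP entry has $l"
        return 1
    fi
}

check_match "$lsdef_out" "$dhcp_out" || true
```

With the values posted above, the check reports a mismatch between 192.168.36.248 and 192.168.84.248, which is the symptom being discussed.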
Here's what happened while "playing" with the makedhcp command after referencing the man page:
[root@wxcat_adm ~]# makedhcp -q cn01
cn01: ip-address = 192.168.84.248, hardware-address = ac:1f:6b:bc:db:ec
[root@wxcat_adm ~]# makedhcp -d cn01
[root@wxcat_adm ~]# makedhcp -q cn01
[root@wxcat_adm ~]# makedhcp -n cn01
Renamed existing dhcp configuration file to /etc/dhcp/dhcpd.conf.xcatbak
Warning: [wxcat_adm]: No dynamic range specified for 192.168.80.0. If hardware discovery is being used, a dynamic range is required.
[root@wxcat_adm ~]# makedhcp -q cn01
[root@wxcat_adm ~]# makedhcp cn01
[root@wxcat_adm ~]# makedhcp -q cn01
cn01: ip-address = 192.168.84.248, hardware-address = ac:1f:6b:bc:db:ec
I'll admit that I still have a lot to learn about xcat, but it still seems quite strange that it's not "picking up" the IP address I've specified in the node definition. Is there something I have to refresh? Apply?
I'm trying to use xcat to deploy rhel 8.9 onto a compute node, but the compute node fails to finish booting at this point:
During this process, I see this on the xcat head node:
I still have a lot to learn about xcat, so I'll be extremely grateful for any and all help that's offered.
Additional information: