clustervision / trinityX

TrinityX is the new generation of ClusterVision's open-source HPC, AI and cloudbursting platform. It is designed from the ground up to provide all services required in a modern HPC and AI system, and to allow full customization of the installation.
GNU General Public License v3.0

iPXE boot - Luna2 Could not get install script [000] #402

Closed. xdkreij closed this issue 5 months ago.

xdkreij commented 8 months ago

Problem description When attempting to boot a node with iPXE the following happens:

dhcpd.conf

#
# DHCP Server Configuration file.
# created by Luna
#
option domain-name "some.domain.com";
option luna-id code 129 = text;
option client-architecture code 93 = unsigned integer 16;
option time-servers 10.1.2.220;
option domain-name-servers 10.1.2.220;

omapi-port 7911;
omapi-key omapi_key;

key omapi_key {
    algorithm hmac-md5;
    secret <somekey>;
}

# how to get luna_ipxe.efi and luna_undionly.kpxe :
# git clone git://git.ipxe.org/ipxe.git
# cd ipxe/src
# make bin/undionly.kpxe
# cp bin/undionly.kpxe /var/lib/tftpboot/luna_undionly.kpxe
# make bin-x86_64-efi/ipxe.efi
# cp bin-x86_64-efi/ipxe.efi /var/lib/tftpboot/luna_ipxe.efi
#

subnet 10.1.5.0 netmask 255.255.255.0 {
    max-lease-time 28800;
    if exists user-class and option user-class = "iPXE" {
        filename "http://10.1.5.240:7051/boot";
    } else {
        if option client-architecture = 00:07 {
            filename "luna_ipxe.efi";
        } elsif option client-architecture = 00:0e {
        # OpenPower does not need a binary to execute;
        # Petitboot will request the config itself.
        } else {
            filename "luna_undionly.kpxe";
        }
    }
    range 10.1.5.20 10.1.5.250;
    next-server 10.1.5.240;
    option domain-name "some.domain.com";
    option luna-id "lunaclient";
}
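For illustration only (this sketch is not part of Luna or TrinityX), the boot-file selection that the subnet block above performs can be expressed in Python; the URL and filenames are taken from the dhcpd.conf as posted:

```python
from typing import Optional

def select_boot_file(user_class: Optional[str], client_arch: int) -> Optional[str]:
    """Mirror the dhcpd.conf branching above: iPXE clients get the HTTP
    boot URL, EFI clients (arch 0x07) get the iPXE EFI binary, OpenPower
    (arch 0x0e) gets nothing (Petitboot requests the config itself), and
    everything else gets the legacy undionly chainloader."""
    if user_class == "iPXE":
        return "http://10.1.5.240:7051/boot"
    if client_arch == 0x07:
        return "luna_ipxe.efi"
    if client_arch == 0x0E:
        return None  # OpenPower: Petitboot fetches its own config
    return "luna_undionly.kpxe"

print(select_boot_file(None, 0x07))    # luna_ipxe.efi
print(select_boot_file("iPXE", 0x07))  # http://10.1.5.240:7051/boot
print(select_boot_file(None, 0x00))    # luna_undionly.kpxe
```

The key point of the two-stage scheme: a BIOS/EFI client first chainloads the iPXE binary over TFTP, then that iPXE client (identified by its user-class) requests the HTTP boot script.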

What has been done

# luna group show compute
+-------------------------------------------------------------------------------+
|                                Group => compute                               |
+---------------------+---------------------------------------------------------+
| name                | compute                                                 |
| domain              | cluster                                                 |
| osimage             | compute                                                 |
| osimagetag          | default (default)                                       |
| interfaces          | interface = BOOTIF                                      |
|                     |   network = some.domain.com                             |
|                     | interface = BMC                                         |
|                     |   network = ipmi                                        |
| setupbmc            | True                                                    |
| bmcsetupname        | compute                                                 |
| unmanaged_bmc_users | None                                                    |
| netboot             | True                                                    |
| localinstall        | False (default)                                         |
| bootmenu            | False (default)                                         |
| roles               | None                                                    |
| prescript           | <empty> (default)                                       |
| partscript          | mount -t tmpfs tmpfs /sysroot                           |
| postscript          | echo 'tmpfs / tmpfs defaults 0 0' >> /sysroot/etc/fstab |
| provision_interface | BOOTIF (default)                                        |
| provision_method    | torrent (cluster)                                       |
| provision_fallback  | http (cluster)                                          |
| comment             | None                                                    |
+---------------------+---------------------------------------------------------+
# luna node list
+-------------------------------------------------------------------------------------------+
|                                         << Node >>                                        |
+---+---------+---------+---------+----------+----------+---------------------+-------------+
| # |   name  |  group  | osimage | setupbmc | bmcsetup |        status       | tpm_present |
+---+---------+---------+---------+----------+----------+---------------------+-------------+
| 1 | node001 | compute | compute |   True   | compute  |         None        |    False    |
| 2 | node002 | compute | compute |   True   | compute  |         None        |    False    |
| 3 | node003 | compute | compute |   True   | compute  |         None        |    False    |
| 4 | node004 | compute | compute |   True   | compute  |         None        |    False    |
| 5 | node005 | compute | compute |   True   | compute  | installer.discovery |    False    |
| 6 | node006 | compute | compute |   True   | compute  | installer.discovery |    False    |
+---+---------+---------+---------+----------+----------+---------------------+-------------+

Expected results

xdkreij commented 8 months ago

@msteggink Appreciate your support on this :-) Thanks in advance!

msteggink commented 8 months ago

Hi @xdkreij, is this a new install or an existing one?

The 'Could not get install script [000]' error refers to an issue in the Luna2 client phase. Can you check which node has been selected (dhcpd DHCPOFFER or luna node show)? Can you ssh to the node in the installer phase?

Can you try to restart luna2-daemon (systemctl restart luna2-daemon)?

aphmschonewille commented 8 months ago

Also, please provide the following output (where image is most likely 'compute'):

lchroot <image>
rpm -qa | grep luna2
exit

xdkreij commented 8 months ago

# chroot compute/
# rpm -qa | grep luna2
luna2-client-2.0-13.noarch.x86_64

xdkreij commented 8 months ago

Hi @xdkreij, is this a new install or an existing one?

The 'Could not get install script [000]' error refers to an issue in the Luna2 client phase. Can you check which node has been selected (dhcpd DHCPOFFER or luna node show)? Can you ssh to the node in the installer phase?

Can you try to restart luna2-daemon (systemctl restart luna2-daemon)?

Yes, it's a clean install of the controller, and the image (compute)

I've removed all default nodes, and had to fix node.py to be able to add a new one (seems to be an old Luna bug). I'm currently testing with the newly created node, but here's the default output after luna node add -g $GROUP -if BOOTIF -M $MAC $NAME:

# luna node show node001
+-----------------------------------------------------------------------------------------+
|                                     Node => node001                                     |
+---------------------+-------------------------------------------------------------------+
| name                | node001                                                           |
| hostname            | node001.some.domain.com                                           |
| group               | compute                                                           |
| osimage             | compute (compute)                                                 |
| osimagetag          | default (default)                                                 |
| interfaces          | interface = BOOTIF                                                |
|                     |   ipaddress = 10.1.5.1                                            |
|                     |   macaddress = 00:50:56:03:13:4a                                  |
|                     |   network = some.domain.com                                       |
|                     | interface = BMC                                                   |
|                     |   ipaddress = 10.148.0.1                                          |
|                     |   macaddress = None                                               |
|                     |   network = ipmi                                                  |
| status              | None                                                              |
| vendor              | None                                                              |
| assettag            | None                                                              |
| switch              | None                                                              |
| switchport          | None                                                              |
| setupbmc            | False (compute)                                                   |
| bmcsetup            | compute (compute)                                                 |
| unmanaged_bmc_users | None                                                              |
| netboot             | True (compute)                                                    |
| localinstall        | False (compute)                                                   |
| bootmenu            | False (compute)                                                   |
| roles               | None                                                              |
| service             | False                                                             |
| prescript           | <empty> (default)                                                 |
| partscript          | (compute) mount -t tmpfs tmpfs /sysroot                           |
| postscript          | (compute) echo 'tmpfs / tmpfs defaults 0 0' >> /sysroot/etc/fstab |
| provision_interface | BOOTIF (default)                                                  |
| provision_method    | torrent (cluster)                                                 |
| provision_fallback  | http (cluster)                                                    |
| tpm_uuid            | None                                                              |
| tpm_pubkey          | None                                                              |
| tpm_sha256          | None                                                              |
| comment             | None                                                              |
+---------------------+-------------------------------------------------------------------+
xdkreij commented 8 months ago

Even better, now that I've added the node statically in Luna with the MAC address and IP, it doesn't boot iPXE at all :face_with_spiral_eyes:

[screenshot]

edit: when changing the dhcp config to point filename and next-server at the internal node network IP, iPXE boots nicely to the boot menu again. However, with the defaults above and 'Ask luna for a node name...', it still returns http://<ip>/boot/search/mac/<mac> network unreachable.

I think I know why: the IP provided in the message isn't on the internal node network. Is this a Luna config setting? I would have expected that an attempt to install a node would only need to reach its own internal network, not the other network that has been set on the controller as primary.

xdkreij commented 8 months ago

Seems to work fine on both internal node network as well as the main controller network

 ss -tulpn | grep 7051
tcp   LISTEN 0      128           0.0.0.0:7051       0.0.0.0:*    users:(("nginx",pid=1760,fd=8),("nginx",pid=1759,fd=8),("nginx",pid=1758,fd=8),("nginx",pid=1757,fd=8),("nginx",pid=1755,fd=8))

When testing with curl..

curl http://10.1.5.240:7051/boot/search/mac/00:50:56:03:13:4a
#!ipxe

imgfetch -n kernel http://10.1.2.220:7051/files/compute-1707223911-vmlinuz-4.18.0-372.26.1.el8_6.x86_64
imgload kernel
imgargs kernel root=luna luna.bootproto=static luna.mac=00:50:56:03:13:4a luna.ip=10.1.5.1/24 luna.gw= luna.url=https://10.1.2.220:7050 luna.verifycert=False luna.node=node001 luna.hostname=node001 luna.service=0 net.ifnames=0 biosdevname=0 initrd=initrd.img boot=ramdisk
imgfetch --name initrd.img http://10.1.2.220:7051/files/compute-1707223911-initramfs-4.18.0-372.26.1.el8_6.x86_64
imgexec kernel
curl http://10.1.2.220:7051/boot/search/mac/00:50:56:03:13:4a
#!ipxe

imgfetch -n kernel http://10.1.2.220:7051/files/compute-1707223911-vmlinuz-4.18.0-372.26.1.el8_6.x86_64
imgload kernel
imgargs kernel root=luna luna.bootproto=static luna.mac=00:50:56:03:13:4a luna.ip=10.1.5.1/24 luna.gw= luna.url=https://10.1.2.220:7050 luna.verifycert=False luna.node=node001 luna.hostname=node001 luna.service=0 net.ifnames=0 biosdevname=0 initrd=initrd.img boot=ramdisk
imgfetch --name initrd.img http://10.1.2.220:7051/files/compute-1707223911-initramfs-4.18.0-372.26.1.el8_6.x86_64
imgexec kernel

regardless, iPXE can't find it..


http://10.1.5.240:7051/boot/mac/search/00:50:56:03:13:4a... No such file or directory (https://ipxe.org/2d0c613b)
aphmschonewille commented 8 months ago

Hi there. I noticed two things:

- http://10.1.2.220:7051/
- luna.url=https://100.66.2.220:7050

In a standard setup, these do not add up (normally, the IP addresses are equal). Could you give us the output of:

- luna cluster
- luna network list
- luna network show

Thanks! -Antoine


xdkreij commented 8 months ago

Hi there. I noticed two things: http://10.1.2.220:7051/ and luna.url=https://100.66.2.220:7050. In a standard setup, these do not add up (normally, the IP addresses are equal). Could you give us the output of: luna cluster, luna network list, luna network show?

ignore the 100.66. IP prefix; it's 10.1 and 10.2. I've changed it, but it seems the copy/paste didn't go well ;-)

Regardless, I'll provide the output of luna cluster tomorrow, as I'm currently without VPN access to the test cluster :-)

aphmschonewille commented 8 months ago

Not sure if the problem persists, but there have been bug fixes since the reporting of this issue. Could you update Luna by running ansible-playbook controller.yml --tags=luna within the trinityx-combined/site directory?

If the problem is still there after the update, please let me know.

xdkreij commented 5 months ago

@aphmschonewille I've been very busy for numerous reasons (this project's priority, for one), so I wasn't able to continue with this -_-"

Regardless.. I may have found a clue...

So in dhcpd.conf, within the subnet 10.1.5.0 netmask 255.255.255.0 { block,

next-server 10.1.2.220

is directed towards the external (WWW-facing) NIC. However, when configuring next-server to match the 10.1.5.0 subnet (the node-facing NIC), i.e. 10.1.5.240, iPXE works.

Up until the point where it tries to boot after 'Ask luna-server for a node name'.

It then redirects back to 10.1.2.220, which results in pretty much the same issue that occurred before changing next-server to the node-facing NIC on the controller.

[screenshot]

This is however where I'm stuck: is next-server supposed to point towards the WWW-facing NIC or the node (DHCP) facing NIC?

Any clue why the original config of next-server 10.1.2.220 might not work, even though both addresses are reachable via curl?

(next-server 10.1.5.240: iPXE works, but the luna boot option 'Ask luna for a node name' fails.)

 curl http://10.1.5.240:7051/boot/search/mac/00:50:56:03:13:4a
#!ipxe

imgfetch -n kernel http://10.1.2.220:7051/files/compute-1707223911-vmlinuz-4.18.0-372.26.1.el8_6.x86_64
imgload kernel
imgargs kernel root=luna luna.bootproto=static luna.mac=00:50:56:03:13:4a luna.ip=10.1.5.20/24 luna.gw= luna.url=https://10.1.2.220:7050 luna.verifycert=False luna.node=node001 luna.hostname=node001 luna.service=0  initrd=initrd.img boot=ramdisk
imgfetch --name initrd.img http://100.66.2.220:7051/files/compute-1707223911-initramfs-4.18.0-372.26.1.el8_6.x86_64
imgexec kernel
(next-server 10.1.2.220: iPXE fails.)

curl http://10.1.2.220:7051/boot/search/mac/00:50:56:03:13:4a
#!ipxe

imgfetch -n kernel http://10.1.2.220:7051/files/compute-1707223911-vmlinuz-4.18.0-372.26.1.el8_6.x86_64
imgload kernel
imgargs kernel root=luna luna.bootproto=static luna.mac=00:50:56:03:13:4a luna.ip=10.1.5.20/24 luna.gw= luna.url=https://10.1.2.220:7050 luna.verifycert=False luna.node=node001 luna.hostname=node001 luna.service=0  initrd=initrd.img boot=ramdisk
imgfetch --name initrd.img http://10.1.2.220:7051/files/compute-1707223911-initramfs-4.18.0-372.26.1.el8_6.x86_64
imgexec kernel

EDIT: Most likely this is because the node itself only has one NIC, on the 10.1.5.0 subnet; it doesn't know how to reach the 10.1.2.0 subnet. Are nodes supposed to have access to both the WWW-facing NIC and a controller <---> node NIC connection?
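This can be verified on paper: the kernel args above set luna.ip=10.1.5.20/24 with an empty luna.gw=, and a host without a default gateway can only reach its directly connected subnet. An illustrative check with Python's stdlib ipaddress module:

```python
# Illustrative check: with luna.ip=10.1.5.20/24 and an empty luna.gw=,
# the node can only reach hosts on its own /24.
import ipaddress

node_if = ipaddress.ip_interface("10.1.5.20/24")

def reachable_without_gateway(target: str) -> bool:
    """A host with no default gateway can only reach addresses on its
    directly connected subnet."""
    return ipaddress.ip_address(target) in node_if.network

print(reachable_without_gateway("10.1.5.240"))  # True  (node-facing NIC)
print(reachable_without_gateway("10.1.2.220"))  # False (external NIC)
```

So any boot URL or luna.url pointing at 10.1.2.220 is unreachable from the node unless a gateway or route is configured.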

xdkreij commented 5 months ago

@aphmschonewille , @msteggink

An exact representation of what's going on...

The bootloader is referring back to the network that isn't reachable... Is it possible to make sure everything is reachable via the node network only? (This includes changing the bootloader to include the correct address of the node-facing NIC.)

[screenshot]

aphmschonewille commented 5 months ago

Hi,

I'm starting to understand your problem better. I think the internal and external IP addresses of the controller are mixed up, or only the controller's external IP was used. If this is the case, I'm afraid the approach will not work (optimally), as you have noticed.

If I misunderstood, could you help me by explaining what your setup looks like?:

      ___________                                 ___________   ___ node001
     /           \         +------------+        /           \ /
-----|  external |---------| controller |--------|  internal |----- node002
     |   net     |      ^  +------------+  ^     |    net    | \
     \___________/      |                  |     \___________/  --- node003
                     x.x.5.240?         x.x.x.x?

-A

xdkreij commented 5 months ago

      ___________                                        ___________   ___ node001
     /           \            +------------+            /           \ /
-----|  external |---ens192---| controller |---ens256---|  internal |----- node002
     |   net     |     ^      +------------+     ^      |    net    | \
     \___________/     |                         |      \___________/  --- node003
                   x.x.5.240?                 x.x.x.x?

This is pretty much accurate. The nodes cannot reach ens192. (This is a setup similar to one of the ClusterVision GPU cluster projects in the past, where I'm currently hired.)

xdkreij commented 5 months ago

@aphmschonewille - I've contacted our hosting provider to fix their routing (and add some ports).. to be continued :-)

(Note: the below is apparently easy to accomplish; it might be an idea to add an option to Ansible when creating the images luna_undionly.kpxe and luna_ipxe.efi; for example, make bin-x86_64-efi/ipxe.efi DEBUG=tcp,ipv4 --> ansible-playbook ........ --tags=ipxe_debugging (or something).)

[screenshot]

aphmschonewille commented 5 months ago

Hi @xdkreij,

Normally compute nodes do not have to contact ens192 of the controller, as all relevant services are provided on ens256. When you run Ansible the first time, trix_ctrl1_hostip is used to determine which interface (IP) is the internal one, and Luna is configured to use that as next-server in dhcp. No magic here, but if it is not set correctly, you may end up using the wrong interface, where you would need routing etc. to make it work. This can however be altered using luna-cli (check: luna cluster, and luna network). Maybe I misunderstood why the nodes need to reach ens192 (and further?), so please correct me if I'm wrong.
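For illustration (a sketch, not actual generated output): with trix_ctrl1_hostip pointing at the internal interface, and assuming 10.1.5.240 is the node-facing IP as in this thread, the relevant generated dhcpd.conf lines would look roughly like:

```
subnet 10.1.5.0 netmask 255.255.255.0 {
    if exists user-class and option user-class = "iPXE" {
        filename "http://10.1.5.240:7051/boot";   # internal IP, not 10.1.2.220
    }
    ...
    next-server 10.1.5.240;                       # internal IP, not 10.1.2.220
}
```

The same internal IP should then also appear in the luna.url kernel argument and the imgfetch URLs of the rendered boot script, so the node never needs a route to the external network.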

Building PXE kernels with debugging is something we can add as a flag, but in normal circumstances you probably never need it. I have never really used it (maybe once, trying to make https without a cert work?) in many years of booting nodes. I'll put it on the agenda though.

thanks! -A

aphmschonewille commented 5 months ago

can you provide the output of

-A

xdkreij commented 5 months ago

Seems the PXE boot issue has been resolved by moving to a single interface within the 'all.yml' file. PXE now continues to boot (up to another compute-image issue I'll have to deal with later). This was indeed a routing challenge. The ticket can be closed. Thanks for helping out!