xcat2 / xcat-core

Code repo for xCAT core packages
Eclipse Public License 1.0

MAC address issue with Boot over Infiniband #5971

Open abhijitshintre11 opened 5 years ago

abhijitshintre11 commented 5 years ago

Hello, I have provisioned a diskless node over Infiniband EDR. The node boots with xCAT 2.14.5, but I ran into a problem when defining its MAC address. The MAC address for the Infiniband interconnect is 20 bytes long, and when I run the nodeset command it fails with an error (Invalid mac address). As a workaround, I have to add the MAC address manually to the dhcpd.leases file. Is there any other way in xCAT to define a MAC address that is longer than 6 bytes?

Thanks

immarvin commented 5 years ago

Hi @abhijitshintre11, interesting that you got it working. Is the Infiniband EDR adapter in "ethernet mode"? As for your question, I think this is a new feature for xCAT that needs code changes. We will put it into our plan for the following sprints and update the status in this ticket so that you can track it.

We have made several provision-over-IB attempts before, but they failed because of technical issues:

  1. Infiniband EDR is not enabled in petitboot, so IB cannot reach the deploy server over PXE.
  2. Infiniband EDR is not enabled in the initrd by default, so the rootimg tarball cannot be downloaded over IB.

I have several questions for you on this:

  1. What type of server did you provision? x86 or Power?
  2. Besides adding the 20-byte MAC to the DHCP lease manually, did you make other customizations, such as modifying firmware settings to enable IB, or installing IB drivers and enabling IB in the diskless initrd?

It would be much appreciated if you could provide the steps or a doc for this, so that it can become a new feature or reference case in xCAT. Thanks.

abhijitshintre11 commented 5 years ago

Hello @immarvin, here are the answers to your questions:

  1. EDR is in Infiniband mode.
  2. Server type is x86.
  3. Yes, we passed some additional parameters to make it work with the IB interconnect.
  4. I have tried both the pxe and the xnba netboot methods, and both work fine.

Below are the steps.

Management node software configuration:

  1. OS version: CentOS 7.4
  2. Mellanox (EDR ConnectX-4) OFED version: MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.4-x86_64
  3. Mellanox firmware version: 12.21.2010
  4. xCAT version: 2.14.5

Node configuration:

  1. Mellanox Firmware version: 12.17.2032
  2. Flexboot version: 3.4.903
  3. Disabled UEFI mode

Steps to be performed on the Management Node

  1. Install Mellanox OFED and Configure IPoIB on the management node.

  2. Define ib0 as your dhcpinterface

    chdef -t site dhcpinterfaces=ib0

  3. Generate the diskless image with the following parameters:

    genimage -i ib0 -n mlx5_ib,mlx4_ib,ib_ipoib centos7.4-x86_64-netboot-compute
    packimage centos7.4-x86_64-netboot-compute

  4. Define node:

    mkdef -t node gpu3 groups=all,gpu arch=x86_64 cons=ipmi ip=162.20.1.160 mac=24:8a:07:a3:ee:2c mgt=ipmi netboot=xnba profile=compute

  5. Add kernel arguments to load the Infiniband drivers during netboot.

    chdef -t node -o gpu3 -p addkcmdline="bootdev=ib0 ksdevice=ib0 net.ifnames=0 biosdevname=0 rd.neednet=1 rd.bootif=0 rd.driver.pre=mlx5_ib,mlx4_ib,ib_ipoib ip=ib0:dhcp rd.net.dhcp.retry=10 rd.net.timeout.iflink=60 rd.net.timeout.ifup=80 rd.net.timeout.carrier=80"

  6. Update the hosts, DNS, and DHCP configuration:

    makehosts
    makedns -n
    makedhcp -n

  7. Assign the image to the node:

    nodeset gpu3 osimage=centos7.4-x86_64-netboot-compute --noupdateinitrd

  8. After the nodeset command has run, the dhcpd.leases file is generated. Edit the entry for the node and add the actual 20-byte MAC address of the IB port on a line below the fixed-address parameter, as shown here:

    fixed-address 162.20.1.160;
    option dhcp-client-identifier= ff:00:00:00:00:00:02:00:00:02:c9:00:24:8a:07:03:00:a3:ee:2c;

  9. Restart the dhcpd service for the change to take effect. Please note that the dhcpd.leases file will then be updated with the actual 20-byte MAC address.

  10. Boot the node with Flexboot as its first boot option.
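
The 20-byte value in step 8 can be read as a fixed 12-byte prefix followed by the 8-byte port GUID of the adapter. Below is a minimal sketch of building it, assuming the prefix seen in this thread also applies to your adapter. The prefix is copied from the example above and is not something xCAT provides. `ib_client_id` is a hypothetical helper name.

```shell
# Hypothetical helper: build the 20-byte DHCP client identifier from step 8
# out of an 8-byte Infiniband port GUID.
# Assumption: the 12-byte prefix below is the one shown in this thread.
IB_CLIENT_ID_PREFIX="ff:00:00:00:00:00:02:00:00:02:c9:00"

ib_client_id() {
    # Accept only eight colon-separated hex bytes.
    if ! printf '%s\n' "$1" | grep -Eq '^([0-9A-Fa-f]{2}:){7}[0-9A-Fa-f]{2}$'; then
        echo "ib_client_id: expected an 8-byte GUID such as 24:8a:07:03:00:a3:ee:2c" >&2
        return 1
    fi
    # Lower-case the GUID and append it to the prefix.
    printf '%s:%s\n' "$IB_CLIENT_ID_PREFIX" "$(printf '%s' "$1" | tr 'A-F' 'a-f')"
}

# Example with the port GUID of the node used in this thread.
ib_client_id 24:8a:07:03:00:a3:ee:2c
```

The port GUID itself has to come from the adapter. Tools such as `ibstat` report it, typically as a hex number without colons, so it may need reformatting before it is passed to the helper.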

The configuration files are attached for your reference.

Please let me know if this works fine in your scenario.

Thanks

BoIB_xcat.txt

immarvin commented 5 years ago

THANKS A LOT!! @abhijitshintre11

immarvin commented 5 years ago

One question: why is the "Disabled UEFI mode" step needed? Is it mandatory? Are UEFI and Flexboot mutually exclusive options?

jjohnson42 commented 5 years ago

FYI, see my comments in https://github.com/xcat2/xcat2-task-management/issues/573

We did boot over Infiniband using the 8-byte port GUID and the 6-byte 'fake ethernet' address that Mellanox provides.

We did our testing using UEFI boot, but my understanding is the non-UEFI mode works in the same fashion.

Our instructions for OPA install are similar: https://hpc.lenovo.com/users/documentation/el7opainstall.html

But Omni-Path uses the same hwaddr in PXE and in the OS, whereas Mellanox changes from 6 to 8 bytes between firmware and OS, which led us to sidestep the problem with static address mode.
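
For the adapter discussed in this issue, the 6-byte address and the 8-byte port GUID differ only in the two middle bytes (`24:8a:07:a3:ee:2c` versus `24:8a:07:03:00:a3:ee:2c`). Here is a small sketch of that mapping, assuming the same pattern holds on other cards, which this thread does not confirm. `guid_to_mac` is a hypothetical helper name.

```shell
# Hypothetical helper: derive the 6-byte 'fake ethernet' address from an
# 8-byte port GUID by dropping bytes 4 and 5.
# Assumption: the pattern observed for the node in this issue applies.
guid_to_mac() {
    # Accept only eight colon-separated hex bytes.
    if ! printf '%s\n' "$1" | grep -Eq '^([0-9A-Fa-f]{2}:){7}[0-9A-Fa-f]{2}$'; then
        echo "guid_to_mac: expected an 8-byte GUID" >&2
        return 1
    fi
    # Keep bytes 1-3 and 6-8; cut reuses ':' as the output delimiter.
    printf '%s\n' "$1" | cut -d: -f1-3,6-8
}

# Example with the port GUID of the node used in this thread.
guid_to_mac 24:8a:07:03:00:a3:ee:2c
```

With the values from this issue, the result matches the `mac=` attribute given to `mkdef`.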

abhijitshintre11 commented 5 years ago

It's not mandatory to disable UEFI mode. To make it work in UEFI mode, you need to burn the appropriate UEFI firmware onto the IB card.

jjohnson42 commented 5 years ago

What do you think of the strategy of using the port guid rather than the 20 byte address?

To support that, all that's needed to accommodate Omni-Path is: https://github.com/xcat2/xcat-core/pull/5976/files

To support EDR IB, we would need to either limit it to static addressing (which already works today without changes to dhcp.pm) or extend dhcp.pm to emit multiple host declarations for the 6-byte and 8-byte forms, to deal with the difference between PXE and the OS.
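
One possible shape of the "multiple host declarations" idea is sketched below. It is written as plain ISC dhcpd syntax and is not actual dhcp.pm output. The helper name `print_ib_hosts`, the `-mac` and `-clientid` host name suffixes, and the choice to key the second declaration on a client identifier carrying the port GUID are all assumptions made for illustration.

```shell
# Hypothetical sketch: print two ISC dhcpd host declarations for one node,
# one matched on the 6-byte address and one matched on a client identifier.
# Host names and layout are assumptions, not what dhcp.pm generates.
print_ib_hosts() {
    ib_node="$1"; ib_ip="$2"; ib_mac="$3"; ib_client_id="$4"
    cat <<EOF
host ${ib_node}-mac {
    hardware ethernet ${ib_mac};
    fixed-address ${ib_ip};
}
host ${ib_node}-clientid {
    option dhcp-client-identifier ${ib_client_id};
    fixed-address ${ib_ip};
}
EOF
}

# Example with the addresses quoted earlier in this issue.
print_ib_hosts gpu3 162.20.1.160 24:8a:07:a3:ee:2c \
    ff:00:00:00:00:00:02:00:00:02:c9:00:24:8a:07:03:00:a3:ee:2c
```

Both declarations hand out the same fixed address, so the node would keep its IP whichever identifier it presents.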

abhijitshintre11 commented 5 years ago

Hello, can I define a node in xCAT with two MAC addresses, so that if it doesn't boot over Infiniband it can boot using the Ethernet MAC address?

viniciusferrao commented 1 year ago

Is there any modern solution for Infiniband booting? Without having to manually edit dhcpd.leases?

besawn commented 1 year ago

I have not been using boot over IB in any of my test environments.

@samveen @kcgthb @banuchka: Are any of you booting over IB in any of your environments? If so, do you have any advice you can share with @viniciusferrao and the rest of the community?

kcgthb commented 1 year ago

I unfortunately don't have experience with booting nodes over IB, and always try to ensure that systems will have a simple Ethernet management network they can boot from.

viniciusferrao commented 1 year ago

> I unfortunately don't have experience with booting nodes over IB, and always try to ensure that systems will have a simple Ethernet management network they can boot from.

Yeah, I also require an Ethernet card on my projects. But you know, I didn't specify this machine.

Well, for now I will edit the DHCP file directly. 😢

samveen commented 1 year ago

Me neither. Jarrod (@jjohnson42 ) would probably be the best person to answer (he's been building xCAT since before I've been using it).

viniciusferrao commented 1 year ago

@jjohnson42 will probably say it's supported on Confluent! Kidding, but it may actually be true... 😂