ComputeCanada / software-stack-config

Interconnect type detection is inaccurate #90

Closed dmagdavector closed 2 months ago

dmagdavector commented 2 months ago

In the SitePackage.lua file there is a check for a particular directory:
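Roughly, a check of that kind could look like the sketch below; the exact path (/sys/module/ib_core) and the use of LuaFileSystem are assumptions for illustration, not the literal SitePackage.lua code:

local lfs = require("lfs")

-- Assumed illustration: treat the presence of the ib_core module directory
-- as a sign that the node has an InfiniBand interconnect.
local have_ib = (lfs.attributes("/sys/module/ib_core", "mode") == "directory")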

But the presence of that directory, which only indicates that a particular kernel module is loaded, does not necessarily mean that the network is using InfiniBand. That module (ib_core) is loaded by the Mellanox driver regardless of the link layer type, even when the Mellanox NICs are Ethernet:

# ibstatus   
Infiniband device 'mlx5_0' port 1 status:
    default gid:     fe80:0000:0000:0000:[redacted]
    base lid:    0x0
    sm lid:      0x0
    state:       4: ACTIVE
    phys state:  5: LinkUp
    rate:        50 Gb/sec (1X HDR)
    link_layer:  Ethernet
[…]

NVIDIA's documentation suggests a different way to check the network type:

# cat /sys/class/infiniband/mlx5_0/ports/1/link_layer
Ethernet
mboisson commented 2 months ago

Hum, that would not work on our clusters. The content and the existence of

/sys/class/infiniband/mlx5_0/ports/1/link_layer

on the login nodes are quite variable... I found one cluster that has Ethernet in that file, two that have InfiniBand, and one for which that file does not exist on the login nodes, even though all of them have an InfiniBand interconnect.

mboisson commented 2 months ago

However, this function is only used to define RSNT_INTERCONNECT when it is not already defined. If you set RSNT_INTERCONNECT=Ethernet before our profile scripts are sourced, they will honour the value you have set and not try to detect it.
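In Lua terms, that guard pattern looks roughly like the sketch below; detect_interconnect is a hypothetical stand-in for the detection code, not a function from SitePackage.lua:

-- Illustrative only: respect a pre-set RSNT_INTERCONNECT, detect otherwise.
local interconnect = os.getenv("RSNT_INTERCONNECT")
if interconnect == nil then
    -- detect_interconnect() stands in for the actual detection logic.
    interconnect = detect_interconnect()
end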

mboisson commented 2 months ago

To account for multi-port cards and different generations, we would probably need to do something like

grep InfiniBand /sys/class/infiniband/*/ports/*/link_layer

to figure out if there is InfiniBand on the node.
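A sketch of that logic in Lua, assuming LuaFileSystem is available (this is an illustration, not the actual SitePackage.lua code), could look like:

local lfs = require("lfs")

-- Return true if at least one InfiniBand port is found under
-- /sys/class/infiniband, i.e. the same files the grep one-liner inspects.
local function has_infiniband()
    local base = "/sys/class/infiniband"
    if lfs.attributes(base, "mode") ~= "directory" then
        return false
    end
    for dev in lfs.dir(base) do
        if dev ~= "." and dev ~= ".." then
            local ports = base .. "/" .. dev .. "/ports"
            if lfs.attributes(ports, "mode") == "directory" then
                for port in lfs.dir(ports) do
                    if port ~= "." and port ~= ".." then
                        local f = io.open(ports .. "/" .. port .. "/link_layer", "r")
                        if f then
                            local link_layer = f:read("*l")
                            f:close()
                            if link_layer and link_layer:match("InfiniBand") then
                                return true
                            end
                        end
                    end
                end
            end
        end
    end
    return false
end

This walks every device and every port, so a node with mixed Ethernet and InfiniBand ports would still be reported as having InfiniBand, which matches the intent of the grep above.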