open-mpi / ompi

Open MPI main development repository
https://www.open-mpi.org

Troubleshooting ib on Azure hc44rs #6723

Closed gcormier closed 5 years ago

gcormier commented 5 years ago

Background information

What version of Open MPI are you using? (e.g., v1.10.3, v2.1.0, git branch name and hash, etc.)

2.1.1

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

Ubuntu 18.04LTS default repository

Please describe the system on which you are running

ibv_devinfo
hca_id: mlx5_0
        transport:              InfiniBand (0)
        fw_ver:                 16.23.1020
        node_guid:              0015:5dff:fe33:ff5f
        sys_image_guid:         9803:9b03:000c:6d1e
        vendor_id:              0x02c9
        vendor_part_id:         4120
        hw_ver:                 0x0
        board_id:               MT_0000000010
        phys_port_cnt:          1
                port:   1
                        state:          PORT_ACTIVE (4)
                        max_mtu:        4096 (5)
                        active_mtu:     4096 (5)
                        sm_lid:         1
                        port_lid:       780
                        port_lmc:       0x00
                        link_layer:     InfiniBand


-----------------------------

I'm trying to run a simple benchmark (IMB) to confirm that the new HC44rs series works well in Azure. Previously I had success with the H16r series and the CentOS 7.6 HPC image (which includes drivers for IB support), using the MPICH and Intel runtimes, although it was a bit finicky. The newer releases support SR-IOV and should work across all MPI platforms.

Everything is fully templated using ansible and terraform so it is quite easy to reproduce and make changes.

The latest issue is that when I launch a job, I get no output.

1. I have edited `/usr/share/openmpi/mca-btl-openib-device-params.ini` to add support for this device. All I did was add 4120 to the part_id list.

[Mellanox ConnectX5]
vendor_id = 0x2c9,0x5ad,0x66a,0x8f1,0x1708,0x03ba,0x15b3,0x119f
vendor_part_id = 4119,4120,4121
use_eager_rdma = 1
mtu = 4096
max_inline_data = 256
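For anyone repeating this step, here is a minimal sketch of how the part ID could be read from `ibv_devinfo` and checked against the INI file before editing it. The function names `part_id_from_devinfo` and `part_id_listed` are made up for illustration, and the sketch assumes the usual multi-line `ibv_devinfo` output and one `vendor_part_id = ...` line per INI section.

```shell
#!/bin/bash
# Hypothetical helpers, not part of Open MPI.

# Print the first vendor_part_id found in ibv_devinfo output read from stdin.
part_id_from_devinfo() {
    awk '$1 == "vendor_part_id:" { print $2; exit }'
}

# Succeed if part ID $1 appears in any vendor_part_id list of INI file $2.
part_id_listed() {
    grep -E '^[[:space:]]*vendor_part_id[[:space:]]*=' "$2" |
        sed -E 's/^[^=]*=//; s/[[:space:]]//g' |   # keep only the comma list
        tr ',' '\n' |                               # one ID per line
        grep -qxF -- "$1"                           # exact match only
}

# Example use on a node with an HCA (left commented out here):
#   id=$(ibv_devinfo | part_id_from_devinfo)
#   part_id_listed "$id" /usr/share/openmpi/mca-btl-openib-device-params.ini \
#       || echo "part ID $id is not listed"
```

The exact-match `grep -x` matters: a substring match would report 412 as present whenever 4120 is.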



I have pasted the output of this command

`mpirun -debug-daemons -d -mca plm_base_verbose 5 -mca oob_base_verbose 5 -N 44 -mca btl_base_verbose 1 ./IMB-MPI1 `

https://termbin.com/f5eh

I'm expecting to see output from running this, but I see nothing. I'm starting on a single node, since that setup produced output for me before. Note that if I spin up a regular instance type that does not have InfiniBand, I get the output as expected.

I'm fairly new to MPI so please excuse my level of knowledge!
gcormier commented 5 years ago

Packages I am installing on a new image

      - cmake
      - git
      - makedepf90
      - gfortran
      - gcc
      - libnetcdf-dev
      - libnetcdff-dev
      - netcdf-bin
      - openmpi-bin
      - openmpi-common
      - libopenmpi-dev
      - libhdf5-openmpi-dev
      - patch
      - htop
      - iptraf-ng

Compiling IMB

git clone https://github.com/intel/opa-mpi-apps/
cd opa-mpi-apps/MpiApps/apps/imb/src
make CC=mpicc
gcormier commented 5 years ago

Any thoughts or ideas? Do you need additional information?

It would appear the task is running, just no output.

hpc@hpc-fvcom-vm1:~/fvcom/_run$ mpirun --debug-daemons -np 4 ./IMB-MPI1
[hpc-fvcom-vm1:27694] [[56012,0],0] orted_cmd: received add_local_procs
  MPIR_being_debugged = 0
  MPIR_debug_state = 1
  MPIR_partial_attach_ok = 1
  MPIR_i_am_starter = 0
  MPIR_forward_output = 0
  MPIR_proctable_size = 4
  MPIR_proctable:
    (i, host, exe, pid) = (0, hpc-fvcom-vm1, /home/hpc/fvcom/_run/./IMB-MPI1, 27699)
    (i, host, exe, pid) = (1, hpc-fvcom-vm1, /home/hpc/fvcom/_run/./IMB-MPI1, 27700)
    (i, host, exe, pid) = (2, hpc-fvcom-vm1, /home/hpc/fvcom/_run/./IMB-MPI1, 27701)
    (i, host, exe, pid) = (3, hpc-fvcom-vm1, /home/hpc/fvcom/_run/./IMB-MPI1, 27704)
MPIR_executable_path: NULL
MPIR_server_arguments: NULL
gcormier commented 5 years ago

This works and displays output as expected.

mpirun --mca btl self,tcp ./IMB-MPI1

So it would appear I need to switch gears now to debugging IB functionality.

hpc@hpc-fvcom-vm1:~/fvcom/_run$ mpirun --debug-daemons -np 8 --host 10.10.1.4:4,10.10.1.5:4 --mca btl_openib_verbose 9 --mca btl self,vader,openib,tcp ./IMB-MPI1 
Daemon [[1802,0],1] checking in as pid 41834 on host hpc-fvcom-vm2
[hpc-fvcom-vm2:41834] [[1802,0],1] orted: up and running - waiting for commands!
[hpc-fvcom-vm1:45544] [[1802,0],0] orted_cmd: received add_local_procs
[hpc-fvcom-vm2:41834] [[1802,0],1] orted_cmd: received tree_spawn
[hpc-fvcom-vm2:41834] [[1802,0],1] orted_cmd: received add_local_procs
  MPIR_being_debugged = 0
  MPIR_debug_state = 1
  MPIR_partial_attach_ok = 1
  MPIR_i_am_starter = 0
  MPIR_forward_output = 0
  MPIR_proctable_size = 8
  MPIR_proctable:
    (i, host, exe, pid) = (0, hpc-fvcom-vm1, /home/hpc/fvcom/_run/./IMB-MPI1, 45554)
    (i, host, exe, pid) = (1, hpc-fvcom-vm1, /home/hpc/fvcom/_run/./IMB-MPI1, 45555)
    (i, host, exe, pid) = (2, hpc-fvcom-vm1, /home/hpc/fvcom/_run/./IMB-MPI1, 45556)
    (i, host, exe, pid) = (3, hpc-fvcom-vm1, /home/hpc/fvcom/_run/./IMB-MPI1, 45559)
    (i, host, exe, pid) = (4, 10.10.1.5, /home/hpc/fvcom/_run/./IMB-MPI1, 41838)
    (i, host, exe, pid) = (5, 10.10.1.5, /home/hpc/fvcom/_run/./IMB-MPI1, 41839)
    (i, host, exe, pid) = (6, 10.10.1.5, /home/hpc/fvcom/_run/./IMB-MPI1, 41840)
    (i, host, exe, pid) = (7, 10.10.1.5, /home/hpc/fvcom/_run/./IMB-MPI1, 41843)
MPIR_executable_path: NULL
MPIR_server_arguments: NULL
[hpc-fvcom-vm1][[1802,1],0][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm1][[1802,1],0][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
[hpc-fvcom-vm1][[1802,1],0][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x0000, part ID 0
[hpc-fvcom-vm1][[1802,1],0][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: default
[hpc-fvcom-vm1][[1802,1],1][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm1][[1802,1],1][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
[hpc-fvcom-vm1][[1802,1],1][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x0000, part ID 0
[hpc-fvcom-vm1][[1802,1],1][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: default
[hpc-fvcom-vm1][[1802,1],2][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm1][[1802,1],2][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
[hpc-fvcom-vm1][[1802,1],2][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x0000, part ID 0
[hpc-fvcom-vm1][[1802,1],2][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: default
[hpc-fvcom-vm1][[1802,1],3][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm1][[1802,1],3][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
[hpc-fvcom-vm1][[1802,1],3][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x0000, part ID 0
[hpc-fvcom-vm1][[1802,1],3][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: default
[hpc-fvcom-vm2][[1802,1],4][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm2][[1802,1],4][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
[hpc-fvcom-vm2][[1802,1],4][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x0000, part ID 0
[hpc-fvcom-vm2][[1802,1],4][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: default
[hpc-fvcom-vm2][[1802,1],5][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm2][[1802,1],5][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
[hpc-fvcom-vm2][[1802,1],5][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x0000, part ID 0
[hpc-fvcom-vm2][[1802,1],5][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: default
[hpc-fvcom-vm2][[1802,1],6][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm2][[1802,1],6][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
[hpc-fvcom-vm2][[1802,1],6][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x0000, part ID 0
[hpc-fvcom-vm2][[1802,1],6][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: default
[hpc-fvcom-vm2][[1802,1],7][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm2][[1802,1],7][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
[hpc-fvcom-vm2][[1802,1],7][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x0000, part ID 0
[hpc-fvcom-vm2][[1802,1],7][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: default
[hpc-fvcom-vm1][[1802,1],0][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm1][[1802,1],0][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
[hpc-fvcom-vm1][[1802,1],1][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm1][[1802,1],1][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
[hpc-fvcom-vm1][[1802,1],1][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm1][[1802,1],1][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
[hpc-fvcom-vm1][[1802,1],2][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm1][[1802,1],2][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
[hpc-fvcom-vm1][[1802,1],2][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm1][[1802,1],2][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
[hpc-fvcom-vm1][[1802,1],3][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm1][[1802,1],3][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
[hpc-fvcom-vm1][[1802,1],3][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm1][[1802,1],3][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
[hpc-fvcom-vm1][[1802,1],2][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm1][[1802,1],2][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
[hpc-fvcom-vm1][[1802,1],3][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm1][[1802,1],3][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
[hpc-fvcom-vm1][[1802,1],3][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm1][[1802,1],3][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
[hpc-fvcom-vm2][[1802,1],5][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm2][[1802,1],5][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
[hpc-fvcom-vm2][[1802,1],5][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm2][[1802,1],5][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
[hpc-fvcom-vm2][[1802,1],6][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm2][[1802,1],6][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
[hpc-fvcom-vm2][[1802,1],6][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm2][[1802,1],6][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
[hpc-fvcom-vm2][[1802,1],6][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm2][[1802,1],6][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
[hpc-fvcom-vm2][[1802,1],4][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm2][[1802,1],4][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
[hpc-fvcom-vm2][[1802,1],7][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm2][[1802,1],7][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
[hpc-fvcom-vm2][[1802,1],7][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm2][[1802,1],7][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
[hpc-fvcom-vm2][[1802,1],7][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm2][[1802,1],7][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
[hpc-fvcom-vm2][[1802,1],7][btl_openib_ini.c:173:opal_btl_openib_ini_query] Querying INI files for vendor 0x02c9, part ID 4120
[hpc-fvcom-vm2][[1802,1],7][btl_openib_ini.c:192:opal_btl_openib_ini_query] Found corresponding INI values: Mellanox ConnectX5
*hangs here*
jsquyres commented 5 years ago

I'm afraid I know nothing about the Azure platform, but have you tried updating to Open MPI v4.x with Open UCX / the ucx PML (vs. the ob1 PML + the openib BTL)?

gcormier commented 5 years ago

I can give that a shot when I have a few cycles! I will report back.

jladd-mlnx commented 5 years ago

@gcormier Please use UCX PML on Azure HPC platforms. It is well tested and gives best performance.

gcormier commented 5 years ago

@jladd-mlnx I'll give it a shot. Sorry for the delay - IB devices weren't appearing on hc44rs, but I just tried provisioning one and it seems to be back to normal so I can spend some time on this now.

gcormier commented 5 years ago

Very nice! Working!

Some useful things below in case others stumble on this. Most can be found at https://github.com/gcormier/hpc-fvcom/tree/master/azure

Run MPI

mpirun -npernode 44 -mca pml ucx --mca btl ^vader,tcp,openib -x UCX_IB_PKEY=$UCX_IB_PKEY --hostfile ~/hosts IMB-MPI1 sendrecv

user_limits.sh

cat << EOF | sudo tee -a /etc/security/limits.conf
*               hard    memlock         unlimited
*               soft    memlock         unlimited
*               hard    nofile          65535
*               soft    nofile          65535
EOF
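
Changes to limits.conf only apply to new login sessions, so it can help to confirm them after logging back in. A minimal sketch, assuming bash; `limit_ok` and `report_limit` are hypothetical helper names, and the target values are the ones from the snippet above.

```shell
#!/bin/bash
# Succeed if the actual limit $1 meets the wanted limit $2.
# "unlimited" satisfies any request; a finite value never satisfies "unlimited".
limit_ok() {
    [ "$1" = "unlimited" ] && return 0
    [ "$2" = "unlimited" ] && return 1
    [ "$1" -ge "$2" ]
}

# Print a one-line verdict for limit name $1, actual value $2, wanted value $3.
report_limit() {
    if limit_ok "$2" "$3"; then
        echo "$1: ok ($2)"
    else
        echo "$1: too low ($2, want $3)"
    fi
}

# Check the soft limits of the current shell session.
report_limit memlock "$(ulimit -l)" unlimited
report_limit nofile  "$(ulimit -n)" 65535
```

If either line reports "too low" right after editing the file, logging out and back in is the first thing to try.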

set_pkey.sh

#!/bin/bash
# Take the highest partition key exposed by the HCA and flip its top bit (0x8000).
high_key=$(sort -r /sys/class/infiniband/mlx5_0/ports/1/pkeys/* | head -1)
modified_key=$(printf '0x%04X\n' "$((high_key ^ 0x8000))")

echo "Setting UCX_IB_PKEY to $modified_key"
# The export only reaches the calling shell if this script is sourced.
export UCX_IB_PKEY=$modified_key

echo "Updating /etc/profile.d/ucx_pkey.sh"
echo "export UCX_IB_PKEY=$modified_key" | sudo tee -a /etc/profile.d/ucx_pkey.sh
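
To make the key arithmetic in set_pkey.sh easier to follow, here is the same XOR step on its own. `ucx_pkey_from` is a hypothetical name and both input keys are invented for illustration; they were not read from a real node.

```shell
#!/bin/bash
# Flip bit 15 (0x8000) of a 16-bit partition key given in hex,
# using the same expression as set_pkey.sh.
ucx_pkey_from() {
    printf '0x%04X\n' "$(( $1 ^ 0x8000 ))"
}

ucx_pkey_from 0x800b   # top bit set   -> prints 0x000B
ucx_pkey_from 0x7fff   # top bit clear -> prints 0xFFFF
```

Because XOR toggles the bit rather than clearing it, the script only strips the top bit when the highest key already has it set, which appears to be the case the script is written for.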
jladd-mlnx commented 5 years ago

@gcormier Great!!

jsquyres commented 5 years ago

@jladd-mlnx Should this be added to the OMPI FAQ as well?

gcormier commented 5 years ago

As someone who knows next to nothing in this world, it would have been useful to have a "zero to hero" script that takes a fresh instance and runs pingpong over IB on Azure.

I would suggest such a script could be made by combining https://github.com/gcormier/hpc-fvcom/blob/master/azure/packer-ubuntu.sh with the tips above (a few logouts are required along the way).