Open NicholasRasi opened 4 years ago
Hello @NicholasRasi, thank you for opening this issue.
Is there any other way to mount the device without sudo?
We are looking into this behavior and will let you know more as soon as possible.
If I use OpenMPI I need the SSH hook, am I right?
The error you are getting is related to the Bash syntax of your command. If I'm understanding things correctly, the $SLURM_PROCID variable is not defined and the -eq operator returns an error because it expects two operands. This happens because of sudo, which by default does not preserve environment variables; to do so you should use the -E option (see for reference the sudo manpage).
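To illustrate the point about the undefined variable, here is a minimal sketch. The function name rank_status is hypothetical and not part of the original command; the script only simulates the variable being dropped from the environment and shows a quoted test that does not break when it is missing.

```shell
#!/bin/bash
# Hypothetical helper (not from the thread): classify the current rank safely.
rank_status() {
    if [ -z "${SLURM_PROCID:-}" ]; then
        echo "unset"          # variable missing, e.g. not preserved by sudo
    elif [ "$SLURM_PROCID" -eq 0 ]; then
        echo "rank0"
    else
        echo "other"
    fi
}

# Simulate an environment where the variable was not preserved.
unset SLURM_PROCID

# Unquoted, the test below expands to `[ -eq 0 ]`, which bash rejects.
if [ $SLURM_PROCID -eq 0 ] 2>/dev/null; then
    echo "unquoted test: true"
else
    echo "unquoted test: failed or false"
fi

rank_status        # variable is unset here
SLURM_PROCID=0
rank_status        # variable is now defined
```

Quoting the variable and checking for emptiness first keeps the comparison from failing with a syntax error, but it does not replace passing the variable through sudo with -E.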
Also I believe that you are missing the -hostfile option to mpirun within the container, to inform the launcher of the available hosts.
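As a rough sketch of how such a host file could be produced, the snippet below turns a list of hostnames into Open MPI hostfile lines. The function make_hostfile is a hypothetical helper, the slot count is an assumption, and the commented Slurm pipeline assumes scontrol is available inside the allocation; adapt it to your setup.

```shell
#!/bin/bash
# Hypothetical helper: read hostnames (one per line) on stdin and print
# Open MPI hostfile lines of the form "<host> slots=<n>".
make_hostfile() {
    local slots="${1:-1}"     # slots per host; 1 is an assumed default
    local host
    while IFS= read -r host; do
        if [ -n "$host" ]; then
            echo "$host slots=$slots"
        fi
    done
}

# Inside a Slurm allocation the hostnames could come from scontrol, e.g.:
#   scontrol show hostnames "$SLURM_JOB_NODELIST" | make_hostfile 1 > hostfile
#   mpirun -hostfile hostfile ...
# Standalone demonstration with the two node names seen in this thread:
printf 'worker1\nworker2\n' | make_hostfile 1
```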
More generally, it is not necessary to use the SSH hook in conjunction with OpenMPI. The cookbook page you are referring to shows how the SSH hook could be used to enable OpenMPI communication, but there are other possibilities.
As an example, if you want to run with the MPI stack from the container image, you could leverage the PMI2 process management interface, which Sarus is able to propagate into containers. You may find more information about this approach here.
Hello @Madeeks, thank you for your reply.
-E option and the host file. By the way, if I launch:
salloc -N 2 --cpus-per-task 60
srun sudo -E /opt/sarus/1.3.0-Release/bin/sarus run --ssh \
--mount=src=/home/user,dst=/home/user,type=bind \
--mount=src=/dev/infiniband/uverbs0,dst=/dev/infiniband/uverbs0,type=bind \
nichr/hpc-bench:v2 echo $SLURM_PROCID
the execution hangs (while it does not without -E).
On the other hand, I tried to run the following bash script:
#!/bin/bash
#SBATCH --job-name=osu_sarus
#SBATCH --nodes=2
#SBATCH --tasks-per-node=1
#SBATCH --time=00:10:00
#SBATCH --output=res_mpi.txt
#SBATCH --err=err_mpi.txt
#SBATCH --partition=hpc
module purge
module load mpi/openmpi
mpirun --map-by node -mca pml ucx --mca btl ^vader,tcp,openib -x UCX_NET_DEVICES=mlx5_0:1 -x UCX_IB_PKEY=$UCX_IB_PKEY \
sudo /opt/sarus/1.3.0-Release/bin/sarus run \
--mount=src=/dev/infiniband/uverbs0,dst=/dev/infiniband/uverbs0,type=bind \
nichr/hpc-bench:v2 \
/opt/benchmarks/mpiBench/mpiBench -e 1K
The execution completed giving the following result:
$ cat res_mpi.txt
START mpiBench v1.5
0 : worker1
Barrier Bytes: 0 Iters: 1000 Avg: 0.0061 Min: 0.0061 Max: 0.0061 Comm: MPI_COMM_WORLD Ranks: 1
Bcast Bytes: 0 Iters: 1000 Avg: 0.0138 Min: 0.0138 Max: 0.0138 Comm: MPI_COMM_WORLD Ranks: 1
...
Allgatherv Bytes: 1024 Iters: 1000 Avg: 0.0156 Min: 0.0156 Max: 0.0156 Comm: MPI_COMM_WORLD Ranks: 1
START mpiBench v1.5
0 : worker2
Barrier Bytes: 0 Iters: 1000 Avg: 0.0062 Min: 0.0062 Max: 0.0062 Comm: MPI_COMM_WORLD Ranks: 1
Bcast Bytes: 0 Iters: 1000 Avg: 0.0338 Min: 0.0338 Max: 0.0338 Comm: MPI_COMM_WORLD Ranks: 1
Bcast Bytes: 1 Iters: 1000 Avg: 0.0339 Min: 0.0339 Max: 0.0339 Comm: MPI_COMM_WORLD Ranks: 1
...
Reduce Bytes: 1024 Iters: 1000 Avg: 0.0586 Min: 0.0586 Max: 0.0586 Comm: MPI_COMM_WORLD Ranks: 1
Message buffers (KB): 2
END mpiBench
Message buffers (KB): 2
END mpiBench
$ cat err_mpi.txt
--------------------------------------------------------------------------
WARNING: No preset parameters were found for the device that Open MPI
detected:
Local host: worker1
Device name: mlx5_0
Device vendor ID: 0x02c9
Device vendor part ID: 4120
Default device parameters will be used, which may result in lower
performance. You can edit any of the files specified by the
btl_openib_device_param_files MCA parameter to set values for your
device.
NOTE: You can turn off this warning by setting the MCA parameter
btl_openib_warn_no_device_params_found to 0.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: No preset parameters were found for the device that Open MPI
detected:
Local host: worker2
Device name: mlx5_0
Device vendor ID: 0x02c9
Device vendor part ID: 4120
Default device parameters will be used, which may result in lower
performance. You can edit any of the files specified by the
btl_openib_device_param_files MCA parameter to set values for your
device.
NOTE: You can turn off this warning by setting the MCA parameter
btl_openib_warn_no_device_params_found to 0.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.
Local host: worker1
Local device: mlx5_0
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.
Local host: worker2
Local device: mlx5_0
--------------------------------------------------------------------------
As far as I understand the workers do not communicate.
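The reason for this conclusion is that every result line reports "Ranks: 1" even though two tasks were launched. A small check along these lines can confirm it from the output file; check_ranks is a hypothetical helper and assumes the mpiBench line format shown above.

```shell
#!/bin/bash
# Hypothetical helper: read mpiBench output on stdin and succeed only if
# every "Ranks: <n>" field matches the expected communicator size.
check_ranks() {
    awk -v want="$1" '
        {
            for (i = 1; i < NF; i++) {
                if ($i == "Ranks:") {
                    seen = 1
                    if ($(i + 1) != want) bad = 1
                }
            }
        }
        END { exit (seen && !bad) ? 0 : 1 }
    '
}

# Example with one line copied from the results above.
line='Barrier Bytes: 0 Iters: 1000 Avg: 0.0061 Min: 0.0061 Max: 0.0061 Comm: MPI_COMM_WORLD Ranks: 1'
if echo "$line" | check_ranks 2; then
    echo "all ranks joined one communicator"
else
    echo "ranks are running as separate jobs"
fi
```

On a real run the same check would be applied to the output file, for example check_ranks 2 < res_mpi.txt.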
If I launch the application with srun and --mpi=pmi2:
salloc -N 2 --cpus-per-task 60
srun -N2 --mpi=pmi2 sudo /opt/sarus/1.3.0-Release/bin/sarus run \
--mount=src=/dev/infiniband/uverbs0,dst=/dev/infiniband/uverbs0,type=bind \
nichr/hpc-bench:v2 \
/opt/benchmarks/mpiBench/mpiBench -e 1K
I get a similar result.
I also ran a batch script with MVAPICH2 and the Sarus MPI hook:
#!/bin/bash
#SBATCH --job-name=osu_sarus
#SBATCH --nodes=2
#SBATCH --tasks-per-node=1
#SBATCH --time=00:10:00
#SBATCH --output=res_mpi.txt
#SBATCH --err=err_mpi.txt
#SBATCH --partition=hpc
module purge
module load mpi/mvapich2
srun sarus run --mpi \
nichr/hpc-bench:v4 \
/opt/benchmarks/mpiBench/mpiBench -e 1K
I did not get any errors, but the workers are still separated as in the previous result.
My cluster has MVAPICH2 2.3.4, while the guide recommends MVAPICH2 2.2. Do you think this could be a problem? Are the workers separated because Sarus is launched with sudo?
Thank you
Hello, I am trying to run some MPI benchmarks with Sarus containers. In particular I am using OpenMPI 4. The nodes are RDMA capable and have InfiniBand. Everything works fine without the container, and if I run ibv_devinfo on the host I get:
But if I run it inside a container I get "Failed to open device". So I tried to mount the device with a bind mount, but it does not work without sudo:
On the other hand, it works with sudo and the device is recognized inside the container.
1. Is there any other way to mount the device without sudo?
The guide reports that I need to use the SSH hook in order to run OpenMPI. But if I launch sarus with sudo, mount and srun:
I got:
2. If I use OpenMPI I need the SSH hook, am I right?
I have created the container with the following Dockerfile:
I am new to Sarus and the HPC world, thank you for your support!