Closed ickc closed 1 year ago
Copied from email thread:
The minimal reproducible example would be
git clone git@github.com:ickc/htcondor_ex.git
cd htcondor_ex
make download
cd examples/mpi-hello-world
condor_submit mpi.ini
tail -f mpi.out mpi.err
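The contents of `mpi.ini` aren't shown in this thread; for context, a typical HTCondor parallel-universe submit file for a job like this looks roughly as follows (a sketch only — the field names follow standard HTCondor submit syntax, but the specific values and file names are assumptions, not the actual contents of the repo's `mpi.ini`):

```
# Hypothetical HTCondor parallel-universe submit file (sketch)
universe                = parallel
executable              = ../../openmpiscript   # wrapper that launches mpirun
arguments               = mpi_hello_world       # assumed name of the MPI binary
machine_count           = 4
output                  = mpi.out
error                   = mpi.err
log                     = mpi.log
should_transfer_files   = yes
when_to_transfer_output = on_exit
queue
```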
This would emit the 2 errors I copied in the last email. The first error is about the openib interface. This is more a question of whether the system has better interconnects available; if not, then setting OMPI_MCA_btl_base_warn_component_unused=0 would remove it (as suggested by the message itself). The other error is more cryptic and is related to shared memory limits.
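Silencing that first warning can be sketched as below (the environment variable name comes from the OpenMPI warning message itself; the `mpirun` invocation is illustrative):

```shell
# Suppress OpenMPI's "component unused" warning from the openib BTL.
export OMPI_MCA_btl_base_warn_component_unused=0
# Then launch as usual, e.g.:
#   mpirun -np 4 ./mpi_hello_world
```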
Copied from email thread:
With some searches, it seems the 2nd error is also related to networking:
open-mpi/ompi#6084 which references openucx/ucx#3080 and openucx/ucx#3084
Also see https://docs.open-mpi.org/en/v5.0.x/tuning-apps/networking/ib-and-roce.html (which is the documentation for OpenMPI 5; see https://www.open-mpi.org/faq/?category=building for OpenMPI 3). Probably the OpenMPI on the system is built with --with-ucx=...
So probably both errors are related to how the network is configured at Blackett and how to configure MPI to get the most performance from the hardware.
Copied from email thread:
From Robert:
the mpi hello world seems to be working for me now. I still get the shmget errors, but also the hello world messages:
-bash-4.2$ cat mpi.out
[1689171404.032094] [wn5917090:121332:0] sys.c:618 UCX ERROR shmget(size=2097152 flags=0xfb0) for mm_recv_desc failed: Operation not permitted, please check shared memory limits by 'ipcs -l'
[1689171404.033081] [wn5916340:101461:0] sys.c:618 UCX ERROR shmget(size=2097152 flags=0xfb0) for mm_recv_desc failed: Operation not permitted, please check shared memory limits by 'ipcs -l'
[1689171404.061509] [wn5914340:195243:0] sys.c:618 UCX ERROR shmget(size=2097152 flags=0xfb0) for mm_recv_desc failed: Operation not permitted, please check shared memory limits by 'ipcs -l'
[1689171404.062160] [wn5914340:195244:0] sys.c:618 UCX ERROR shmget(size=2097152 flags=0xfb0) for mm_recv_desc failed: Operation not permitted, please check shared memory limits by 'ipcs -l'
Hello world from processor wn5914340.in.tier2.hep.manchester.ac.uk, rank 3 out of 4 processors
Hello world from processor wn5914340.in.tier2.hep.manchester.ac.uk, rank 0 out of 4 processors
Hello world from processor wn5916340.in.tier2.hep.manchester.ac.uk, rank 1 out of 4 processors
Hello world from processor wn5917090.in.tier2.hep.manchester.ac.uk, rank 2 out of 4 processors
Here's what I did:
1) cp /usr/share/doc/condor-9.0.17/examples/openmpiscript $HOME
2) add after line 59 (#MPDIR=/usr/lib64/openmpi):
   . /etc/profile.d/modules.sh
   module load mpi/openmpi3-x86_64
   MPDIR=$MPI_HOME
3) change mpi.ini to point to $HOME/openmpiscript instead of ../../openmpiscript
4) submit job
The openmpiscript script in the GitHub repo is old, so it might not work properly any more. Step 2 is required because the condor configuration on the WNs still points to openmpi 1.x. I think it's better to use the module than to rely on the configuration on the WNs. You could also try to load the mpich2 module there. I haven't tried it interactively yet; this will have to wait until next week.
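The edit described in step 2 above amounts to something like the following patch to `openmpiscript` (the exact surrounding context is an assumption based on the description; only the commented-out `MPDIR` line is quoted in the thread):

```diff
 # around line 59 of the stock openmpiscript:
 #MPDIR=/usr/lib64/openmpi
+# Load the OpenMPI 3 module rather than relying on the WN condor
+# configuration, which still points at openmpi 1.x:
+. /etc/profile.d/modules.sh
+module load mpi/openmpi3-x86_64
+MPDIR=$MPI_HOME
```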
Copied from email thread:
The warnings/errors above are not about not being able to run MPI jobs. I have a modified version of that script that works: https://github.com/ickc/htcondor_ex/blob/main/src/openmpiscript The errors and warnings come alongside a successfully run MPI job. Both are related to network configuration. Copied from an earlier email:
The first error is about the openib interface. This is more a question of whether the system has better interconnects available; if not, then setting OMPI_MCA_btl_base_warn_component_unused=0 would remove it (as suggested by the message itself). The other error is more cryptic and is related to shared memory limits.
With some searches, it seems the 2nd error is also related to networking: https://github.com/open-mpi/ompi/issues/6084 which references https://github.com/openucx/ucx/pull/3080 and https://github.com/openucx/ucx/issues/3084 Also see https://docs.open-mpi.org/en/v5.0.x/tuning-apps/networking/ib-and-roce.html (which is the documentation for OpenMPI 5; see https://www.open-mpi.org/faq/?category=building for OpenMPI 3). Probably the OpenMPI on the system is built with --with-ucx=...
So probably both errors are related to how the network is configured at Blackett and how to configure MPI to get the most performance from the hardware.
The problem with the shared memory error is that ucx is trying to use kernel hugepages which requires elevated permissions, see https://github.com/openucx/ucx/issues/3023. The merge mentioned in the ticket fixes this by hiding the error if it's due to permission problems (EPERM), but the version available in C7 doesn't have this patch. I wonder if there's a way of disabling the relevant ucx module on the command line or with environment variables.
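On disabling the relevant UCX transport from the environment: UCX does expose knobs for this, though the following is a sketch based on general UCX documentation, and the exact variables honored may vary with the (old) UCX version shipped in C7:

```shell
# Exclude the SysV shared-memory transport (the one calling shmget with
# hugepage flags) from UCX's transport list; "^" negates the list.
export UCX_TLS=^sysv
# Alternatively, keep sysv but tell it not to attempt hugepage allocation
# (UCX_SYSV_HUGETLB_MODE accepts y/n/try; "try" is the usual default).
export UCX_SYSV_HUGETLB_MODE=n
```

Running `ucx_info -c -f` on a WN would list the variables the installed UCX actually supports.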
@rwf14f, thanks! I think we can safely ignore the 2nd error (which is more like a warning) for now as it is already fixed in newer versions.
How about the 1st warning? It is related to the available network interface. If there's no other interface available at Blackett then we can just set export OMPI_MCA_btl_base_warn_component_unused=0.
OpenMPI tries all available communication modules to check what's available on a machine. The first warning is caused by the openib module, which manages communication/transfers over InfiniBand networks. As we do not have any InfiniBand devices, you get that warning. Either ignore it or disable it with the environment variable. Afaik, OpenMPI has options to explicitly tell it which modules it should or should not try. This would be another way of avoiding those warnings.
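Explicitly selecting BTL modules can be sketched with the standard `OMPI_MCA_<framework>` environment-variable syntax (the transport names in the second line are assumptions about what is actually present on the WNs):

```shell
# Exclude the openib BTL entirely so OpenMPI never probes InfiniBand;
# "^" negates the list that follows.
export OMPI_MCA_btl=^openib
# Or instead whitelist only the transports assumed to be present
# (TCP, shared memory, loopback):
#   export OMPI_MCA_btl=tcp,vader,self
```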
Thanks! For now I'm disabling the warning. I think having some sort of documentation on Blackett would be beneficial for resolving this kind of problem. c.f. #6.
Copied from email thread:
In running MPI jobs with Open MPI 3, I got the following warnings. Is this normal, or was there a configuration problem?
From stderr:
From stdout: