ExCALIBUR-NEPTUNE / NESO-Spack

Spack repository for installing NESO components and dependencies.
MIT License

Notify users of common environment configuration. #4

Open will-saunders-ukaea opened 2 years ago

will-saunders-ukaea commented 2 years ago

Issue

Somewhere highly visible to users we should document the required environment configuration (or detect bad configurations), e.g.

For Intel MPI + Docker we know that something like

I_MPI_FABRICS=shm mpirun -n <N> <some-neso-exec>

is probably required. Another known bad launch pattern is starting ranks with none of the relevant control variables set:

# SYCL_DEVICE_TYPE, ONEAPI_DEVICE_SELECTOR and OMP_NUM_THREADS all unset
mpirun -n <N> <some-neso-exec>

All of these will (most likely) launch N MPI ranks that each try to use one thread per core, which leads to N-times over-subscription: on a 16-core node, for example, 16 ranks each spawning 16 threads puts 256 threads on 16 cores.
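On the "detect" side, one lightweight option would be a launch wrapper installed next to the executables. A minimal sketch, assuming a bash environment and treating OMP_NUM_THREADS as the main culprit (the wrapper and its check are hypothetical, not existing NESO tooling):

#!/usr/bin/env bash
# neso-mpirun (hypothetical): warn about likely over-subscription before
# delegating to the real mpirun. With OMP_NUM_THREADS unset, each rank
# defaults to one thread per core.
if [ -z "${OMP_NUM_THREADS:-}" ]; then
    echo "WARNING: OMP_NUM_THREADS is unset; launching N ranks per node" >&2
    echo "         will over-subscribe each core N times. Consider OMP_NUM_THREADS=1." >&2
fi
exec mpirun "$@"

Used as a drop-in replacement, e.g. neso-mpirun -n 4 <some-neso-exec>.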

Possible solutions

On install, print example launch commands, e.g. for hipsycl:

# mpich
OMP_NUM_THREADS=1 mpirun --bind-to core --map-by core -n <nproc> <neso-executable>

and for Intel:

ONEAPI_DEVICE_SELECTOR=host mpirun -n <nproc> <neso-executable>
# if using docker (can we detect this?)
I_MPI_FABRICS=shm ONEAPI_DEVICE_SELECTOR=host mpirun -n <nproc> <neso-executable>
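On the "can we detect this?" question: one common heuristic (imperfect, and an assumption rather than anything NESO does today) is to look for /.dockerenv or a docker entry in /proc/1/cgroup:

# Heuristic container detection; /.dockerenv is Docker-specific and the
# cgroup check assumes cgroup v1 paths, so this can miss other runtimes.
if [ -f /.dockerenv ] || grep -q docker /proc/1/cgroup 2>/dev/null; then
    export I_MPI_FABRICS=shm   # restrict Intel MPI to shared memory in-container
fi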

I expect that we will end up with a repository of example launch commands/configurations/submission scripts for different machines. These launch commands will become more complex for MPI+OpenMP (i.e. more than one thread per rank), as thread pinning/placement is often controlled through mpiexec and varies per implementation - some examples are sketched below.
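For illustration, untested sketches of a 4-rank x 4-thread hybrid launch, one per MPI implementation (the exact flags would need checking against the versions we ship):

# MPICH: bind each rank to a NUMA domain; its threads stay inside it
OMP_NUM_THREADS=4 mpirun --bind-to numa -n 4 <neso-executable>

# Open MPI: reserve 4 processing elements (cores) per rank
OMP_NUM_THREADS=4 mpirun --map-by socket:PE=4 --bind-to core -n 4 <neso-executable>

# Intel MPI: pin domains sized automatically from OMP_NUM_THREADS
OMP_NUM_THREADS=4 I_MPI_PIN_DOMAIN=omp mpirun -n 4 <neso-executable>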

As Nektar++ has no threading we can assume no threading for NESO itself. NESO-Particles can use threading - address this separately?

jwscook commented 2 years ago

A build-time warning about I_MPI_FABRICS has been added in 68bff4644259da52236a00ba0a7135ce4b0f8cdf in https://github.com/ExCALIBUR-NEPTUNE/NESO-Spack/pull/3