Open tatarsky opened 9 years ago
I take it from the silence that nobody uses the OpenMPI modules on the Hal cluster. To be safe, I will add a module for the latest version as a non-default. But if I do not see a "Yes, I use this module" within a week or so, I will remove the older modules.
Does this support interest anyone? https://www.open-mpi.org/faq/?category=buildcuda
My group uses MPI installed in user space via the conda
package manager instead of the root-installed MPI modules:
I do agree that having a functional MPI module would be a good idea. I have traditionally used MPICH, but there shouldn't be a big difference between the two except in the way they are executed (which we would probably want to briefly document).
I don't know of any software currently in use that utilizes the CUDA-aware MPI features you mention, though this looks interesting.
I'll build it for kicks. I don't think anyone is using OpenMPI from these modules.
If there is still interest in a newer OpenMPI module: We actually need a version > 1.10 for running CNTK. Best, Thomas
We use conda to install our own MPICH, though I think it may also have openmpi.
I was just compiling my own, but conda seems comfortable. Naive questions: By using conda would the install be restricted to Python or is it a general setup-tool?
Conda is mainly intended as a powerful Python package distribution system, but a bunch of useful things like MPICH that are Python-independent can also be installed this way. Since we use tons of conda-installable Python tools, installing MPICH in user space via conda is easy for our workflows.
Check out miniconda:
http://conda.pydata.org/miniconda.html
then, after adding it to your PATH,
conda install mpich
or
conda install openmpi
Note that you have to compile your MPI code against the same MPI library that conda installs, I think.
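One way to sanity-check the point above is to confirm that the compiler wrapper and the launcher found on your PATH come from the same installation. The following is only a rough sketch: the helper names are made up for illustration, and it assumes the usual `<prefix>/bin/<name>` layout and that the wrapper is called `mpicc` and the launcher `mpiexec`, which may not hold for every MPI build.

```python
import os
import shutil


def install_prefix(executable_path):
    """Return the assumed install prefix for an executable at <prefix>/bin/<name>.

    Symlinks are not resolved; only the path as given is inspected.
    """
    absolute = os.path.abspath(executable_path)
    return os.path.dirname(os.path.dirname(absolute))


def same_mpi_install(compiler_path, launcher_path):
    """True if both executables sit under the same install prefix."""
    return install_prefix(compiler_path) == install_prefix(launcher_path)


def check_mpi_on_path(compiler="mpicc", launcher="mpiexec"):
    """Look both tools up on PATH and compare their prefixes.

    Returns None if either tool is missing, otherwise True or False.
    """
    compiler_path = shutil.which(compiler)
    launcher_path = shutil.which(launcher)
    if compiler_path is None or launcher_path is None:
        return None
    return same_mpi_install(compiler_path, launcher_path)


if __name__ == "__main__":
    result = check_mpi_on_path()
    if result is None:
        print("mpicc and/or mpiexec not found on PATH")
    elif result:
        print("mpicc and mpiexec come from the same prefix")
    else:
        print("mpicc and mpiexec come from different prefixes; check your PATH")
```

If the two prefixes differ, for example because a cluster module and a conda install are both on PATH, the code may be compiled against one MPI and launched with another.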
Sounds great. I will give it a try. Thanks a lot John!
There are two OpenMPI modules on the system alongside the MPICH2 stack.
I believe most folks actually use the MPICH2 module for MPI or have their own OpenMPI build.
However, I would like to at least provide a recent module (stable is v1.10.0) for OpenMPI and then REMOVE these two dated modules from the system.
Does anyone care? And if you care because you are using the above modules, are you willing to work with me on testing said new module?
I will not alter either module during said efforts.