idaholab / moose

Multiphysics Object Oriented Simulation Environment
https://www.mooseframework.org
GNU Lesser General Public License v2.1

Introduce OpenMPI Conda Packages #26839

Closed: milljm closed this 1 month ago

milljm commented 6 months ago

Reason

MPICH 4.1.x seems to be a no-go on Apple Silicon. When we try to bump MPICH from 4.0.2 (working) to anything newer (4.1.x or higher), we see random, and sometimes not-so-random, hangs when running MOOSE-based applications. I am not exactly sure what causes the hangs, but we believe they occur in MUMPS.

Design

Add OpenMPI as a possible solution, since core MOOSE developers are not sure how to make MPICH work. Whether or not this gets accepted, I want a PR Conda channel to play with.

We will need to figure out how versioner can begin tracking another wrapper, and everything that might entail.

Possibly create a moose-mpi package instead, allowing for variants: conda install moose-dev openmpi would get you the OpenMPI stack variant, while conda install moose-dev mpich would get you MPICH.

The default (conda install moose-dev) would end up being whatever latest packages Conda finds (eventually resulting in only OpenMPI).

The variant idea is partially working when using the custom channel:

conda config --add channels https://conda.software.inl.gov/moose/dualmpi

Apple Silicon only at the moment.
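
As a rough sketch of how the variants described above might be tried out: the snippet below only prints the Conda commands rather than running them, so it is safe to execute anywhere. It assumes the dualmpi channel is still being served and that a variant is selected by naming the MPI package next to moose-dev, as proposed above. The helper name moose_install_cmd is made up for illustration and is not part of MOOSE or Conda.

```shell
#!/usr/bin/env bash
# Sketch only: prints the conda commands instead of running them.
# CHANNEL is the PR channel quoted above.
CHANNEL="https://conda.software.inl.gov/moose/dualmpi"

# Hypothetical helper (not part of MOOSE or Conda).
# $1: MPI variant ("openmpi", "mpich", or empty for the default).
moose_install_cmd() {
  case "${1:-}" in
    openmpi|mpich) echo "conda install moose-dev $1" ;;
    "")            echo "conda install moose-dev" ;;
    *)             echo "unknown MPI variant: $1" >&2; return 1 ;;
  esac
}

echo "conda config --add channels $CHANNEL"
moose_install_cmd openmpi   # OpenMPI stack variant
moose_install_cmd mpich     # MPICH stack variant
moose_install_cmd           # default: whatever Conda resolves
```

To actually install a variant, copy one of the printed lines into a fresh Conda environment. Anything other than openmpi or mpich is rejected with a non-zero exit status, which keeps typos from silently falling through to the default.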

Impact

Switch MPI Wrapper from MPICH to OpenMPI

lindsayad commented 6 months ago

The hangs are definitely in MUMPS

milljm commented 6 months ago

I have my work cut out for me... At a glance, anyway, this change will require a substantial rewrite of our versioner.py tool.

milljm commented 6 months ago

Now to mess with Civet recipes live...