Open grondo opened 10 months ago
I'm not sure if this is relevant, but on the Fedora page for mpich I found this:
This build also include support for using the 'module environment' to select which MPI implementation to use when multiple implementations are installed. If you want MPICH support to be automatically loaded, you need to install the mpich-autoload package.
I think we can reasonably close this now that we know this is caused by the upstream inclusion of the PSM3 device interacting badly with virtual network interfaces. Do you agree @grondo?
Just setting up a fedora39 builder for ci and everything works except for the MPI tests.
Details:
By default, even a singleton MPI hello test fails:
`flux run` also fails with the same error.
Googling turned up this link, https://github.com/open-mpi/ompi/issues/11295#issuecomment-1384539750, which suggests setting `PSM3_DEVICES=self,shm` and/or `PSM3_HAL=loopback`.
`PSM3_DEVICES=self,shm`, `PSM3_DEVICES=self`, and `PSM3_DEVICES=shm` all work for the hello test:
However, this is not sufficient to pass all the tests in the testsuite. For example
However, using `PSM3_HAL=loopback` allows all tests to pass.
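For anyone who wants to apply the workaround from a test driver rather than the shell, here is a rough sketch of how the environment could be set up. The helper names `psm3_workaround_env` and `run_with_workaround` are hypothetical and not part of flux or mpich. The only things taken from this thread are the `PSM3_HAL=loopback` and `PSM3_DEVICES=...` settings.

```python
import os
import subprocess


def psm3_workaround_env(base=None, hal="loopback", devices=None):
    """Return a copy of the environment with the PSM3 workaround applied.

    base    -- environment mapping to start from (defaults to os.environ)
    hal     -- value for PSM3_HAL; "loopback" is the setting reported above
               to let all tests pass
    devices -- optional value for PSM3_DEVICES, e.g. "self,shm"
    """
    env = dict(os.environ if base is None else base)
    env["PSM3_HAL"] = hal
    if devices is not None:
        env["PSM3_DEVICES"] = devices
    return env


def run_with_workaround(cmd, **kwargs):
    """Run cmd (a list of arguments) with the workaround environment.

    Hypothetical helper: returns the subprocess.CompletedProcess so the
    caller can inspect the return code.
    """
    return subprocess.run(cmd, env=psm3_workaround_env(**kwargs), check=False)
```

Something like `run_with_workaround(["flux", "run", "./hello"])` would then launch the test with `PSM3_HAL=loopback` set, where `./hello` is a placeholder for whatever MPI test binary is being run.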