JuliaIO / HDF5.jl

Save and load data in the HDF5 file format from Julia
https://juliaio.github.io/HDF5.jl
MIT License

Crash with system-provided OpenMPI and HDF5_jll v1.14 #1079

Open mfsch opened 1 year ago

mfsch commented 1 year ago

When I set up a simple project with the latest MPI and HDF5 packages and configure it to use the system-provided OpenMPI installation, the call to MPI.Init() crashes with “orte_init failed” errors. I am observing this issue on both Ubuntu 18.04 (OpenMPI 3.1.2) and 20.04 (OpenMPI 4.0.3). Downgrading to HDF5_jll v1.12 fixes the issue.

Steps to reproduce:

On Ubuntu 18.04, the error includes the line mca_base_component_repository_open: unable to open mca_pmix_pmix3x: /home/user/.julia/artifacts/f9744710560ba3ddc00cd9df62ac7dfcd18c8649/lib/openmpi/mca_pmix_pmix3x.so: undefined symbol: opal_envar_t_class, in case this is helpful.

simonbyrne commented 1 year ago

Ah, I've seen something similar! The problem appears to be that we're opening two different MPI libraries: the system one (from MPI.jl) and the JLL one (from HDF5_jll).
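One way to check whether two MPI libraries really are loaded into the same process is to list the loaded shared libraries and filter for MPI-looking names. This is only a rough sketch: `loaded_mpi_libraries` is a hypothetical helper (not part of MPI.jl or HDF5.jl), and matching on a `libmpi` file-name prefix is an assumption that may miss or over-match libraries on some systems.

```julia
using Libdl

# Hypothetical helper: return the paths of loaded shared libraries whose
# file name starts with "libmpi". By default it inspects the current process.
function loaded_mpi_libraries(libs = Libdl.dllist())
    return filter(p -> occursin(r"^libmpi", basename(p)), libs)
end

# Typical use (requires MPI.jl and HDF5.jl in the active project):
#   using MPI, HDF5
#   foreach(println, loaded_mpi_libraries())
```

If the output shows one path under a system directory and another under `~/.julia/artifacts`, that would be consistent with the diagnosis above.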

Easy workarounds:

In the longer term we need a better fix. @giordano @eschnett any suggestions on how we can deal with this?

giordano commented 1 year ago

I thought HDF5_jll.jl would use the MPI library chosen by MPIPreferences.jl

simonbyrne commented 1 year ago

Yeah, I don't quite get why it's pulling in OpenMPI_jll?

simonbyrne commented 1 year ago

Ah, I see.

It augments based on the value of the MPI abi: https://github.com/JuliaBinaryWrappers/HDF5_jll.jl/blob/b96de8ada558f8d70e27b5561d4f5df815b01ebf/.pkg/platform_augmentation.jl#L13

But the augmentation for abi = "openmpi" always loads OpenMPI_jll: https://github.com/JuliaBinaryWrappers/HDF5_jll.jl/blob/main/src/wrappers/x86_64-linux-gnu-libgfortran5-cxx03-mpi%2Bopenmpi.jl#L9

eschnett commented 1 year ago

My approach, of course, would be to use the Julia-provided MPItrampoline as the MPI implementation, and to use the system MPI via MPItrampoline...

JoshuaLampert commented 9 months ago

Would it be possible to print a warning if a system-provided MPI installation is detected but no system-provided HDF5 is?
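A rough sketch of what such a check might look like, purely for illustration. `mixed_mpi_hdf5_warning` is a hypothetical helper, not an existing function in HDF5.jl or MPI.jl; it assumes the caller passes in the MPIPreferences binary setting (e.g. "system") and the path of the libhdf5 that was loaded, and it assumes that a path containing an `artifacts` directory means the library comes from a JLL package.

```julia
# Hypothetical helper, not part of HDF5.jl or MPI.jl.
# `mpi_binary`: the MPIPreferences binary setting (assumed to be "system"
#               when a system MPI is configured).
# `libhdf5_path`: path of the libhdf5 shared library in use.
function mixed_mpi_hdf5_warning(mpi_binary::AbstractString, libhdf5_path::AbstractString)
    # Assumption: a libhdf5 under an `artifacts` directory comes from a JLL package.
    from_jll = "artifacts" in splitpath(libhdf5_path)
    if mpi_binary == "system" && from_jll
        return "System MPI is configured, but libhdf5 is loaded from a JLL artifact " *
               "($(libhdf5_path)); MPI.Init() may fail."
    end
    return nothing
end

# Example: emit the warning only when the combination looks problematic.
msg = mixed_mpi_hdf5_warning("system", "/home/user/.julia/artifacts/abc/lib/libhdf5.so")
msg === nothing || @warn msg
```

Returning the message instead of logging directly keeps the check easy to test; where such a check would actually live (HDF5.jl, HDF5_jll, or MPI.jl) is left open here.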