andersy005 opened this issue 5 years ago
This is amazingly fast progress @andersy005! Great job!
FYI, I got the same warning from h5py with my conda-derived setup.
> `python setup.py configure --mpi --hdf5=$CONDA_PREFIX`

Are you sure about this? Shouldn't `--hdf5` point to the hdf5 library you built in step 1, not the conda environment?
> Are you sure about this? Shouldn't `--hdf5` point to the hdf5 library you built in step 1, not the conda environment?
Not 100% sure. However, since I am specifying the installation prefix as `$CONDA_PREFIX` in

```
$ CC=`which mpicc` ./configure --enable-unsupported --enable-parallel --enable-threadsafe --with-pthread=/glade/u/apps/ch/os/lib64/ --prefix=$CONDA_PREFIX
```

the built library ends up in `$CONDA_PREFIX/lib`, and I assumed that this message:
```
abanihi@r3i1n25:~/devel/h5py> python setup.py configure --mpi --hdf5=$CONDA_PREFIX
running configure
Autodetected HDF5 1.10.4
********************************************************************************
Summary of the h5py configuration
Path to HDF5: '/glade/work/abanihi/softwares/miniconda3/envs/hdf5_zarr'
HDF5 Version: '1.10.4'
MPI Enabled: True
Rebuild Required: False
```
was a sign that the h5py install script was able to figure out the location of the built library.
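One way to double-check which HDF5 h5py actually picked up is to inspect its version info and MPI flag at import time; a minimal sketch, assuming h5py imports cleanly in this environment:

```python
import h5py

# Report the HDF5 version h5py was built against and whether MPI
# support was compiled in. If these don't match the library under
# $CONDA_PREFIX, the configure step picked up a different HDF5.
print("HDF5 version:", h5py.version.hdf5_version)
print("MPI enabled: ", h5py.get_config().mpi)
```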
The single-node benchmark script just terminated with errors. I am not sure how to proceed.
Yeah, so this is hard because you have to make sure that all your MPI libraries are compatible throughout your stack.

Maybe a red herring, but it looks from the line `../../ui/mpich/mpiexec.c:1147` that you are invoking mpiexec from `mpich`, yet you are using `impi` in your environment. You need to make sure the same MPI (ideally the one recommended for Cheyenne) is used everywhere.
Surely there is someone at NCAR who can help sort this out, no?
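In the meantime, a quick way to see which MPI mpi4py is actually running against (and catch an mpich/impi mix-up) is to ask the library itself; a minimal sketch, assuming mpi4py is importable and launched with the same mpiexec as the benchmark:

```python
from mpi4py import MPI

# Print the MPI implementation string reported by the runtime and the
# vendor mpi4py detected when it was built; a mismatch between the two
# points to exactly the kind of mpich-vs-impi confusion suspected above.
print(MPI.Get_library_version())
print("mpi4py vendor:", MPI.get_vendor())
```

Running this under the benchmark's own launcher (e.g. `mpiexec -n 1 python check_mpi.py`, with a hypothetical script name) makes the comparison meaningful.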
> `../../ui/mpich/mpiexec.c:1147`

I am glad you caught this one.

> Surely there is someone at NCAR who can help sort this out, no?

I will see if I can get input from the CISL help desk. I am now going through the documentation to see if there's any mention of recommended compilers and/or tips on how to keep MPI libraries compatible throughout one's software stack.
@rabernat, I finally got `hdf5` and `h5py` to build correctly with the right set of compilers. The results can be found here: https://github.com/andersy005/zarr_hdf_benchmarks/blob/master/plot_all_results-build-from-source.ipynb

I did not observe that much difference from your results with conda-based libraries. Let me know if there are other hypotheses worth testing and I will test them.

Sometime in March I will look into adding dask parallelism and compare performance.
We might want to look at building the necessary tools using the system libraries, not conda installs. It is unclear to me whether the conda MPI packages are built with MPI-IO support, which is needed to take advantage of the parallel filesystem. My intuition tells me that without MPI-IO support (in the MPI library), you will not see scaling across multiple nodes.
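One way to probe whether a given MPI build has working MPI-IO at all is to open and write a file through mpi4py's `MPI.File` interface; a rough sketch with a hypothetical file name, which should fail loudly if MPI-IO (ROMIO) support is missing:

```python
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Each rank writes a fixed-size block at its own offset via MPI-IO.
# If the MPI library was built without MPI-IO, MPI.File.Open raises.
data = np.full(1024, rank, dtype="i4")
fh = MPI.File.Open(comm, "mpiio_probe.bin",
                   MPI.MODE_CREATE | MPI.MODE_WRONLY)
fh.Write_at(rank * data.nbytes, data)
fh.Close()
```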
> We might want to look at building the necessary tools using the system libraries, not conda installs.

For this experiment, I did not use conda libraries. I built MPI-enabled hdf5, h5py, and mpi4py against system libraries. Are there other tools I should be aware of that need to be built from source against system libraries?
Ah! I missed the non-use of conda. Sorry.

That said, I would say that you should NOT be building hdf5. There is a pre-built hdf5 on the system that can be loaded with the `hdf5-mpi` module. I would try building with that... and try it with different system compilers (e.g., `module load intel`, `module load gnu`, `module load pgi`).

And I'm not sure if the h5py package really implements the parallel-hdf5 layer properly. I've tested it in the past and not seen it scale. So we should verify that the C library actually scales! (Perhaps first... just so we know that the underlying layers are actually working!)
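As a first sanity check of the layers underneath h5py, a minimal collective write through its `mpio` driver (hypothetical file name and sizes) shows whether parallel HDF5 works at all in a given environment; verifying that the C library itself scales would still need a lower-level benchmark:

```python
import numpy as np
import h5py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, nranks = comm.Get_rank(), comm.Get_size()

# Open the file collectively with the MPI-IO driver; each rank then
# writes its own row of a shared dataset on the parallel filesystem.
with h5py.File("parallel_probe.h5", "w", driver="mpio", comm=comm) as f:
    dset = f.create_dataset("data", (nranks, 1_000_000), dtype="f8")
    dset[rank, :] = np.random.default_rng(rank).random(1_000_000)
```

Timing a run like `mpiexec -n 4 python parallel_probe.py` at increasing rank counts gives a crude scaling signal before blaming h5py itself.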
> I did not observe that much difference from your results with conda-based libraries. Let me know if there are other hypotheses worth testing and I will test them.

That's not completely true! Look at the read and write performance for 72 cores and 4,000,000-byte chunks:

| operation | your version (native) | my version (conda) |
|---|---|---|
| hdf-read | 2112 | 3025 |
| zarr-read | 3531 | 221 |
| hdf-write | 5279 | 3273 |
| zarr-write | 3163 | 234 |
The zarr performance increased by more than 10x.
However, it is still the case that we don't observe good scaling with the number of cores.
At this point, I would definitely ask someone from Cheyenne for some feedback.
> The zarr performance increased by more than 10x.

@rabernat, can you speculate on what could be the reason for Zarr's performance increase? For zarr, I used a conda install.

> At this point, I would definitely ask someone from Cheyenne for some feedback.

@kmpaul, could you advise whom to contact for some feedback?
It must have something to do with mpi4py working better when built against the system Python. I added an MPI barrier at the end of each read block; this causes execution to pause until all ranks have reached that point. This might happen faster using the native MPI.
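A minimal sketch of what such a barrier can look like with mpi4py (the surrounding read logic here is hypothetical, not the benchmark's actual code):

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD

def read_block(dset, start, stop):
    # Hypothetical stand-in for the benchmark's per-block read.
    block = dset[start:stop]
    # Every rank waits here until all ranks have finished this block,
    # so the measured time reflects the slowest rank, not the local one.
    comm.Barrier()
    return block
```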
Hi all, sorry to wander in here uninvited.

Am just curious, have you raised any issues in conda-forge about your findings? I expect there are other people making use of MPI on different clusters who would be interested to hear what you learned and willing to work with you to address any performance problems you have identified.

Support for MPI in the `h5py` build is still very new, so I wouldn't be surprised if there are a few things that need to get ironed out.
As @rabernat pointed out here, there's a performance hit when using pre-built libraries from `conda` on Cheyenne. I started looking into this. My plan was as follows:
- [x] Build the MPI-enabled hdf5 (v1.10.4) library
- [x] Build mpi4py from source
- [x] Build h5py against the MPI-enabled hdf5 library
- [x] Attempt to run the benchmark scripts
The first step, building the MPI-enabled hdf5 (v1.10.4) library on Cheyenne, has proven to be a cumbersome process :). Getting the right combination of compilers wasn't a trivial task, but I was eventually able to build `hdf5` successfully with a particular set of modules. The `threadsafe` build option doesn't work right out of the box, so I had to use `--enable-unsupported` to allow building the high-level library. But then when importing `h5py`, I get a warning, and I am not sure about its ramifications when using `h5py`.

Ccing @kmpaul, @jukent, @jhamman