SciTools / iris

A powerful, format-agnostic, and community-driven Python package for analysing and visualising Earth science data
https://scitools-iris.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License

Parallel NetCDF input output #2566

Closed: carlestena closed this issue 6 years ago

carlestena commented 7 years ago

I think that Iris has no parallel implementation for reading or writing NetCDF files. HDF5 supports MPI parallelization, and it could be a great idea to implement this in the Iris project.

marqh commented 7 years ago

Hello @carlestena

I think that there are options in the way an HDF5 library is built that control this.

The testing uses an HDF5 build which is thread-safe, and some parallelisation takes place. I don't think there are any tests which use MPI parallelisation.

I wonder whether this is an option that would be enabled in the HDF5 build, which libraries such as Iris would then just use.
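As a rough illustration of the "build option" point, here is a minimal sketch that asks h5py whether it was built with MPI support. It is only an assumption that this is a useful proxy: it reports on the HDF5 library that h5py links against, which is not necessarily the one that Iris reaches through its NetCDF stack. The helper name `hdf5_mpi_enabled` is made up for this example.

```python
def hdf5_mpi_enabled():
    """Return True/False for h5py's MPI build flag, or None if h5py is not installed."""
    try:
        import h5py
    except ImportError:
        # h5py is not available, so nothing can be said about its build.
        return None
    # get_config().mpi is True only when h5py was compiled against parallel HDF5.
    return bool(h5py.get_config().mpi)


print("h5py built with MPI support:", hdf5_mpi_enabled())
```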

Do you think that there is a part of the Iris API which should be able to control this, if it is available? It is not clear to me that this would be the case.

The h5py documentation seems to me to suggest using mpi4py and h5py together. Perhaps a similar pattern could be used to get the benefit without requiring any change to the Iris library: http://docs.h5py.org/en/latest/mpi.html
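For reference, the mpi4py plus h5py pattern looks roughly like the sketch below. It works on a plain HDF5 file rather than going through Iris, and it assumes an MPI-enabled build of both HDF5 and h5py. The function names, the dataset name `"data"`, and the even block split across ranks are illustrative choices for this example, not anything Iris provides.

```python
def block_bounds(n_items, n_ranks, rank):
    """Return the half-open [start, stop) range of items owned by `rank`."""
    base, extra = divmod(n_items, n_ranks)
    # The first `extra` ranks each take one additional item.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def write_in_parallel(path, n_items):
    """Each MPI rank writes its own block of one shared dataset."""
    # Imported here so block_bounds stays usable without MPI installed.
    from mpi4py import MPI
    import h5py
    import numpy as np

    comm = MPI.COMM_WORLD
    start, stop = block_bounds(n_items, comm.size, comm.rank)

    # driver="mpio" requires h5py built against parallel HDF5.
    with h5py.File(path, "w", driver="mpio", comm=comm) as f:
        # Dataset creation is collective: every rank makes the same call.
        dset = f.create_dataset("data", (n_items,), dtype="f8")
        # Each rank fills only its own slice with its rank number.
        dset[start:stop] = np.full(stop - start, comm.rank, dtype="f8")
```

A script that calls `write_in_parallel("parallel_test.hdf5", 1000)` would be launched under MPI, for example with `mpiexec -n 4 python write_demo.py` (the script name is hypothetical).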