CliMA / ClimateMachine.jl

Climate Machine: an Earth System Model that automatically learns from data
https://clima.github.io/ClimateMachine.jl/latest/

Parallel NetCDF support #2007

Open kpamnany opened 3 years ago

kpamnany commented 3 years ago

Description

We currently use NCDatasets.jl to write Diagnostics output. This package does not support parallel file I/O. We need to enhance it to do so, or develop a package to wrap PnetCDF.

ali-ramadhan commented 3 years ago

Reading the background section of PnetCDF, it sounds like there are two backend options for parallel I/O: PnetCDF and parallel HDF5?

As a side note, I'm wondering whether the abysmal performance of I/O, and particularly NetCDF I/O, for 1,000+ ranks (e.g. NCAR's CM1: https://www2.mmm.ucar.edu/people/bryan/cm1/pp.html) is inevitable?

Seems like each rank has a pretty small chunk, so maybe it's not a good benchmark to look at? I think they're doing parallel I/O into one file, so I interpreted it as communication + I/O costs dominating for 1,000+ ranks when your model is rather small. Maybe it won't be as bad with GPUs since you tend to use far fewer ranks anyway.
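
To make the "small chunk per rank" point concrete, here is a minimal sketch in plain Python. It assumes a hypothetical 1D block decomposition of a field across ranks; the function names and the problem size are made up for illustration and are not taken from CM1, ClimateMachine, or any NetCDF library. The `(start, count)` pair is the kind of hyperslab description each rank would need when all ranks write into one shared file.

```python
def block_range(n_total, n_ranks, rank):
    """Return (start, count) of the contiguous block owned by `rank`.

    Hypothetical helper: splits n_total values as evenly as possible,
    giving the first (n_total % n_ranks) ranks one extra value.
    """
    base, extra = divmod(n_total, n_ranks)
    count = base + (1 if rank < extra else 0)
    start = rank * base + min(rank, extra)
    return start, count


def bytes_per_rank(n_total, n_ranks, bytes_per_value=8):
    """Size in bytes of the largest per-rank block (rank 0 owns a largest one)."""
    _, count = block_range(n_total, n_ranks, 0)
    return count * bytes_per_value


if __name__ == "__main__":
    n_total = 1_000_000  # assumed number of Float64 values in one output field
    for n_ranks in (16, 128, 1024):
        print(n_ranks, "ranks ->", bytes_per_rank(n_total, n_ranks), "bytes per rank")
```

With a fixed problem size, the per-rank write shrinks in proportion to the rank count, so at 1,000+ ranks each write can be only a few kilobytes and fixed per-operation costs would plausibly dominate.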