aekiss opened this issue 3 months ago
@aekiss - after all my bluster about how important the choice of "native chunking" on the raw output is, what do we know about the limitations (if any) on different models' ability to control chunking of output at run-time? Where do modellers have that control in, say, MOM6? Is that dependent on / limited by how the model tiling is set up?
A recent conversation I had with @dougiesquire mused about choosing a native chunking that facilitates easier rechunking later. One of the problems that comes up is if, for example, you have very large chunks and are forced to load most or all of the dataset into memory to rechunk it into another arrangement.
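To put a rough number on that problem, here is a minimal back-of-envelope sketch (not from the thread; the grid sizes and chunk shapes are hypothetical) of the worst-case working set needed to assemble one target chunk when rechunking, e.g. going from map-style chunks (full x/y slab per timestep) to timeseries-style chunks:

```python
from math import ceil, prod

def rechunk_working_set(shape, src_chunks, dst_chunks, itemsize=4):
    """Worst-case number of source chunks overlapping one destination
    chunk, and the bytes needed in memory to assemble it.
    Assumes possibly unaligned chunk boundaries (the pessimistic case)."""
    # Along each axis, a destination chunk of length d can straddle at
    # most ceil(d / s) + 1 source chunks of length s (capped at the
    # total number of source chunks on that axis).
    n_src = prod(min(ceil(d / s) + 1, ceil(n / s))
                 for n, s, d in zip(shape, src_chunks, dst_chunks))
    src_bytes = prod(src_chunks) * itemsize
    return n_src, n_src * src_bytes

# Hypothetical: one year of daily float32 surface fields, (time, y, x).
# Rechunk from one-timestep map chunks to 365-timestep point chunks:
shape = (365, 2700, 3600)
n, nbytes = rechunk_working_set(shape, (1, 2700, 3600), (365, 30, 30))
print(n, nbytes / 2**30)  # every time chunk is touched: ~the whole dataset
```

The point of the sketch: because every destination chunk spans all 365 time chunks, essentially the entire dataset must be resident (or re-read repeatedly) to rechunk, which is exactly the pitfall described above.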
That being said, I'm not clear what the current COSIMA native chunking is, or whether it would need or benefit from change (other products I've come across very much do).
Good questions.
In terms of output directly from the model components:
- MOM's chunking in x and y is set by `IO_LAYOUT`, which can differ from the tiling; there's 1 chunk in z and 1 chunk in time per file (I think).
- CICE6 has `history_chunksize` as of https://github.com/CICE-Consortium/CICE/pull/928, which controls chunking in x and y. There's no chunking in time, as CICE6 outputs a new file for each timestamp. @anton-seaice knows a lot more about it than me.

Model runs are broken into short segments to fit into queue limits (so segments are shortest at high resolution, e.g. a few months), so post-processing would be required to change the chunking in time.
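If the per-file x/y chunking does follow the IO domain decomposition, the resulting chunk extents are just the global grid divided by the layout. A small sketch of that arithmetic (the grid and layout numbers are made up for illustration, not taken from any COSIMA configuration):

```python
from math import ceil

def io_layout_chunks(nx, ny, layout_x, layout_y):
    """Approximate per-chunk x/y extents if output chunking follows the
    IO domain decomposition (e.g. MOM's IO_LAYOUT). Edge domains may
    differ slightly when the grid doesn't divide evenly."""
    return ceil(nx / layout_x), ceil(ny / layout_y)

# Hypothetical quarter-degree global grid written through a 4x4 IO layout:
print(io_layout_chunks(1440, 1080, 4, 4))  # -> (360, 270)
```

This is why the runtime layout choice leaks into analysis: a coarser IO layout means bigger x/y chunks on disk.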
The other consideration is the impact of chunking on IO performance of the model itself (which can become a bottleneck at high resolution). There's a lot of discussion of this in https://gmd.copernicus.org/articles/13/1885/2020/
It would be nice if there was a compromise that worked well both for runtime performance and analysis, but maybe these are incompatible and raw model outputs would require post-processing to suit analysis.
I believe MOM chunk sizes are set in the FMS namelist:

```
&fms2_io_nml
ncchksz = 4194304
...
```

which is 4 MB. I think part of the goal in keeping that size quite small is to avoid splitting the chunks during analysis as much as practical (and some other reason about cache sizes, I guess?)
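For scale, a quick sanity check on what a 4 MB buffer holds (simple arithmetic, not a claim about how FMS actually shapes its chunks):

```python
from math import isqrt

ncchksz = 4_194_304        # value from fms2_io_nml above, in bytes (4 MiB)
n_float32 = ncchksz // 4   # 4-byte values that fit in that buffer
side = isqrt(n_float32)    # side length of a square x/y tile of that size
print(n_float32, side)     # 1048576 values, i.e. a 1024x1024 float32 tile
```

So a 4 MB unit corresponds to roughly a 1024x1024 single-precision tile, which is small enough for most analysis slicing patterns not to straddle many of them.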
It's hard to imagine model output having a chunk size in time of anything other than 1: writing larger time chunks would mean either buffering multiple timesteps in memory before writing, or rewriting chunks on disk at each output step.
So I think it's a question of how much extra time we're willing to spend running the model vs. how much extra time is spent in analysis.
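The analysis side of that trade-off can be quantified with a simple model (hypothetical numbers, assuming float32 data and a time chunk size of 1): extracting a timeseries at a single point forces one chunk read and decompression per timestep.

```python
from math import prod

def point_timeseries_cost(nt, chunk_shape, itemsize=4):
    """Chunks touched, and bytes read/decompressed, to extract a full
    timeseries at one (y, x) point when the time chunk size is 1.
    chunk_shape = (1, ychunk, xchunk)."""
    chunk_bytes = prod(chunk_shape) * itemsize
    return nt, nt * chunk_bytes

# Hypothetical: 10 years of daily output with 300x300 spatial chunks.
n_chunks, nbytes = point_timeseries_cost(3650, (1, 300, 300))
print(n_chunks, nbytes / 2**30)  # 3650 chunk reads, ~1.2 GiB moved
```

About 1.2 GiB is read and decompressed to deliver ~14.6 kB of useful data, which is the kind of read amplification that post-processed timeseries-friendly chunking avoids.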
I think that is a poorly-named parameter that refers only to the internal library chunking (and maybe even only for NetCDF classic files, rather than the HDF5-backed NetCDF4 files). The per-dimension chunking is defined in netCDF `def_var` calls, and needs an array of chunk sizes, rather than being figured out from an overall chunk size. I think it is indeed the case that it depends on the IO_LAYOUT in the case of diagnostic output.
Thanks Angus! We might need to revisit `ncchksz` (which is more of a cache size) when we tune the `IO_LAYOUT`. And it makes sense that the chunk size is related to `IO_LAYOUT` in x/y.
Following from @Thomas-Moore-Creative's talk today, we should think about the NetCDF chunking we use to write to disk, so that the native chunking is OK for typical workflows.
Note that in a compressed, chunked NetCDF file, accessing any data in a chunk requires reading and uncompressing the whole chunk. That can be a pitfall if the chunking doesn't match the access pattern, e.g. chunks that are too big in the wrong dimensions. We had that problem with the ERA5 forcing in ACCESS-OM2: https://github.com/COSIMA/access-om2/issues/242
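The whole-chunk-read behaviour is easy to reason about by counting which chunks a selection straddles. A small sketch (the chunk shape below is illustrative of an ERA5-like monthly-time-chunked file, not the actual layout from that issue):

```python
from math import prod

def chunks_touched(sel_start, sel_stop, chunk_shape):
    """Number of chunks that must be read and decompressed to satisfy a
    slab selection [start, stop) along each axis of a chunked array."""
    return prod(
        (stop - 1) // c - start // c + 1
        for start, stop, c in zip(sel_start, sel_stop, chunk_shape)
    )

# Hypothetical (time, lat, lon) field chunked (744, 721, 1440), i.e. a
# whole month of hourly steps per chunk. Reading a single timestep:
print(chunks_touched((0, 0, 0), (1, 721, 1440), (744, 721, 1440)))  # -> 1
# ...only 1 chunk, but it holds 744 time levels, so the library must
# decompress ~744x more data than the request actually needs.
```

With chunks that are too big in time, even a one-timestep read pays for the whole chunk, which is precisely the ACCESS-OM2 forcing problem described above.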
Maybe we should set up a discussion/poll on the forum?
Related: