aekiss opened this issue 3 months ago
@aekiss - after all my bluster about how important the choice of "native chunking" on the raw output is, what do we know about the limitations (if any) on different models' ability to control chunking of output at run-time? Where do modellers have that control in, say, MOM6? Is that dependent on / limited by how the model tiling is set up?
A recent conversation I had with @dougiesquire mused about choosing a native chunking that facilitates easier rechunking later. One of the problems that comes up is if, for example, you have very large chunks and are forced to load most or all of the dataset into memory to rechunk it into another arrangement.
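To put a rough number on that problem, here is a minimal back-of-envelope sketch (not from the thread; the grid sizes and chunk shapes are hypothetical) of the worst-case working set needed to assemble one target chunk when rechunking, e.g. going from map-style chunks (full x/y slab per timestep) to timeseries-style chunks:

```python
from math import ceil, prod

def rechunk_working_set(shape, src_chunks, dst_chunks, itemsize=4):
    """Worst-case number of source chunks overlapping one destination
    chunk, and the bytes needed in memory to assemble it.
    Assumes possibly unaligned chunk boundaries (the pessimistic case)."""
    # Along each axis, a destination chunk of length d can straddle at
    # most ceil(d / s) + 1 source chunks of length s (capped at the
    # total number of source chunks on that axis).
    n_src = prod(min(ceil(d / s) + 1, ceil(n / s))
                 for n, s, d in zip(shape, src_chunks, dst_chunks))
    src_bytes = prod(src_chunks) * itemsize
    return n_src, n_src * src_bytes

# Hypothetical: one year of daily float32 surface fields, (time, y, x).
# Rechunk from one-timestep map chunks to 365-timestep point chunks:
shape = (365, 2700, 3600)
n, nbytes = rechunk_working_set(shape, (1, 2700, 3600), (365, 30, 30))
print(n, nbytes / 2**30)  # every time chunk is touched: ~the whole dataset
```

The point of the sketch: because every destination chunk spans all 365 time chunks, essentially the entire dataset must be resident (or re-read repeatedly) to rechunk, which is exactly the pitfall described above.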
That being said, I'm not clear what the current COSIMA native chunking is, or whether it would need or benefit from change (other products I've come across very much do).
Good questions.
In terms of output directly from the model components:
- MOM's chunking in x and y is set by `IO_LAYOUT`, which can differ from the tiling; there's 1 chunk in z and 1 chunk in time per file (I think).
- CICE6 has `history_chunksize` as of https://github.com/CICE-Consortium/CICE/pull/928, which controls chunking in x and y. There's no chunking in time, as CICE6 outputs a new file for each timestamp. @anton-seaice knows a lot more about it than me.

Model runs are broken into short segments to fit into queue limits (so segments are shortest at high resolution, e.g. a few months), so post-processing would be required to change the chunking in time.
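If the per-file x/y chunking does follow the IO domain decomposition, the resulting chunk extents are just the global grid divided by the layout. A small sketch of that arithmetic (the grid and layout numbers are made up for illustration, not taken from any COSIMA configuration):

```python
from math import ceil

def io_layout_chunks(nx, ny, layout_x, layout_y):
    """Approximate per-chunk x/y extents if output chunking follows the
    IO domain decomposition (e.g. MOM's IO_LAYOUT). Edge domains may
    differ slightly when the grid doesn't divide evenly."""
    return ceil(nx / layout_x), ceil(ny / layout_y)

# Hypothetical quarter-degree global grid written through a 4x4 IO layout:
print(io_layout_chunks(1440, 1080, 4, 4))  # -> (360, 270)
```

This is why the runtime layout choice leaks into analysis: a coarser IO layout means bigger x/y chunks on disk.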
The other consideration is the impact of chunking on IO performance of the model itself (which can become a bottleneck at high resolution). There's a lot of discussion of this in https://gmd.copernicus.org/articles/13/1885/2020/
It would be nice if there was a compromise that worked well both for runtime performance and analysis, but maybe these are incompatible and raw model outputs would require post-processing to suit analysis.
I believe MOM chunk sizes are set in the FMS namelist:

```
&fms2_io_nml
ncchksz = 4194304
...
```

which is 4 MB. I think part of the goal in keeping that size quite small is to avoid splitting the chunks during analysis as much as practical (and some other reason about cache sizes, I guess?)
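For scale, a quick sanity check on what a 4 MB buffer holds (simple arithmetic, not a claim about how FMS actually shapes its chunks):

```python
from math import isqrt

ncchksz = 4_194_304        # value from fms2_io_nml above, in bytes (4 MiB)
n_float32 = ncchksz // 4   # 4-byte values that fit in that buffer
side = isqrt(n_float32)    # side length of a square x/y tile of that size
print(n_float32, side)     # 1048576 values, i.e. a 1024x1024 float32 tile
```

So a 4 MB unit corresponds to roughly a 1024x1024 single-precision tile, which is small enough for most analysis slicing patterns not to straddle many of them.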
It's hard to imagine model output having a chunk size in time of anything other than 1: writing larger time chunks would mean either buffering multiple timesteps in memory before writing, or rewriting chunks on disk at each output step.
So I think it's a question of how much extra time we're willing to spend running the model vs. how much extra time is spent in analysis.
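The analysis side of that trade-off can be quantified with a simple model (hypothetical numbers, assuming float32 data and a time chunk size of 1): extracting a timeseries at a single point forces one chunk read and decompression per timestep.

```python
from math import prod

def point_timeseries_cost(nt, chunk_shape, itemsize=4):
    """Chunks touched, and bytes read/decompressed, to extract a full
    timeseries at one (y, x) point when the time chunk size is 1.
    chunk_shape = (1, ychunk, xchunk)."""
    chunk_bytes = prod(chunk_shape) * itemsize
    return nt, nt * chunk_bytes

# Hypothetical: 10 years of daily output with 300x300 spatial chunks.
n_chunks, nbytes = point_timeseries_cost(3650, (1, 300, 300))
print(n_chunks, nbytes / 2**30)  # 3650 chunk reads, ~1.2 GiB moved
```

About 1.2 GiB is read and decompressed to deliver ~14.6 kB of useful data, which is the kind of read amplification that post-processed timeseries-friendly chunking avoids.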
I think that is a poorly-named parameter that refers only to the internal library chunking (and maybe even only for NetCDF classic files, rather than the HDF5-backed NetCDF4 files). The per-dimension chunking is defined in netCDF `def_var` calls, and needs an array of chunk sizes, rather than being figured out from an overall chunk size. I think it is indeed the case that it depends on the IO_LAYOUT in the case of diagnostic output.
Thanks Angus! We might need to revisit `ncchksz` (which is more of a cache size) when we tune the `IO_LAYOUT`. And it makes sense that the chunk size is related to `IO_LAYOUT` in x/y.
Following from @Thomas-Moore-Creative's talk today, we should think about the NetCDF chunking we use to write to disk, so that the native chunking is OK for typical workflows.
Note that in a compressed, chunked NetCDF file, accessing any data in a chunk requires reading and uncompressing the whole chunk. That can be a pitfall if the chunking doesn't match the access pattern, e.g. chunks that are too big in the wrong dimensions. We had that problem with the ERA5 forcing in ACCESS-OM2: https://github.com/COSIMA/access-om2/issues/242
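The whole-chunk-read behaviour is easy to reason about by counting which chunks a selection straddles. A small sketch (the chunk shape below is illustrative of an ERA5-like monthly-time-chunked file, not the actual layout from that issue):

```python
from math import prod

def chunks_touched(sel_start, sel_stop, chunk_shape):
    """Number of chunks that must be read and decompressed to satisfy a
    slab selection [start, stop) along each axis of a chunked array."""
    return prod(
        (stop - 1) // c - start // c + 1
        for start, stop, c in zip(sel_start, sel_stop, chunk_shape)
    )

# Hypothetical (time, lat, lon) field chunked (744, 721, 1440), i.e. a
# whole month of hourly steps per chunk. Reading a single timestep:
print(chunks_touched((0, 0, 0), (1, 721, 1440), (744, 721, 1440)))  # -> 1
# ...only 1 chunk, but it holds 744 time levels, so the library must
# decompress ~744x more data than the request actually needs.
```

With chunks that are too big in time, even a one-timestep read pays for the whole chunk, which is precisely the ACCESS-OM2 forcing problem described above.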
Maybe we should set up a discussion/poll on the forum?
Related: