Closed ChrisBarker-NOAA closed 2 years ago
This is complete with the latest available versions of model_catalogs and LibGOODS.
The chunksizes are for the most part intelligently chosen, example from TBOFS:
```
output git:(rotating) ncdump -h TBOFS_nowcast_20220823-20220824.nc | grep _Chunk
    zeta:_ChunkSizes = 1, 290, 176 ;
    ocean_time:_ChunkSizes = 512 ;
    wetdry_mask_psi:_ChunkSizes = 1, 289, 175 ;
    wetdry_mask_rho:_ChunkSizes = 1, 290, 176 ;
    wetdry_mask_u:_ChunkSizes = 1, 290, 175 ;
    wetdry_mask_v:_ChunkSizes = 1, 289, 176 ;
    u:_ChunkSizes = 1, 11, 290, 175 ;
    v:_ChunkSizes = 1, 11, 289, 176 ;
    temp:_ChunkSizes = 1, 11, 290, 176 ;
    salt:_ChunkSizes = 1, 11, 290, 176 ;
    Uwind:_ChunkSizes = 1, 290, 176 ;
    Vwind:_ChunkSizes = 1, 290, 176 ;
```
Dimension order is, for the most part, correct. The dimension order is specified by the provider, and so far it has been correct (t, z, y, x).
Compression: this could be a PhD-thesis-level discussion. We don't currently support lossy compression schemes, and several of the newer, fancier compression options are available only to netCDF clients built against relatively recent versions of the HDF5 and netCDF4 libraries. xarray supports compression options.
I'm trying to find the default compression settings; there may be a follow-up ticket to this one if explicit lossless compression schemes are desired and supported by clients. In our case, we want the "best" lossless compression that is supported by the netCDF4 package as delivered by conda-forge -- I have no idea what those options might be.
But
https://unidata.github.io/netcdf4-python/#efficient-compression-of-netcdf-variables
indicates that zlib compression is always available, so maybe use that in any case?
We could also consider truncating some of the data for better compression, if that makes sense for any of our variables -- that would take some thought, probably for another day.
This is here as a reminder:
We should make sure that the netCDF files created are reasonably optimal for use with GNOME. That means: