NCAS-CMS / cf-python

A CF-compliant Earth Science data analysis library
http://ncas-cms.github.io/cf-python
MIT License

Clarity in documentation and code around chunksizes #779

Open bnlawrence opened 3 months ago

bnlawrence commented 3 months ago
  1. If you search the documentation for "chunksizes", the first results are mostly about Dask chunk sizes. That can be confusing if you are interested in variable chunksizes.
  1. The method `v.data.set_nc_hdf5_chunksizes` actually sets the chunk shape, not the chunk volume/size.

The word "size" is clearly about the volume, as can be seen from the netCDF documentation, for example:

https://docs.unidata.ucar.edu/nug/current/netcdf_perf_chunking.html says "Currently the netCDF default chunk size is 4MiB. The current default chunking strategy of the netCDF library is to balance access time along any of a variable's dimensions, by using chunk shapes similar to the shape of the entire variable but small enough that the resulting chunk size is less than or equal to the default chunk size."
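
To make the shape/size distinction concrete, here is a minimal sketch in plain Python. It does not call cf-python or netCDF at all. The helper `chunk_size_bytes`, the example chunk shape and the 8-byte item size are all made up for illustration; only the 4 MiB figure comes from the quote above.

```python
from math import prod

# 4 MiB, the netCDF default chunk size quoted above
DEFAULT_CHUNK_SIZE = 4 * 1024 * 1024


def chunk_size_bytes(chunk_shape, itemsize):
    """Return the chunk *size* (volume in bytes) implied by a chunk *shape*.

    Hypothetical helper for illustration only, not part of cf-python.
    """
    return prod(chunk_shape) * itemsize


# Assumed example: a chunk shape for a (time, lat, lon) variable
chunk_shape = (1, 324, 432)  # number of elements along each dimension
itemsize = 8                 # bytes per element, e.g. 64-bit floats

size = chunk_size_bytes(chunk_shape, itemsize)
fits_default = size <= DEFAULT_CHUNK_SIZE

print(f"chunk shape: {chunk_shape}")
print(f"chunk size:  {size} bytes")
print(f"within 4 MiB default: {fits_default}")
```

The shape is a tuple with one entry per dimension, whereas the size is a single number of bytes. That is why a method that takes a tuple like the one above reads more naturally as setting a chunk shape.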