AusClimateService / indices

Code for calculating climate indices

Cannot run on GCMs NorESM2-MM and CMCC-ESM2 #10

ngben closed this issue 1 year ago

ngben commented 1 year ago

I've started running icclim on the GCMs being downscaled and have run into issues with NorESM2-MM and CMCC-ESM2. It looks like it's the same error for the two models. I have successfully run icclim on the other GCMs and will be adding these to the ia39 test-dir shortly.

CMCC-ESM2:

```
Traceback (most recent call last):
  File "/g/data/xv83/bxn599/ACS/icclim/run_icclim.py", line 536, in <module>
    main(args)
  File "/g/data/xv83/bxn599/ACS/icclim/run_icclim.py", line 365, in main
    ds, cf_var = read_data(
  File "/g/data/xv83/bxn599/ACS/icclim/run_icclim.py", line 283, in read_data
    ds = xr.open_mfdataset(infiles, chunks='auto', mask_and_scale=True)
  File "/g/data/xv83/dbi599/miniconda3/envs/icclim/lib/python3.10/site-packages/xarray/backends/api.py", line 996, in open_mfdataset
    datasets = [open_(p, **open_kwargs) for p in paths]
  File "/g/data/xv83/dbi599/miniconda3/envs/icclim/lib/python3.10/site-packages/xarray/backends/api.py", line 996, in <listcomp>
    datasets = [open_(p, **open_kwargs) for p in paths]
  File "/g/data/xv83/dbi599/miniconda3/envs/icclim/lib/python3.10/site-packages/xarray/backends/api.py", line 545, in open_dataset
    ds = _dataset_from_backend_dataset(
  File "/g/data/xv83/dbi599/miniconda3/envs/icclim/lib/python3.10/site-packages/xarray/backends/api.py", line 357, in _dataset_from_backend_dataset
    ds = _chunk_ds(
  File "/g/data/xv83/dbi599/miniconda3/envs/icclim/lib/python3.10/site-packages/xarray/backends/api.py", line 325, in _chunk_ds
    var_chunks = _get_chunk(var, chunks)
  File "/g/data/xv83/dbi599/miniconda3/envs/icclim/lib/python3.10/site-packages/xarray/core/dataset.py", line 220, in _get_chunk
    chunk_shape = da.core.normalize_chunks(
  File "/g/data/xv83/dbi599/miniconda3/envs/icclim/lib/python3.10/site-packages/dask/array/core.py", line 3073, in normalize_chunks
    chunks = auto_chunks(chunks, shape, limit, dtype, previous_chunks)
  File "/g/data/xv83/dbi599/miniconda3/envs/icclim/lib/python3.10/site-packages/dask/array/core.py", line 3169, in auto_chunks
    raise NotImplementedError(
NotImplementedError: Can not use auto rechunking with object dtype. We are unable to estimate the size in bytes of object data
```

NorESM2-MM:

```
Traceback (most recent call last):
  File "/g/data/xv83/bxn599/ACS/icclim/run_icclim.py", line 536, in <module>
    main(args)
  File "/g/data/xv83/bxn599/ACS/icclim/run_icclim.py", line 365, in main
    ds, cf_var = read_data(
  File "/g/data/xv83/bxn599/ACS/icclim/run_icclim.py", line 283, in read_data
    ds = xr.open_mfdataset(infiles, chunks='auto', mask_and_scale=True)
  File "/g/data/xv83/dbi599/miniconda3/envs/icclim/lib/python3.10/site-packages/xarray/backends/api.py", line 996, in open_mfdataset
    datasets = [open_(p, **open_kwargs) for p in paths]
  File "/g/data/xv83/dbi599/miniconda3/envs/icclim/lib/python3.10/site-packages/xarray/backends/api.py", line 996, in <listcomp>
    datasets = [open_(p, **open_kwargs) for p in paths]
  File "/g/data/xv83/dbi599/miniconda3/envs/icclim/lib/python3.10/site-packages/xarray/backends/api.py", line 545, in open_dataset
    ds = _dataset_from_backend_dataset(
  File "/g/data/xv83/dbi599/miniconda3/envs/icclim/lib/python3.10/site-packages/xarray/backends/api.py", line 357, in _dataset_from_backend_dataset
    ds = _chunk_ds(
  File "/g/data/xv83/dbi599/miniconda3/envs/icclim/lib/python3.10/site-packages/xarray/backends/api.py", line 325, in _chunk_ds
    var_chunks = _get_chunk(var, chunks)
  File "/g/data/xv83/dbi599/miniconda3/envs/icclim/lib/python3.10/site-packages/xarray/core/dataset.py", line 220, in _get_chunk
    chunk_shape = da.core.normalize_chunks(
  File "/g/data/xv83/dbi599/miniconda3/envs/icclim/lib/python3.10/site-packages/dask/array/core.py", line 3073, in normalize_chunks
    chunks = auto_chunks(chunks, shape, limit, dtype, previous_chunks)
  File "/g/data/xv83/dbi599/miniconda3/envs/icclim/lib/python3.10/site-packages/dask/array/core.py", line 3169, in auto_chunks
    raise NotImplementedError(
NotImplementedError: Can not use auto rechunking with object dtype. We are unable to estimate the size in bytes of object data
```

DamienIrving commented 1 year ago

Hmm, those modelling groups must have done something funny when creating their netCDF files so xarray (or dask under the hood) can't figure out the auto chunking.
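
For reference, the traceback points at a dask limitation rather than anything icclim-specific: dask refuses to pick chunk sizes automatically for object-dtype data because it can't estimate how many bytes each element takes. A minimal sketch that reproduces the same error outside of xarray (the array here is purely illustrative, not taken from the model files):

```python
import numpy as np
import dask.array as da

# Any object-dtype array triggers the same NotImplementedError when dask is
# asked to choose chunk sizes automatically, because it cannot estimate the
# size in bytes of the elements.
arr = np.array(["a", "bb", "ccc"], dtype=object)
da.from_array(arr, chunks="auto")  # raises NotImplementedError
```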

I've fixed the issue by catching any NotImplementedError that is raised when opening the input files; in these rare cases the program will just open the files without auto chunking: https://github.com/AusClimateService/indices/commit/21375e0ceade64f876e0ad8213111c83839e399b

Without auto chunking there's a higher risk of getting memory errors, but since the CMIP models are relatively low resolution I reckon things will still work just fine.
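
A minimal sketch of that fallback (the actual change is in the linked commit; the function name and structure here are illustrative, not copied from run_icclim.py):

```python
import xarray as xr

def open_infiles(infiles):
    """Open the input files, falling back to no auto chunking when dask
    can't estimate chunk sizes (e.g. for object-dtype variables)."""
    try:
        ds = xr.open_mfdataset(infiles, chunks="auto", mask_and_scale=True)
    except NotImplementedError:
        # Rare case (e.g. NorESM2-MM, CMCC-ESM2): open without auto chunking.
        ds = xr.open_mfdataset(infiles, mask_and_scale=True)
    return ds
```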

ngben commented 1 year ago

Thank you Damien, that seems to have fixed it!