Loading datasets in `get_CMIP6_gridded_around_tgs.ipynb` causes kernel to crash when querying multiple variants

Timh37 / CMIP6cex

Repository for the cloud-based analysis of changes in compound extremes in CMIP6 simulations.

MIT License

Loading datasets in `get_CMIP6_gridded_around_tgs.ipynb` causes kernel to crash when querying multiple variants #1

Closed Timh37 closed 1 year ago

Timh37 commented 1 year ago

When querying multiple variants, the kernel crashes at `dsets_ = dask.compute(dict(dsets))[0]`. However, when querying multiple CMIP6 models there seem to be no issues.

Timh37 commented 1 year ago

If this means that each variant needs to be stored in its own dataset, that's fine, but I have not managed to run `dask.compute` on a dictionary with 3 levels instead of the 2 used in the Pangeo Gallery examples.
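One possible workaround for the nesting, sketched here with the standard library only: flatten the nested dictionary to a single level keyed by tuples, compute on that, and rebuild the nesting afterwards. The model/variant/variable layout and all names below are assumptions for illustration, the placeholder strings stand in for lazy datasets, and the hand-off to `dask.compute` is not exercised here.

```python
from typing import Any, Dict, Tuple

FlatDict = Dict[Tuple[str, ...], Any]


def flatten(nested: Dict[str, Any], prefix: Tuple[str, ...] = ()) -> FlatDict:
    """Collapse a nested dict into one level keyed by the path to each leaf."""
    flat: FlatDict = {}
    for key, value in nested.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def unflatten(flat: FlatDict) -> Dict[str, Any]:
    """Rebuild the nested dict from the path-keyed flat dict."""
    nested: Dict[str, Any] = {}
    for path, value in flat.items():
        node = nested
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = value
    return nested


# Hypothetical layout: model -> variant -> variable. The strings are
# placeholders for lazily loaded datasets.
dsets = {
    "MODEL-A": {
        "r1i1p1f1": {"pr": "lazy pr 1", "sfcWind": "lazy wind 1"},
        "r2i1p1f1": {"pr": "lazy pr 2", "sfcWind": "lazy wind 2"},
    },
    "MODEL-B": {
        "r1i1p1f1": {"pr": "lazy pr 3", "sfcWind": "lazy wind 3"},
    },
}

flat = flatten(dsets)
# With real data, `flat` is the single-level dict that would be computed
# (for example with dask.compute(flat)[0]) before restoring the nesting.
restored = unflatten(flat)
```

Keeping the full path in each key means no information about model or variant is lost when the levels are collapsed.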

jbusecke commented 1 year ago

The likely reason is that some variants have been run past 2100 (in some cases much longer!). Could you double check on this?
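A minimal sketch of that check, assuming the last simulated year of each variant has already been collected into a dictionary (with xarray this could come from something like `int(ds.time.dt.year.max())` per dataset, which is an assumption and not run here). The variant labels and years are made up.

```python
from typing import Dict, List


def variants_past(end_years: Dict[str, int], cutoff: int = 2100) -> List[str]:
    """Return the variant labels whose last year lies beyond the cutoff."""
    return sorted(label for label, year in end_years.items() if year > cutoff)


# Made-up example values; real ones would be read from each dataset's time axis.
end_years = {"r1i1p1f1": 2100, "r2i1p1f1": 2100, "r3i1p1f1": 2300}

long_running = variants_past(end_years)
print(long_running)
```

Any label reported here marks a member that is longer than the others and could be trimmed to the common period before concatenating.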

I can look into this in a few days I hope. Thanks for making progress here!

jbusecke commented 1 year ago

In fact, have you tried some of the xmip postprocessing functions? https://cmip6-preprocessing.readthedocs.io/en/latest/postprocessing.html#Merging-variables

This might help with this issue.

Ultimately I believe this comes down to chunking issues, either in the raw data (I have noticed some of that, and we might need to reprocess the data to get rid of that issue 'cleanly') or introduced by the concatenation (e.g. of longer and shorter members).
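To make the chunking concern concrete, here is a small sketch that flags irregular chunk layouts along one dimension. It assumes the chunk sizes are available as tuples of integers, of the kind xarray reports under `ds.chunks["time"]` for a dask-backed dataset. The function name and the numbers are hypothetical.

```python
from typing import Tuple


def is_regular(chunks: Tuple[int, ...]) -> bool:
    """True if all chunks but the last share one size and the last is no larger."""
    if len(chunks) <= 1:
        return True
    body = chunks[:-1]
    return len(set(body)) == 1 and chunks[-1] <= body[0]


# Made-up chunk sizes along time for a single member ...
single_member = (600, 600, 600, 180)
# ... and for two pieces of different length concatenated along time.
concatenated = (600, 600, 600, 180, 600, 432)

print(is_regular(single_member))
print(is_regular(concatenated))
```

An irregular layout like the second one is the sort of thing that rechunking after concatenation would even out.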

Timh37 commented 1 year ago

- Tested using historical experiments only, which should all be of the same length, but the issue persists. That suggests it is unrelated to the length of the simulations.
- A notebook using xmip handles the same data fine, so I will close this issue.