jbusecke / xMIP

Analysis ready CMIP6 data in python the easy way with pangeo tools.
https://cmip6-preprocessing.readthedocs.io/en/latest/?badge=latest
Apache License 2.0

CESM2 Members Not Combining Along 'time' #330

Open AbbySh opened 5 months ago

AbbySh commented 5 months ago

concat_experiments and merge_variables from xmip.postprocessing are not working on CESM2 members. Example warning:

/srv/conda/envs/notebook/lib/python3.10/site-packages/xmip/postprocessing.py:157: UserWarning: CMIP.NCAR.CESM2.historical.r10i1p1f1.Omon.gn.none.dpco2 failed to combine with :cannot align objects with join='exact' where index/labels/sizes are not equal along these coordinates (dimensions): 'time' ('time',)
  warnings.warn(f"{cmip6_dataset_id(ds)} failed to combine with :{e}")

As a result, the historical experiment does not get combined with the future scenario (in our case, ssp245).
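
For context, the warning text matches what xarray raises when objects are merged with join='exact' and their time indexes differ. A minimal sketch on toy data, assuming (not confirmed in this thread) that merge_variables passes join='exact' to xarray.merge; the arrays and the helper name try_exact_merge are hypothetical and not part of xMIP:

```python
import numpy as np
import xarray as xr


def try_exact_merge(objs):
    """Return (merged, None) on success or (None, message) if the indexes differ."""
    try:
        return xr.merge(objs, join="exact"), None
    except ValueError as err:  # xarray's alignment errors are ValueErrors
        return None, str(err)


# Toy stand-ins for two variables of one member; the last time label differs.
tos = xr.DataArray(np.zeros(3), dims="time", coords={"time": [0, 1, 2]}, name="tos")
dpco2 = xr.DataArray(np.ones(3), dims="time", coords={"time": [0, 1, 3]}, name="dpco2")

merged, message = try_exact_merge([tos, dpco2])
print(message)

# With identical time labels the same call succeeds.
merged_ok, message_ok = try_exact_merge([tos, dpco2.assign_coords(time=tos.time)])
```

If this is what happens inside xMIP, comparing the time coordinate of each variable for the failing member (here r10i1p1f1) should show which one differs.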

jbusecke commented 5 months ago

Thanks for using xMIP @AbbySh. Do you have a code snippet that I can use to produce this error? That will make it much easier to look into this.

AbbySh commented 5 months ago

from xmip.utils import google_cmip_col
from xmip.preprocessing import combined_preprocessing
from xmip.postprocessing import concat_experiments, merge_variables

cat = col.search(
    variable_id=['tos', 'sos', 'chl', 'mlotst', 'spco2', 'dpco2'],
    table_id='Omon', # monthly ocean output only
    experiment_id=['historical','ssp245'],
    require_all_on=['source_id', 'member_id', 'grid_label'] # this ensures that results will have all variables and experiments available
)

ddict = cat.to_dataset_dict(
    preprocess=combined_preprocessing,
    xarray_open_kwargs=dict(use_cftime=True),
    aggregate=False
)

ds = merge_variables(ddict)
ds = concat_experiments(ds)

This should reproduce it, let me know!

jbusecke commented 5 months ago

Taking a look now.

jbusecke commented 5 months ago

I was just able to run this without error on the LEAP-Pangeo hub ('pangeo/pangeo-notebook:2023.08.29'):

from xmip.utils import google_cmip_col
from xmip.preprocessing import combined_preprocessing
from xmip.postprocessing import concat_experiments, merge_variables

col = google_cmip_col()

cat = col.search(
    variable_id=['tos', 'sos', 'chl', 'mlotst', 'spco2', 'dpco2'],
    table_id='Omon', # monthly ocean output only
    experiment_id=['historical','ssp245'],
    require_all_on=['source_id', 'member_id', 'grid_label'] # this ensures that results will have all variables and experiments available
)

ddict = cat.to_dataset_dict(
    preprocess=combined_preprocessing,
    xarray_open_kwargs=dict(use_cftime=True),
    aggregate=False
)

ds = merge_variables(ddict)
ds = concat_experiments(ds)

The only change is the added col = google_cmip_col() line. Is it possible that you still had a broken collection object in memory?

jbusecke commented 5 months ago

In the resulting datasets, time appears to have been concatenated properly:

[Image: screenshot of the resulting datasets]
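
A result like this can also be checked by inspecting the concatenated time index directly. A minimal sketch on toy data, assuming the concatenation behaves like xarray.concat along time; the datasets and the helpers time_axis_ok and toy_experiment are hypothetical and not part of xMIP:

```python
import numpy as np
import pandas as pd
import xarray as xr


def time_axis_ok(ds):
    """True if the time index is strictly increasing with no duplicates."""
    idx = ds.indexes["time"]
    return bool(idx.is_monotonic_increasing and idx.is_unique)


def toy_experiment(start, months):
    """Hypothetical monthly dataset standing in for one experiment."""
    time = pd.date_range(start, periods=months, freq="MS")
    return xr.Dataset({"tos": ("time", np.zeros(months))}, coords={"time": time})


historical = toy_experiment("2014-01-01", 12)  # last year of a historical run
ssp245 = toy_experiment("2015-01-01", 12)      # first year of the scenario

combined = xr.concat([historical, ssp245], dim="time")
print(combined.sizes["time"], time_axis_ok(combined))
```

The same two index properties are available when the data is opened with use_cftime=True, since the resulting CFTimeIndex is also a pandas index.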