Open hatlenheimdalthea opened 11 months ago
I think you need an additional check here:
def full_testbed_processing(ds: xr.Dataset) -> xr.Dataset:
    ds = ds.squeeze(drop=True)
    # select surface depth (for chl, TODO: Check if surface chlorophyll is available)
    ds = ds.isel(lev=0).drop('lev')
    ds = ds.sel(time=slice('1850', '2100'))
    # testing
    assert len(ds.time) == 3012
    assert ds.time.data[0].year == 1850
    # Processing
    ds_regridded = regrid(ds)
    ds_new_cal = replace_calendar(ds_regridded)
    return ds_new_cal
this line:
ds = ds.isel(lev=0).drop('lev')
should probably be something like:
if 'lev' in ds.dims:
    ds = ds.isel(lev=0).drop('lev')
Then this should be applicable to both 3D variables (chl) and surface ones (e.g. sos). Alternatively, we could ingest the surface chlorophyll data (I think it's chlos, but please double-check!) and remove that line altogether.
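A minimal sketch of the guarded selection, assuming xarray and numpy are installed. The helper name `select_surface` and the toy datasets are hypothetical and only stand in for the real chl and sos data; `isel(..., drop=True)` is used here in place of the separate `.drop('lev')` call.

```python
import numpy as np
import xarray as xr


def select_surface(ds: xr.Dataset) -> xr.Dataset:
    """Take the first vertical level if `lev` is a dimension; otherwise return ds unchanged."""
    if "lev" in ds.dims:
        # drop=True also removes the leftover scalar `lev` coordinate
        ds = ds.isel(lev=0, drop=True)
    return ds


# Toy 3D variable (time, lev, y, x) and toy surface variable (time, y, x)
ds_3d = xr.Dataset(
    {"chl": (("time", "lev", "y", "x"), np.zeros((2, 3, 4, 5)))},
    coords={"lev": [5.0, 15.0, 25.0]},
)
ds_2d = xr.Dataset({"sos": (("time", "y", "x"), np.zeros((2, 4, 5)))})

surface_3d = select_surface(ds_3d)  # `lev` is removed
surface_2d = select_surface(ds_2d)  # passes through without raising
```

With the guard in place, a dataset without `lev` no longer triggers the "Dimensions {'lev'} do not exist" error.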
I don't understand why everything ran smoothly before with the exact same code (see all 18 members here: path = 'gs://leap-persistent/hatlenheimdalthea/testing'). Have I accidentally deleted some code or something? Anyway, I tried both solutions and I get the same error:
AssertionError                            Traceback (most recent call last)
Cell In[7], line 4
      2 for k,ds in ddict.items():
      3     print(f"Processing {k}")
----> 4     ds_out = full_testbed_processing(ds)
      6     ds_id = cmip6_dataset_id(ds_out, id_attrs=[
      7         'source_id',
      8         'variant_label',
   (...)
     11         'version',
     12     ])
     13     save_path = f"gs://leap-scratch/jbusecke/pco2-testing/{ds_id}"

Cell In[6], line 34, in full_testbed_processing(ds)
     31     ds = ds.sel(time=slice('1850', '2100'))
     33     # testing
---> 34     assert len(ds.time) == 3012
     35     assert ds.time.data[0].year == 1850
     37     # Processing

AssertionError:
Seems like that dataset does not have the expected number of timesteps?
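For reference, the 3012 in the assertion matches monthly data covering 1850 through 2100. The sketch below only does that arithmetic, assuming monthly output; `expected_monthly_steps` is a hypothetical helper. As a guess, the `'time': 1032` in the ValueError at the end of this thread would match a dataset covering only 2015 through 2100, but that has not been checked against the data.

```python
def expected_monthly_steps(first_year: int, last_year: int) -> int:
    """Number of monthly timesteps from January of first_year through December of last_year."""
    return (last_year - first_year + 1) * 12


full_period = expected_monthly_steps(1850, 2100)    # 251 years of monthly data
scenario_only = expected_monthly_steps(2015, 2100)  # 86 years of monthly data
print(full_period, scenario_only)
```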
You could do something like:
for name, ds in ds_dict.items():
    try:
        full_testbed_processing(ds)
        ...
    except Exception as e:
        print(f"{name} failed with {e}")
This would continue to process later datasets and then you get a printed list of the problematic datasets (which maybe you can fix).
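A slightly fuller sketch of the same idea, collecting the failures in a dict so they can be inspected afterwards. `process_all` and `check_length` are hypothetical stand-ins (the latter for `full_testbed_processing`), and plain lists stand in for datasets. The exception type is recorded as well, because a bare `assert` raises an AssertionError with an empty message.

```python
from typing import Any, Callable, Dict, Tuple


def process_all(
    datasets: Dict[str, Any], process: Callable[[Any], Any]
) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Run `process` on every dataset; keep going on errors and record them by name."""
    results: Dict[str, Any] = {}
    failures: Dict[str, str] = {}
    for name, ds in datasets.items():
        try:
            results[name] = process(ds)
        except Exception as e:
            # Store the exception type too, since str(e) may be empty
            failures[name] = f"{type(e).__name__}: {e}"
    return results, failures


def check_length(timesteps: list) -> list:
    """Stand-in for the real processing function: fails unless there are 3 timesteps."""
    assert len(timesteps) == 3, f"expected 3 timesteps, got {len(timesteps)}"
    return timesteps


results, failures = process_all(
    {"member_a": [1, 2, 3], "member_b": [1, 2]}, check_length
)
for name, reason in failures.items():
    print(f"{name} failed with {reason}")
```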
Side note: Check whether those particular runs go beyond 2100! Then it would be as easy as adding a
    ds = ds.sel(time=slice(None, '2100'))
to the 'full_testbed_processing' function.
notebook "regrid_members" in this repo
ValueError: Dimensions {'lev'} do not exist. Expected one or more of Frozen({'y': 291, 'x': 360, 'time': 1032, 'vertex': 4, 'bnds': 2})