judithberner / climpred_CESM1_S2S

MIT License

how to work with calendar #9

Closed aaronspring closed 2 years ago

aaronspring commented 3 years ago

Nice examples added @judithberner and @abjaye

xr.open_dataset() and xr.open_zarr() both default to use_cftime=None, which decodes times to numpy datetime64 where possible and falls back to cftime otherwise. With use_cftime=True, init is always decoded to cftime. http://xarray.pydata.org/en/stable/generated/xarray.open_dataset.html

For the summer school I envision that no more cftime conversion needs to be done by students and all zarrs are already stored in the standard or gregorian calendar like observations.

I think this would be a great intermediate step: zarr with gregorian calendar matching observations calendar.

As the raw S2S CESM data seems to be in noleap, does that mean that we cannot forecast Feb 29th in the verification? So just dropping Feb 29 in the verification seems fine. But isn't there a gap then somewhere in the data regardless of what we do? Alternatively, we could also convert the verification to a noleap calendar and use the current zarrs with noleap calendar. I am still a bit confused...

http://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html#calendar

judithberner commented 3 years ago

AS: "For the summer school I envision that no more cftime conversion needs to be done by students and all zarrs are already stored in the standard or gregorian calendar like observations." Yes, but we also want to keep the capability to go through all the steps, so that we can add new models (e.g. my simulations with stochastic parameterizations).

AS: "I think this would be a great intermediate step: zarr with gregorian calendar matching observations calendar." 100% agree.

AS: "As the raw S2S CESM data seems to be in noleap, does that mean that we cannot forecast Feb 29th in the verification? So just dropping Feb 29 in the verification seems fine. But isn't there a gap then somewhere in the data regardless of what we do? Alternatively, we could also convert the verification to a noleap calendar and use the current zarrs with noleap calendar. I am still a bit confused..." I think dropping Feb 29 is a good idea for our application, but again, we want to keep this general enough that the notebooks can be used for other purposes (e.g. seasonal, annual etc.). Yes, there is a gap: we will wrongly assess e.g. day 2 lead skill for initializations on Feb 28. Let's just say we have bigger issues in S2S than that, so not a problem.

aaronspring commented 3 years ago

My bigger question here is: which zarr should we upload to the cloud and make available to ASP students? With any cftime postprocessing or not? I am thinking about the daily output, not the weekly (which can easily be calculated from the daily). Should they always run the following in each notebook (with no further postprocessing)?

hinda2["init"] = [cftime.DatetimeProlepticGregorian(d.year, d.month, d.day) for d in hinda2["init"].values]
verif["time"] = [convert_to_cftime(t) for t in verif.time.values]

Or should the saved zarr already have this cftime massaging applied? I don't think we are losing any generality here. On the other hand, this is not a hard calculation; it only touches the metadata/coordinates. But I would like the default not to need these lines.

abjaye commented 3 years ago

I think it would be great if all the calendar work were done before we write to zarr. We can provide the notebooks that make the zarr files so students can see how it is done, but I see no need to run these lines every time when it is not necessary.