ESMValGroup / ESMValTool

ESMValTool: A community diagnostic and performance metrics tool for routine evaluation of Earth system models in CMIP
https://www.esmvaltool.org
Apache License 2.0

Calculate obs climatologies on the fly to allow users outside JASMIN to run AutoAssess soilmoisture recipe #2309

Closed valeriupredoi closed 1 year ago

valeriupredoi commented 3 years ago

This concerns the files mentioned in this comment https://github.com/ESMValGroup/ESMValTool/pull/2296#issuecomment-916826024, with an upload strategy provided by Zenodo, as mentioned by @bouweandela in this comment https://github.com/ESMValGroup/ESMValTool/pull/2296#issuecomment-916830927. @alistairsellar the key question here is whether we are allowed to do this, since these files are (mostly, if not all) Met Office property.

alistairsellar commented 3 years ago

I've taken a first look at these. Did all files in this list come from the Met Office? I believe that only a small fraction are used by ESMValTool so far - correct?

They fall into three categories as far as I can tell:

  1. Publicly available datasets (e.g. CERES-EBAF) that have been processed into seasonal climatologies and are already used by ESMValTool. For these I think it would be better for the diagnostic to use the raw monthly data that is already in the data pool, and do the processing in ESMValTool. I have made a first attempt at doing this for the autoassess surfrad recipe but hit a problem - I'll come back to it.
  2. Publicly available datasets used by metrics that have not yet been ported to ESMValTool. I'd recommend that we ensure these follow the above approach to avoid dependence on new datasets.
  3. Other datasets. These will have to be examined one by one as metrics are ported.

valeriupredoi commented 3 years ago

OK thanks lots @alistairsellar for looking into this; I'll compile a detailed list and attach it here so we can check items as they go and upload them upon approval - how's that sound? :beer:

alistairsellar commented 3 years ago

I think it sounds good - do you mean a list of which of these datasets are actually being used?

valeriupredoi commented 3 years ago

yes, and by doing a (more or less) deep search I see only these to be needed by soilmoisture and permafrost:

-rw-rw---- 1 valeriu gws_esmeval    4166393 Feb 20  2020 ecv_soil_moisture_djf.nc
-rw-rw---- 1 valeriu gws_esmeval    4166393 Feb 20  2020 ecv_soil_moisture_jja.nc
-rw-rw---- 1 valeriu gws_esmeval    4166393 Feb 20  2020 ecv_soil_moisture_mam.nc
-rw-rw---- 1 valeriu gws_esmeval    4166393 Feb 20  2020 ecv_soil_moisture_son.nc
-rw-rw---- 1 valeriu gws_esmeval    3545396 Feb 20  2020 SWE_clm_djf.pp
-rw-rw---- 1 valeriu gws_esmeval    3545396 Feb 20  2020 SWE_clm_mam.pp
-rw-rw---- 1 valeriu gws_esmeval    3545396 Feb 20  2020 SWE_clm_son.pp

hi @alistairsellar do you think I can upload those to Zenodo or are they restricted by some sort of DPR? :beer:

alistairsellar commented 3 years ago

Sorry for the delay. Looking at this now...

alistairsellar commented 3 years ago

First conclusion is that the SWE_clm*.pp files are not actually used by any recipes. They were intended to be used for the autoassess snow recipe, but that didn't make it into main (yet) because of an inconsistency in the results. So I think these files can be ignored / deleted.

alistairsellar commented 3 years ago

That just leaves the ecv_soil_moisture*.nc files. These are derived from ESA CCI soil moisture data. The terms of use for CCI soil moisture data do not explicitly mention derived data but do say "Data downloaded by the registered user can be used by the user and the associated organisation, no onward distribution is permitted." [my emphasis]

So it's not clear whether we have the right to put derived data on Zenodo, and I would err on the side of caution and not do it. Instead I think it would be better to follow approach 1 in my list above. I had an initial go at that but didn't get it to work. I'll try again next week and ask you for help if I get stuck again.

Is that OK?

valeriupredoi commented 3 years ago

OK sounds good, Ali! I am under other release-related snow anyway atm, so this will surely get bumped to 2.5.0 when we've had a good dust-off of it :beer:

valeriupredoi commented 2 years ago

@alistairsellar could you maybe open a draft PR with the code you've already written to do Option 1 from the list, then we can work on that together? :beer:

zklaus commented 2 years ago

Begs the question: How are the remaining four ecv_soil_moisture files derived? We have added/are in the process of adding a bunch of ESA CCI obs to the supported observations recently (@axel-lauer is the leader of that effort), so if this is a simple seasonal average or similar, perhaps this information, too, can just be calculated in the recipe?
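
For illustration only, the kind of "simple seasonal average" raised here could be sketched as below. This is plain Python on synthetic numbers, not the AutoAssess code or an ESMValTool preprocessor; the function name `seasonal_climatology` and the `(year, month, value)` input layout are hypothetical, and the sketch assumes an unweighted mean over all months in each season. How the ecv_soil_moisture files were actually derived is not stated in this thread.

```python
from collections import defaultdict
from statistics import mean

# Meteorological seasons keyed by calendar month number.
SEASON_OF_MONTH = {
    12: "djf", 1: "djf", 2: "djf",
    3: "mam", 4: "mam", 5: "mam",
    6: "jja", 7: "jja", 8: "jja",
    9: "son", 10: "son", 11: "son",
}


def seasonal_climatology(monthly):
    """Average monthly values into one climatological mean per season.

    `monthly` is an iterable of (year, month, value) tuples (hypothetical
    layout). Every month counts equally; no weighting by month length.
    """
    buckets = defaultdict(list)
    for _year, month, value in monthly:
        buckets[SEASON_OF_MONTH[month]].append(value)
    return {season: mean(values) for season, values in buckets.items()}


# Synthetic two-year monthly series in which each value equals its month number.
synthetic = [(year, month, float(month))
             for year in (2000, 2001)
             for month in range(1, 13)]
climatology = seasonal_climatology(synthetic)
```

With real data the same grouping would be applied per grid cell, and weighting by month length may be preferable to the plain mean used here.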

alistairsellar commented 2 years ago

@zklaus That is exactly what I'm proposing as the solution for this issue. Sorry for my continuing crapness on this issue. It is on my radar but I've been firefighting other issues.

Thanks @valeriupredoi for checking it still works on JASMIN. I hope that 2.5 is the last time you need to do this.

valeriupredoi commented 2 years ago

can we all take a moment and acknowledge the quality of the word "crapness" please :rofl:

zklaus commented 2 years ago

No worries, @alistairsellar. Sounds like we are on a good trajectory.

alistairsellar commented 2 years ago

Changed title to reflect agreed solution. [In prep for fixing this during next week's workshop.]

((Strictly speaking should the issue title just describe the problem, and the solution is summarised in the PR?))

alistairsellar commented 2 years ago

@ehogan, @Jon-Lillis this will mirror some of what you did for the radiation_budget AA diagnostic, so it would be useful to have your review of this at some point

alistairsellar commented 2 years ago

Feature branch: autoassess_soilmoisture_fix_obs