Closed: valeriupredoi closed this issue 2 years ago
@hahsan1 I have opened pull request #2284 with the new feature included (allowing the validation diagnostic to be used even when datasets have the same name, i.e. different facets of the same model). I would appreciate it if you had some time to test it and see whether it solves your functional need, and if you could also provide feedback on any issues. Many thanks for working with us! :beer:
Hi @valeriupredoi, I tried running the recipe again with the updated scripts from pull request #2284, but I get this error: `UnboundLocalError: local variable 'experiment' referenced before assignment`.
I have uploaded the recipe I used as well as the log files if that may be of help to diagnose the error.
I have one comment about the recipe file: I specified the same dataset name (i.e. GISS-nudge) for both the control and experiment models. Which one will be interpreted as the control and which as the experiment?
Hi @hahsan1, cheers for trying out the enhanced diagnostic! Note, however, that you are still using the old diagnostic and its components; you need to make sure that /people/ahsa361/.conda/envs/esmvaltool/lib/python3.8/site-packages/esmvaltool/diag_scripts/validation.py
and /people/ahsa361/.conda/envs/esmvaltool/lib/python3.8/site-packages/esmvaltool/diag_scripts/shared/_validation.py
are updated with the ones from the pull request, together with the config file. If I were you, I would check out that branch and run `pip install -e .[develop]`
in the freshly checked-out repo, so that you install the changes locally and override the version installed in the conda env. Let me know if I can assist with installing just that particular branch; your recipe looks OK and will work fine with the new changes :beer:
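In case it helps, a minimal sketch of fetching and installing a PR branch locally (the local branch name `pr-2284` is just an illustrative label; run this inside the activated esmvaltool conda env):

```sh
# clone the repo and fetch the PR's head into a local branch
git clone https://github.com/ESMValGroup/ESMValTool.git
cd ESMValTool
git fetch origin pull/2284/head:pr-2284
git checkout pr-2284
# editable install so the checked-out code overrides the conda-installed version
pip install -e .[develop]
```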
I would like to report that it works! 🎉 As you pointed out, I needed to update the two scripts in the environment folder.
I have a minor suggestion. I was wondering if it would be a good idea to generalize the script so it could run more than one type of mip (e.g. Amon, AERmon) in the same recipe. As it stands, I believe we can only specify one type of mip per recipe.
Excellent news, cheers for updating and retesting :beer: - if you could approve the PR that would be fab :+1:
About mips: you can specify however many you need; mip is per dataset, not per recipe. So in each dataset dictionary, just use the mip: key and request whichever mip you need for that dataset!
Ah so would the syntax be something like this in the recipe:
```yaml
datasets:
  - {dataset: GISS-nudge, project: CMIP6, mip: [Amon, AERmon], activity: AerChemMIP, institute: NASA-GISS, exp: reference, ensemble: r1i1p5f104, grid: gn, start_year: 2000, end_year: 2004}
```
Then under variables, I could specify both types, so for example, rlut and emiso2.
Hmm, it doesn't seem to work the way I suggested. I get this error:
```
yamale.yamale_error.YamaleError: Error validating data '/pic/projects/GCAM/Emissions-MIP/ESMValTool/esmvaltool/recipes/emissions_mip/Phase1a/global/recipe_validation_CMIP6_2.yml' with schema '/people/ahsa361/.conda/envs/esmvaltool_test/lib/python3.8/site-packages/esmvalcore/recipe_schema.yml'
datasets.0.mip: '['Amon', 'AERmon']' is not a str.
datasets.1.mip: '['Amon', 'AERmon']' is not a str.
```
This is what the recipe looks like (attached): recipe_validation_CMIP6_2.txt
No, `mip: [Amon, AERmon]` will not work. The only case where a list like that works is when you use it to complement missing data from e.g. Amon with AERmon data; in a side-by-side comparison of two identical models (or even different models) it will give you ambiguous results at best, if not wrong ones. Using one mip per dataset is the way to go with the validation recipe. But if you tell me the use case for the multiple mips you need run, I can tell you what to do and how :beer:
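For reference, a minimal sketch of the one-mip-per-dataset layout (facet values copied from your recipe; the second entry is illustrative):

```yaml
datasets:
  # one mip per dataset entry; repeat the dataset with another mip if needed
  - {dataset: GISS-nudge, project: CMIP6, mip: Amon, activity: AerChemMIP,
     institute: NASA-GISS, exp: reference, ensemble: r1i1p5f104, grid: gn,
     start_year: 2000, end_year: 2004}
  - {dataset: GISS-nudge, project: CMIP6, mip: AERmon, activity: AerChemMIP,
     institute: NASA-GISS, exp: reference, ensemble: r1i1p5f104, grid: gn,
     start_year: 2000, end_year: 2004}
```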
I'm interested in both the Amon and AERmon mips. Having the script accommodate both simultaneously would be great! But all in all, it wouldn't be a big deal if I had two separate recipes.
Right! You want to run identical analyses on multiple mips in parallel. Afraid that's not possible this way; you can, however, extend the recipe with more diagnostics that do exactly that: list all your datasets (with all the different mips, but one mip per dataset) in one datasets section, then list all the diagnostics you want to run, giving them relevant and distinct names, and including each control/experiment pair you want to compare in each of these diagnostics. Does that make sense?
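A sketch of that layout (diagnostic and script names are placeholders, and the script-level keys like control_model/exper_model are my recollection of the validation recipes; double-check them against a shipped recipe_validation recipe):

```yaml
diagnostics:
  # one diagnostic per analysis; each lists its own variable(s) and mip
  validation_radiation_Amon:
    description: control vs experiment comparison for an Amon variable
    variables:
      rlut:
        mip: Amon
    scripts:
      validation_script:
        script: validation.py
        control_model: GISS-nudge
        exper_model: GISS-nudge
  validation_emissions_AERmon:
    description: control vs experiment comparison for an AERmon variable
    variables:
      emiso2:
        mip: AERmon
    scripts:
      validation_script:
        script: validation.py
        control_model: GISS-nudge
        exper_model: GISS-nudge
```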
I got it to work another way! I omitted the mip keys in the dataset section and added it to each variable in the diagnostic section, like this:
```yaml
variables:
  rlut:
    preprocessor: pp_rad
    mip: Amon
  emiso2:
    preprocessor: pp_rad
    mip: AERmon
```
How do I approve the PR, do I simply leave a comment?
Thanks again for all the help!
oh yeah that works too, for different variables, well done :+1: Yes please, if you could just confirm there that it works for you, as comment on the PR that would be awesome, cheers :beer:
Suggested in https://github.com/ESMValGroup/ESMValTool/issues/2270: get the validation recipes to allow the same model name for both EXPERIMENT and CONTROL, but obviously with different data parameters (experiment, ensemble, etc.)