ESMValGroup / ESMValTool

ESMValTool: A community diagnostic and performance metrics tool for routine evaluation of Earth system models in CMIP
https://www.esmvaltool.org
Apache License 2.0

Porting namelist_ExtremeEvents to v2 #906

Closed maritsandstad closed 5 years ago

maritsandstad commented 5 years ago

Issue to keep track of porting of extreme events (ETCCDI indices with plotting) to version 2

Duplicate of #270

maritsandstad commented 5 years ago

The branch is called version2_ExtremeEvents

maritsandstad commented 5 years ago

Is there some preprocessing that I can use to get my data onto the expected longitude/latitude grid? At the moment I get an error because the longitudes in my observational data run from -180 to 180.

mattiarighi commented 5 years ago

This means that your observations are not CMOR compliant and need to be cmorized. The tool expects longitude in the 0:360 range and latitude from -90 to 90. Which dataset are you using?
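As a rough illustration of the longitude convention mentioned here, the sketch below wraps longitudes from the -180:180 range into 0:360 and reorders the data columns to match. It is not the ESMValTool cmorizer: the function name `wrap_longitudes` and the toy grid are made up for this example, and a real cmorization also has to deal with variable names, units and metadata.

```python
def wrap_longitudes(lons, rows):
    """Wrap longitudes into [0, 360) and reorder the data columns to match.

    lons: longitudes in degrees east, one per grid column.
    rows: one list per latitude, each holding len(lons) values.
    (Hypothetical helper, for illustration only.)
    """
    wrapped = [lon % 360 for lon in lons]
    # Column order that makes the wrapped longitudes increase monotonically.
    order = sorted(range(len(wrapped)), key=wrapped.__getitem__)
    new_lons = [wrapped[i] for i in order]
    new_rows = [[row[i] for i in order] for row in rows]
    return new_lons, new_rows


# Toy grid: four longitudes, one latitude row with labelled cells.
lons = [-180.0, -90.0, 0.0, 90.0]
rows = [["a", "b", "c", "d"]]
new_lons, new_rows = wrap_longitudes(lons, rows)
print(new_lons)
print(new_rows)
```

The data columns have to move together with the longitudes; changing only the coordinate values would shift the field by half the globe.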

maritsandstad commented 5 years ago

MERRA2

mattiarighi commented 5 years ago

Is it from obs4mips / ana4mips? Or did you get the data from another source?

maritsandstad commented 5 years ago

Not sure. You see, I came back from parental leave, and in the meantime my temp had written the diagnostic and added more data, and now I'm here trying to port it...

maritsandstad commented 5 years ago

(and also I'm an idiot)

mattiarighi commented 5 years ago

Can you just send me the recipe you are trying to run? We will find out which data are read in.

maritsandstad commented 5 years ago

Oh, I pushed it to the branch repo, it's called recipe_ExtremeEvents.yml (it probably won't work even if the MERRA data do, though...)

mattiarighi commented 5 years ago

If you have the data, try to test it with ERA-Interim only or use one of the CMIP5 models, until we get the cmorized data.

If you are familiar with NCL or Python, you can contribute a cmorizer for MERRA2 (in a separate branch, please). For ERA-Interim such a script is already available, and I can extend it to include the variables you need here.

maritsandstad commented 5 years ago

Ok, thanks a bunch for your answers! I can't really allocate lots of time to this porting project, and I seem to run into endless problems with the format of my datasets before I even get started on running my diagnostic. Now an error related to cube reduction

ValueError: Cubes were not reduced to one afterfixing: ...

Since I don't know really what these cubes are, I don't know how to change my data so I can just run through this...

Sorry for all the dumb questions...

mattiarighi commented 5 years ago

I would suggest reducing your recipe to just 1 dataset (a model) and commenting out the other datasets. Then we can try to isolate the problem.

maritsandstad commented 5 years ago

Thank you! I'll give that a go

bouweandela commented 5 years ago

These are iris cubes: https://scitools.org.uk/iris/docs/latest/userguide/loading_iris_cubes.html

This error occurs if there is more than one variable (or something that looks enough like one to be considered a Cube by iris) in the files you're trying to load.

You can see if this is the case for yourself by starting python and doing something like

import iris
cubes = iris.load("/path/to/file.nc")
print(cubes)
print(f'There are {len(cubes)} cubes in this file')

maritsandstad commented 5 years ago

Ok, thanks! I've got latitudes and longitudes in addition to the variable. They are listed as dimensions in the netcdf, and it seems capable of collapsing the time dimension, so why not the other two?

maritsandstad commented 5 years ago

Yes, also I can't really run without the additional reanalysis datasets, as my diagnostic script needs at least one to produce anything (basically it produces a Gleckler plot comparing models to reanalysis...)

mattiarighi commented 5 years ago

You are using ERA-Interim and MERRA2. Which variables (mip) do you need? I can try to provide you with the data in the proper format.

maritsandstad commented 5 years ago

pr, tasmax, tasmin and tas, all daily

mattiarighi commented 5 years ago

ERA-Interim cmorized data for these variables are now available. If you are working on Jasmin @valeriupredoi can send you the path. Otherwise I can send you the files via cloud.

maritsandstad commented 5 years ago

That's great, thank you so much! I'm not working on Jasmin. I'll be happy to get them any way you like. In principle I guess you can push them to the branch as well.

mattiarighi commented 5 years ago

We don't push data to the repository, that would be too much stuff. I will send you an email with a link to get the data.

bouweandela commented 5 years ago

> Ok, thanks! I've got latitudes and longitudes in addition to the variable. They are listed as dimensions in the netcdf, and it seems capable of collapsing the time dimension, so why not the other two?

There's probably something wrong with the names of the coordinates or something. Do they follow the CF conventions? http://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/cf-conventions.html#latitude-coordinate
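To make the CF check concrete, here is a minimal sketch that decides from a coordinate variable's attributes whether it would be recognised as latitude or longitude. The unit strings are the ones listed in the CF conventions; the function name `identify_horizontal_coord` and the example attribute dictionaries are hypothetical, and iris applies its own, more detailed logic when it loads a file.

```python
# Unit strings that the CF conventions accept for latitude and longitude.
LATITUDE_UNITS = {
    "degrees_north", "degree_north", "degree_N",
    "degrees_N", "degreeN", "degreesN",
}
LONGITUDE_UNITS = {
    "degrees_east", "degree_east", "degree_E",
    "degrees_E", "degreeE", "degreesE",
}


def identify_horizontal_coord(attrs):
    """Return 'latitude', 'longitude' or None for a dict of variable attributes.

    (Hypothetical helper, for illustration only.)
    """
    standard_name = attrs.get("standard_name")
    units = attrs.get("units")
    if standard_name == "latitude" or units in LATITUDE_UNITS:
        return "latitude"
    if standard_name == "longitude" or units in LONGITUDE_UNITS:
        return "longitude"
    return None


# Made-up attribute sets, as one might read them from a netCDF header.
good_lat = {"standard_name": "latitude", "units": "degrees_north"}
good_lon = {"units": "degrees_east"}
ambiguous = {"long_name": "lat", "units": "degrees"}

for attrs in (good_lat, good_lon, ambiguous):
    print(attrs, "->", identify_horizontal_coord(attrs))
```

A coordinate that only carries plain `degrees` as units and no `standard_name`, like the last example, cannot be identified this way.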

maritsandstad commented 5 years ago

Thanks so much for the data! Now they pass nicely through the checks. However, I still can't really test my diagnostic, as I get an Exec format error. From the error it seems like the code might be trying to parse my R diagnostic as Python code... I'm not sure what to do about that...

  File "/div/pdo/extreme/masan/V2/ESMValTool/esmvaltool/_main.py", line 158, in main
    process_recipe(recipe_file=recipe, config_user=cfg)
  File "/div/pdo/extreme/masan/V2/ESMValTool/esmvaltool/_main.py", line 208, in process_recipe
    recipe.run()
  File "/div/pdo/extreme/masan/V2/ESMValTool/esmvaltool/_recipe.py", line 1050, in run
    self.tasks, max_parallel_tasks=self._cfg['max_parallel_tasks'])
  File "/div/pdo/extreme/masan/V2/ESMValTool/esmvaltool/_task.py", line 578, in run_tasks
    _run_tasks_sequential(tasks)
  File "/div/pdo/extreme/masan/V2/ESMValTool/esmvaltool/_task.py", line 589, in _run_tasks_sequential
    task.run()
  File "/div/pdo/extreme/masan/V2/ESMValTool/esmvaltool/_task.py", line 224, in run
    self.output_files = self._run(input_files)
  File "/div/pdo/extreme/masan/V2/ESMValTool/esmvaltool/_task.py", line 462, in _run
    process = self._start_diagnostic_script(cmd, env)
  File "/div/pdo/extreme/masan/V2/ESMValTool/esmvaltool/_task.py", line 416, in _start_diagnostic_script
    env=env)
  File "/site/opt/python/anaconda/envs/esmvaltool/lib/python3.7/subprocess.py", line 775, in __init__
    restore_signals, start_new_session)
  File "/site/opt/python/anaconda/envs/esmvaltool/lib/python3.7/subprocess.py", line 1522, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
OSError: [Errno 8] Exec format error: '/div/pdo/extreme/masan/V2/ESMValTool/esmvaltool/diagscripts/ExtremeEvents.r'
2019-03-18 16:17:47,697 UTC [98770] INFO If you suspect this is a bug or need help, please open an issue on https://github.com/ESMValGroup/ESMValTool/issues and attach the run/recipe*.yml and run/main_log_debug.txt files from the output directory.

valeriupredoi commented 5 years ago

The call to subprocess does not mean that it is trying to execute your R script as a Python script. Every diagnostic is launched from Python through subprocess, which can start any supported process (R, NCL, etc.). R scripts work from exactly that subprocess launcher, but it looks like Python is failing at the point of executing the script rather than while the process is running.

If I were you, I'd do the following. First, change the extension to .R (I'm not experienced with R scripts, but I've seen the extension .R on the R code we have in ESMValTool; it may not matter, though). Second, check whether source('interface_data/r.interface') is the offending line, and whether that script actually exists and is executable.

It also helps if you post the full error that you encounter from the log file and not from the wrapper.
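For reference, one common way to get `OSError: [Errno 8] Exec format error` from subprocess on a POSIX system is to launch a script file directly when it has no shebang line, instead of passing it to its interpreter. The sketch below reproduces that with a throwaway Python script. Whether this is what happened in the run above is an assumption, and the file name `diag.txt` and the helper `try_direct_exec` are made up for the example.

```python
import errno
import subprocess
import sys
import tempfile
from pathlib import Path


def try_direct_exec(script):
    """Launch the file itself; return the errno if the OS refuses, else None."""
    try:
        subprocess.run([str(script)], check=True, capture_output=True)
    except OSError as exc:
        return exc.errno
    return None


with tempfile.TemporaryDirectory() as tmp:
    # An executable text file WITHOUT a shebang line (hypothetical name).
    script = Path(tmp) / "diag.txt"
    script.write_text('print("hello")\n')
    script.chmod(0o755)

    # Executing the file directly: the kernel does not know how to run it.
    direct_errno = try_direct_exec(script)

    # Passing the file to its interpreter works.
    result = subprocess.run(
        [sys.executable, str(script)],
        check=True, capture_output=True, text=True,
    )
    interpreter_output = result.stdout.strip()

print("direct exec failed with ENOEXEC:", direct_errno == errno.ENOEXEC)
print("via interpreter:", interpreter_output)
```

For an R diagnostic the equivalent of the second call would put `Rscript` in front of the script path. This is a general property of how the operating system executes files, not something specific to ESMValTool.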

valeriupredoi commented 5 years ago

Here is an example of an R diag that @jhardenberg is working on. He knows R well (unlike me), so he can probably help more. But from what I see in your diag, the interface_data dir does not seem to have a properly set path (surely it isn't in the dir where the diagnostic is): https://github.com/ESMValGroup/ESMValTool/blob/REFACTORING_rainfarm/esmvaltool/diag_scripts/rainfarm/rainfarm.R

maritsandstad commented 5 years ago

Ok, thanks so much! That is very useful!

I guess I find it quite difficult to understand the error messages as there is so much wrapper on top of what is actually going on... Me being dumb though, mostly, I guess...

valeriupredoi commented 5 years ago

well, it's never easy with cross-platform stuff :grin: Do you have a diagnostic log file at all? It might be that the execution stopped right before writing anything to the diagnostic log so you are stuck with only what the task manager tells you. Anyways, I think here it's just a case that your interface_data dir is not found since there is no explicit path setter for it

maritsandstad commented 5 years ago

I think you are right. Probably tons of other things as well

BTW I doubt the .r extension is to blame for anything. All the magic scripts are .r ...

maritsandstad commented 5 years ago

So, my diagnostic depends on a couple of not so standard R-packages... Should I include their installation in the diagnostic script, or how should I best handle this so that the user gets a functioning version?

bouweandela commented 5 years ago

Please add all required R packages (in alphabetical order) to the file esmvaltool/install/R/r_requirements.txt

jhardenberg commented 5 years ago

Hi folks, have you seen my comment in https://github.com/ESMValGroup/ESMValTool/issues/270#issuecomment-487995350 ? Basically I found out that removing the bounds info from the preprocessed files allows the climdex library to work correctly.

bouweandela commented 5 years ago

Closing this issue, it is a duplicate of #270.

maritsandstad commented 5 years ago

Now I am very confused. Should I consult a different branch? Is someone else doing the climdex porting (that would be wonderful really:-))? Do you want it to start over from scratch? Have you guys been working on some things to port this and then I should work from there, or just stay away?

jhardenberg commented 5 years ago

Hello Marit, actually I was just going to post here about work I did yesterday exactly on this. The fact that Bouwe just closed this issue is by chance ;)

As you will see in #270, there is a need for the ETCCDI calculation in the MAGIC project as well, and there has been some confusion in the past months about who was doing what. Actually, things stopped months ago once we realised that the current climdex library could not correctly read the files produced by the preprocessor (see my comment above). In any case, the news is that in the past days I found a solution for this (see my last post in #270). Yesterday I worked on the code, restarting exactly from your branch, which I copied into branch version2_ExtremeEvents_v2 and modified accordingly, along with other changes needed to make it work in ESMValTool v2.

The current version of version2_ExtremeEvents_v2 actually runs correctly and computes all indices (the plots, however, are untested and still broken).

(The recipe is currently using one EC-Earth dataset which I had available, but I am testing on jasmin and will modify it to something that works also there).

As you mentioned, some additional packages have to be installed in the environment. I will document that in #270.

All in all, I think it is indeed best that we leave this issue closed and continue the discussion in #270.