ESMValGroup / ESMValCore

ESMValCore: A community tool for pre-processing data from Earth system models in CMIP and running analysis scripts.
https://www.esmvaltool.org
Apache License 2.0

Problem preprocessing ACCESS1-3 and IPSL-CM5A-LR piControl data #58

Closed ValerioLembo closed 4 years ago

ValerioLembo commented 6 years ago

This issue is related to porting the diagnostic tool for thermodynamics into version 2.0 of ESMValTool. Progress is tracked in this branch.

EDIT: the recipe has been ported, but there are remaining issues with preprocessing piControl data for the ACCESS1-3 and IPSL-CM5A-LR models.

mattiarighi commented 6 years ago

Thank you @ValerioLembo. Can you also please update #503, adding your name and the link to this issue where appropriate? :+1:

ValerioLembo commented 6 years ago

Done!

ValerioLembo commented 6 years ago

I have a few questions about the usage of the diagnostics in v. 2.0:

Thanks for helping!

mattiarighi commented 6 years ago

The paths are specified in config-user.yml, in combination with config-developer.yml. The latter, however, should not be changed, unless you want to add a directory structure (aka drs) which is not already defined. Have a look at this presentation for more details.

The new recipe format is quite flexible and allows you to specify mip at different levels (either for the variable or for the dataset). Check the existing recipes first, and if you do not find a solution to your problem there, please open a dedicated issue.

The diagnostic specific settings can now be passed directly via the recipe, under the scripts dictionary. The presentation above shows some examples, or you can again check the existing recipes.

Feel free to ask again if you have more questions.

ValerioLembo commented 6 years ago

I managed to control the input/output directories through config-user.yml.

As for the scripts dictionary, it seems to me from the examples that it does not allow managing custom flags, such as local options (defined by the developer), e.g. those used to include or exclude part of the code.

On top of everything, I do not understand why I cannot enter the main script; it reads the preamble, then goes directly to the end with no error. The only things that I find in the log.txt file are:

/home/zmaw/u234097/conda-envs/esmvaltool/lib/python2.7/site-packages/iris/fileformats/grib/__init__.py:59: IrisDeprecation: The module iris.fileformats.grib is deprecated since v1.10. Please install the package 'iris_grib' package instead.
  "The module iris.fileformats.grib is deprecated since v1.10. "
/home/zmaw/u234097/conda-envs/esmvaltool/lib/python2.7/site-packages/matplotlib/cbook/deprecation.py:107: MatplotlibDeprecationWarning: The mpl_toolkits.axes_grid module was deprecated in version 2.1. Use mpl_toolkits.axes_grid1 and mpl_toolkits.axisartist provies the same functionality instead.
  warnings.warn(message, mplDeprecation, stacklevel=1)
Could not load xarray

I am sure that I am lacking understanding of some very trivial issue...

One last thing: I set up the logger like this: logger = logging.getLogger(os.path.basename(__file__)), but nothing seems to be written to the log, even if I call logger in the preamble...

mattiarighi commented 6 years ago

@valeriupredoi can you help?

bouweandela commented 6 years ago

I had a *.conf script where I used to put some options to be specified by the user. Is that still possible?

Yes, but please use a yaml file instead.

Regarding the installation issues, did you follow the instructions here? https://github.com/ESMValGroup/ESMValTool/blob/version2_development/README.md

Regarding the logging issue: by default the log level is set to info, so only messages that you write with logger.info("Some message"), logger.warning("Some message"), logger.error("Some message") are logged, not logger.debug("Some message").
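The effect of the log level can be reproduced with the standard library alone. The following is a minimal sketch, not ESMValTool's actual logging setup: the logger name `my_diagnostic` and the in-memory handler are assumptions made purely for illustration.

```python
import io
import logging

# Hypothetical logger name; a diagnostic would typically use its file name.
logger = logging.getLogger("my_diagnostic")
logger.propagate = False

# Write to an in-memory stream so the result is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
logger.addHandler(handler)

# Mimic a log level of "info": DEBUG messages are dropped.
logger.setLevel(logging.INFO)

logger.debug("Some message")    # filtered out
logger.info("Some message")     # logged
logger.warning("Some message")  # logged
logger.error("Some message")    # logged

logged_levels = [line.split()[0] for line in stream.getvalue().splitlines()]
print(logged_levels)
```

Only the INFO, WARNING and ERROR lines reach the stream, which matches the behaviour described above.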

As for the scripts dictionary, it seems to me from the examples that it does not allow to manage custom flags, such as local options (defined by the developer) e.g. those to include or exclude part of a code.

You can add any custom key/value pairs to the scripts section and they will be passed on to your diagnostic, see e.g. https://github.com/ESMValGroup/ESMValTool/blob/version2_development/esmvaltool/recipes/examples/recipe_python.yml Only the script: my_diagnostic.py key/value pair is mandatory.

Make sure to start your diagnostic with the function esmvaltool.diag_scripts.shared.run_diagnostic so it sets up logging and picks up the configuration.
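As a rough illustration of how custom flags from the scripts section could switch parts of a diagnostic on or off, here is a minimal sketch. The flag names (`wat`, `lec`, `entr`, `lsm`, `met`) are borrowed from the recipe posted later in this thread; the default values, the helper `enabled_steps` and the stand-in configuration dictionary are assumptions and not part of ESMValTool.

```python
# Hypothetical defaults for developer-defined flags (values are assumptions).
DEFAULTS = {"wat": False, "lec": False, "entr": False, "lsm": False, "met": 1}


def enabled_steps(cfg):
    """Return the names of the optional code sections switched on in cfg."""
    settings = dict(DEFAULTS)
    # Only pick up the keys we know about; cfg may carry other entries too.
    settings.update({key: cfg[key] for key in DEFAULTS if key in cfg})
    return [name for name in ("wat", "lec", "entr", "lsm") if settings[name]]


def main(cfg):
    for step in enabled_steps(cfg):
        print("running optional section:", step)


if __name__ == "__main__":
    # In a real diagnostic the dictionary would be provided by ESMValTool,
    # along the lines of:
    #     from esmvaltool.diag_scripts.shared import run_diagnostic
    #     with run_diagnostic() as cfg:
    #         main(cfg)
    # A hand-written stand-in is used here so the sketch runs on its own.
    main({"script": "my_diagnostic.py", "wat": True, "lec": False, "entr": True})
```

Keeping the defaults in one dictionary means a recipe only has to list the flags that differ from them.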

ValerioLembo commented 6 years ago

Regarding the installation issues, did you follow the instructions here? https://github.com/ESMValGroup/ESMValTool/blob/version2_development/README.md

Actually I followed the instructions on the Technical Overview linked from the webpage for porting to Version 2.0.

Make sure to start your diagnostic with the function esmvaltool.diag_scripts.shared.run_diagnostic so it sets up logging and picks up the configuration.

Apparently, this was the reason why I was not entering the diagnostic. Now I finally have my hands on the code.

bouweandela commented 6 years ago

Those installation instructions do not look very accurate, I have opened an issue for it https://github.com/ESMValGroup/ESMValTool/issues/653. The information in that install.rst document was somewhat inaccurate: I'm not sure what the current status is, but we have an open issue about that too https://github.com/ESMValGroup/ESMValTool/issues/446.

ValerioLembo commented 6 years ago

I have one more question and a possible bug in the preprocessor to report.

The question is: is there a way to load one land-sea mask for every model, rather than masking each of the initial fields? I would rather pass the path to the land-sea mask of each model I analyse to the diagnostic.

In v. 1.0 I used the commands:

`for model in project_info['MODELS']:
    currProject = getattr(projects, model.split_entries()[0])()
    currProject.get_cf_lmaskfile(project_info, model)`

Is there something equivalent in v. 2.0?

As for the possible bug, I preprocessed an MPI-ESM-LR tas field, just subsetting 5 years out of 150. This is what I get if I run cdo sinfon on the file:

  File format : NetCDF4
    -1 : Institut Source   T Steptype Levels Num    Points Num Dtype : Parameter name
     1 : MPIMET   MPI-ESM-LR v instant       1   1     18432   1  F32  : tas
     2 : MPIMET   MPI-ESM-LR v instant       1   1         1   2  F64  : year
   Grid coordinates :
     1 : gaussian                 : points=18432 (192x96)  np=48
                              lon : 0 to 358.125 by 1.875 degrees_east  circular
                              lat : -88.57217 to 88.57217 degrees_north
                        available : cellbounds
     2 : generic                  : points=1
   Vertical coordinates :
     1 : height                   : levels=1  scalar
                           height : 2 m
   Time coordinate :  72 steps
     RefTime =  1950-01-01 00:00:00  Units = days  Calendar = proleptic_gregorian  Bounds = true
  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss
  1990-01-16 12:00:00  1990-02-15 00:00:00  1990-03-16 12:00:00  1990-04-16 00:00:00
  1990-05-16 12:00:00  1990-06-16 00:00:00  1990-07-16 12:00:00  1990-08-16 12:00:00
  1990-09-16 00:00:00  1990-10-16 12:00:00  1990-11-16 00:00:00  1990-12-16 12:00:00
  1991-01-16 12:00:00  1991-02-15 00:00:00  1991-03-16 12:00:00  1991-04-16 00:00:00
  1991-05-16 12:00:00  1991-06-16 00:00:00  1991-07-16 12:00:00  1991-08-16 12:00:00
  1991-09-16 00:00:00  1991-10-16 12:00:00  1991-11-16 00:00:00  1991-12-16 12:00:00
  1992-01-16 12:00:00  1992-02-15 12:00:00  1992-03-16 12:00:00  1992-04-16 00:00:00
  1992-05-16 12:00:00  1992-06-16 00:00:00  1992-07-16 12:00:00  1992-08-16 12:00:00
  1992-09-16 00:00:00  1992-10-16 12:00:00  1992-11-16 00:00:00  1992-12-16 12:00:00
  1993-01-16 12:00:00  1993-02-15 00:00:00  1993-03-16 12:00:00  1993-04-16 00:00:00
  1993-05-16 12:00:00  1993-06-16 00:00:00  1993-07-16 12:00:00  1993-08-16 12:00:00
  1993-09-16 00:00:00  1993-10-16 12:00:00  1993-11-16 00:00:00  1993-12-16 12:00:00
  1994-01-16 12:00:00  1994-02-15 00:00:00  1994-03-16 12:00:00  1994-04-16 00:00:00
  1994-05-16 12:00:00  1994-06-16 00:00:00  1994-07-16 12:00:00  1994-08-16 12:00:00
  1994-09-16 00:00:00  1994-10-16 12:00:00  1994-11-16 00:00:00  1994-12-16 12:00:00
  1995-01-16 12:00:00  1995-02-15 00:00:00  1995-03-16 12:00:00  1995-04-16 00:00:00
  1995-05-16 12:00:00  1995-06-16 00:00:00  1995-07-16 12:00:00  1995-08-16 12:00:00
  1995-09-16 00:00:00  1995-10-16 12:00:00  1995-11-16 00:00:00  1995-12-16 12:00:00
cdo sinfon: Processed 2 variables over 72 timesteps [0.02s 20MB]

whereas the same preprocessor on the ts field gives:

  File format : NetCDF4
    -1 : Institut Source   T Steptype Levels Num    Points Num Dtype : Parameter name
     1 : MPIMET   MPI-ESM-LR v instant       1   1     18432   1  F32  : ts            
   Grid coordinates :
     1 : gaussian                 : points=18432 (192x96)  np=48
                              lon : 0 to 358.125 by 1.875 degrees_east  circular
                              lat : -88.57217 to 88.57217 degrees_north
                        available : cellbounds
   Vertical coordinates :
     1 : surface                  : levels=1
   Time coordinate :  72 steps
     RefTime =  1950-01-01 00:00:00  Units = days  Calendar = proleptic_gregorian  Bounds = true
  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss
  1990-01-16 12:00:00  1990-02-15 00:00:00  1990-03-16 12:00:00  1990-04-16 00:00:00
  1990-05-16 12:00:00  1990-06-16 00:00:00  1990-07-16 12:00:00  1990-08-16 12:00:00
  1990-09-16 00:00:00  1990-10-16 12:00:00  1990-11-16 00:00:00  1990-12-16 12:00:00
  1991-01-16 12:00:00  1991-02-15 00:00:00  1991-03-16 12:00:00  1991-04-16 00:00:00
  1991-05-16 12:00:00  1991-06-16 00:00:00  1991-07-16 12:00:00  1991-08-16 12:00:00
  1991-09-16 00:00:00  1991-10-16 12:00:00  1991-11-16 00:00:00  1991-12-16 12:00:00
  1992-01-16 12:00:00  1992-02-15 12:00:00  1992-03-16 12:00:00  1992-04-16 00:00:00
  1992-05-16 12:00:00  1992-06-16 00:00:00  1992-07-16 12:00:00  1992-08-16 12:00:00
  1992-09-16 00:00:00  1992-10-16 12:00:00  1992-11-16 00:00:00  1992-12-16 12:00:00
  1993-01-16 12:00:00  1993-02-15 00:00:00  1993-03-16 12:00:00  1993-04-16 00:00:00
  1993-05-16 12:00:00  1993-06-16 00:00:00  1993-07-16 12:00:00  1993-08-16 12:00:00
  1993-09-16 00:00:00  1993-10-16 12:00:00  1993-11-16 00:00:00  1993-12-16 12:00:00
  1994-01-16 12:00:00  1994-02-15 00:00:00  1994-03-16 12:00:00  1994-04-16 00:00:00
  1994-05-16 12:00:00  1994-06-16 00:00:00  1994-07-16 12:00:00  1994-08-16 12:00:00
  1994-09-16 00:00:00  1994-10-16 12:00:00  1994-11-16 00:00:00  1994-12-16 12:00:00
  1995-01-16 12:00:00  1995-02-15 00:00:00  1995-03-16 12:00:00  1995-04-16 00:00:00
  1995-05-16 12:00:00  1995-06-16 00:00:00  1995-07-16 12:00:00  1995-08-16 12:00:00
  1995-09-16 00:00:00  1995-10-16 12:00:00  1995-11-16 00:00:00  1995-12-16 12:00:00
cdo sinfon: Processed 1 variable over 72 timesteps [0.05s 20MB]

So apparently a variable called year, which was not in the file before preprocessing, appears in the file for tas. This is not the case for ts. It happens regardless of the model. The original fields come from the DKRZ ESGF node.

When I try to combine the two files (tas and ts) with a cdo command, it treats the two files as having different grids.

Does anyone know why this happens?

ValerioLembo commented 5 years ago

Dear all,

I am having trouble testing the diagnostic with a few CMIP5 models in the piControl scenario.

It seems those problems are related to the way time coordinates are treated.

One model is IPSL-CM5A-LR. I got this error in the preprocessing phase:

Traceback (most recent call last):
  File "/home/zmaw/u234097/conda-envs/esmvaltool/lib/python2.7/site-packages/ESMValTool-2.0a1-py2.7.egg/esmvaltool/_main.py", line 215, in run
    conf = main(args)
  File "/home/zmaw/u234097/conda-envs/esmvaltool/lib/python2.7/site-packages/ESMValTool-2.0a1-py2.7.egg/esmvaltool/_main.py", line 143, in main
    process_recipe(recipe_file=recipe, config_user=cfg)
  File "/home/zmaw/u234097/conda-envs/esmvaltool/lib/python2.7/site-packages/ESMValTool-2.0a1-py2.7.egg/esmvaltool/_main.py", line 193, in process_recipe
    recipe.run()
  File "/home/zmaw/u234097/conda-envs/esmvaltool/lib/python2.7/site-packages/ESMValTool-2.0a1-py2.7.egg/esmvaltool/_recipe.py", line 1027, in run
    self.tasks, max_parallel_tasks=self._cfg['max_parallel_tasks'])
  File "/home/zmaw/u234097/conda-envs/esmvaltool/lib/python2.7/site-packages/ESMValTool-2.0a1-py2.7.egg/esmvaltool/_task.py", line 485, in run_tasks
    _run_tasks_parallel(tasks, max_parallel_tasks)
  File "/home/zmaw/u234097/conda-envs/esmvaltool/lib/python2.7/site-packages/ESMValTool-2.0a1-py2.7.egg/esmvaltool/_task.py", line 530, in _run_tasks_parallel
    task.output_files = result.get()
  File "/home/zmaw/u234097/conda-envs/esmvaltool/lib/python2.7/multiprocessing/pool.py", line 572, in get
    raise self._value
ValueError: Time units with interval of "months", "years" (or singular of these) cannot be processed, got 'months'.

The other models are ACCESS1-0 and ACCESS1-3, and specifically the preprocessing of their daily data. In this case I get this error:

2018-12-02 15:23:46,962 UTC [45168] ERROR Program terminated abnormally, see stack trace below for more information
Traceback (most recent call last):
  File "/home/zmaw/u234097/conda-envs/esmvaltool/lib/python2.7/site-packages/ESMValTool-2.0a1-py2.7.egg/esmvaltool/_main.py", line 215, in run
    conf = main(args)
  File "/home/zmaw/u234097/conda-envs/esmvaltool/lib/python2.7/site-packages/ESMValTool-2.0a1-py2.7.egg/esmvaltool/_main.py", line 143, in main
    process_recipe(recipe_file=recipe, config_user=cfg)
  File "/home/zmaw/u234097/conda-envs/esmvaltool/lib/python2.7/site-packages/ESMValTool-2.0a1-py2.7.egg/esmvaltool/_main.py", line 193, in process_recipe
    recipe.run()
  File "/home/zmaw/u234097/conda-envs/esmvaltool/lib/python2.7/site-packages/ESMValTool-2.0a1-py2.7.egg/esmvaltool/_recipe.py", line 1027, in run
    self.tasks, max_parallel_tasks=self._cfg['max_parallel_tasks'])
  File "/home/zmaw/u234097/conda-envs/esmvaltool/lib/python2.7/site-packages/ESMValTool-2.0a1-py2.7.egg/esmvaltool/_task.py", line 485, in run_tasks
    _run_tasks_parallel(tasks, max_parallel_tasks)
  File "/home/zmaw/u234097/conda-envs/esmvaltool/lib/python2.7/site-packages/ESMValTool-2.0a1-py2.7.egg/esmvaltool/_task.py", line 530, in _run_tasks_parallel
    task.output_files = result.get()
  File "/home/zmaw/u234097/conda-envs/esmvaltool/lib/python2.7/multiprocessing/pool.py", line 572, in get
    raise self._value
ValueError: day is out of range for month

As far as I understood from other issues, a fix has to be applied that is specific to a model (but does it depend on the experiment as well? I am using ACCESS1-x and IPSL-CM5A-LR in historical and rcp85 settings). Is there a protocol for applying these changes? I guess they should be shared across all branches, once applied and effective...

ValerioLembo commented 5 years ago

@jvegasbsc can maybe help in solving this issue?

valeriupredoi commented 5 years ago

@ValerioLembo I am a bit confused about these errors: one thing I can tell you is that daily data is not fully supported (I have never tried to run with daily data myself) - as to monthly data issues they shouldn't happen if you use the latest version of CMIP5 files, I don't have access to the DKRZ ESGF node but it should be fully mirrored to the BADC one so if you can post or link the recipe then I can have a quick test using BADC data on CEDA-Jasmin

ValerioLembo commented 5 years ago

@valeriupredoi here is the recipe. I commented all the models and scenarios except the ones giving me troubles... Thanks for trying!

valeriupredoi commented 5 years ago

cheers @ValerioLembo - here is the full stack error:

2018-12-03 17:19:49,456 UTC [1595] ERROR   Program terminated abnormally, see stack trace below for more information
Traceback (most recent call last):
  File "/home/users/valeriu/anaconda3/envs/esmvaltool_v2_dev/lib/python3.6/site-packages/ESMValTool-2.0a1-py3.6.egg/esmvaltool/_main.py", line 215, in run
    conf = main(args)
  File "/home/users/valeriu/anaconda3/envs/esmvaltool_v2_dev/lib/python3.6/site-packages/ESMValTool-2.0a1-py3.6.egg/esmvaltool/_main.py", line 143, in main
    process_recipe(recipe_file=recipe, config_user=cfg)
  File "/home/users/valeriu/anaconda3/envs/esmvaltool_v2_dev/lib/python3.6/site-packages/ESMValTool-2.0a1-py3.6.egg/esmvaltool/_main.py", line 193, in process_recipe
    recipe.run()
  File "/home/users/valeriu/anaconda3/envs/esmvaltool_v2_dev/lib/python3.6/site-packages/ESMValTool-2.0a1-py3.6.egg/esmvaltool/_recipe.py", line 1027, in run
    self.tasks, max_parallel_tasks=self._cfg['max_parallel_tasks'])
  File "/home/users/valeriu/anaconda3/envs/esmvaltool_v2_dev/lib/python3.6/site-packages/ESMValTool-2.0a1-py3.6.egg/esmvaltool/_task.py", line 483, in run_tasks
    _run_tasks_sequential(tasks)
  File "/home/users/valeriu/anaconda3/envs/esmvaltool_v2_dev/lib/python3.6/site-packages/ESMValTool-2.0a1-py3.6.egg/esmvaltool/_task.py", line 494, in _run_tasks_sequential
    task.run()
  File "/home/users/valeriu/anaconda3/envs/esmvaltool_v2_dev/lib/python3.6/site-packages/ESMValTool-2.0a1-py3.6.egg/esmvaltool/_task.py", line 184, in run
    input_files.extend(task.run())
  File "/home/users/valeriu/anaconda3/envs/esmvaltool_v2_dev/lib/python3.6/site-packages/ESMValTool-2.0a1-py3.6.egg/esmvaltool/_task.py", line 185, in run
    self.output_files = self._run(input_files)
  File "/home/users/valeriu/anaconda3/envs/esmvaltool_v2_dev/lib/python3.6/site-packages/ESMValTool-2.0a1-py3.6.egg/esmvaltool/preprocessor/__init__.py", line 284, in _run
    input_files, self.settings, self.order, debug=self.debug)
  File "/home/users/valeriu/anaconda3/envs/esmvaltool_v2_dev/lib/python3.6/site-packages/ESMValTool-2.0a1-py3.6.egg/esmvaltool/preprocessor/__init__.py", line 209, in preprocess_multi_model
    debug)
  File "/home/users/valeriu/anaconda3/envs/esmvaltool_v2_dev/lib/python3.6/site-packages/ESMValTool-2.0a1-py3.6.egg/esmvaltool/preprocessor/__init__.py", line 246, in preprocess
    result.append(function(item, **args))
  File "/home/users/valeriu/anaconda3/envs/esmvaltool_v2_dev/lib/python3.6/site-packages/ESMValTool-2.0a1-py3.6.egg/esmvaltool/cmor/fix.py", line 93, in fix_metadata
    checker(cube).check_metadata()
  File "/home/users/valeriu/anaconda3/envs/esmvaltool_v2_dev/lib/python3.6/site-packages/ESMValTool-2.0a1-py3.6.egg/esmvaltool/cmor/check.py", line 103, in check_metadata
    self._check_time_coord()
  File "/home/users/valeriu/anaconda3/envs/esmvaltool_v2_dev/lib/python3.6/site-packages/ESMValTool-2.0a1-py3.6.egg/esmvaltool/cmor/check.py", line 411, in _check_time_coord
    calendar=coord.units.calendar))
  File "/home/users/valeriu/anaconda3/envs/esmvaltool_v2_dev/lib/python3.6/site-packages/iris/coords.py", line 764, in convert_units
    self.bounds = self.units.convert(self.bounds, unit)
  File "/home/users/valeriu/anaconda3/envs/esmvaltool_v2_dev/lib/python3.6/site-packages/cf_units/__init__.py", line 2027, in convert
    result = ut2.date2num(ut1.num2date(result))
  File "netcdftime/_netcdftime.pyx", line 887, in netcdftime._netcdftime.utime.num2date
  File "netcdftime/_netcdftime.pyx", line 396, in netcdftime._netcdftime.DateFromJulianDay
ValueError: day is out of range for month

and this occurs for ACCESS1-3: when trying to convert from PROLEPTIC_GREGORIAN days since 0001-01-01 to JULIAN days since 1950-01-01 (this is done internally by ESMValTool as part of the CMOR checks). looking at your recipe - indeed I see piControl data from 386 to 405 years - @jvegasbsc can cf_units actually convert Gregorian to Julian with negative dates (to comply with the 1950 start)?

ValerioLembo commented 5 years ago

@valeriupredoi so IPSL-CM5A-LR is preprocessed correctly, in your case?

valeriupredoi commented 5 years ago

no @ValerioLembo I haven't tried the other files, I concentrated only on ACCESS1-3 to see what the heck is wrong with it.

@jvegasbsc I pinpointed the source of trouble: it is check.py at line 411 (note that fix_metadata works fine without the check step). That line converts the current cube time coordinate to the CMOR standard coordinate with units days since 1950; the problem is that the cube at hand has a proleptic_gregorian calendar, and converting a proleptic gregorian date from before the reference date fails (I tried both iris 1.13 and 2.2). Note that it works fine if the calendar is 360_day, i.e. the time points simply become negative (negative time axis before 1950). In fact, proleptic_gregorian is a real pain: even if you try to convert said axis from proleptic_gregorian to 360_day or standard (gregorian) with the same unit days since 0001-01-01, cf_units still fails. So I would propose a physical conversion, i.e. datetime operations within the cube time axis, instead of relying on the cf_units conversion. What say you?
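To make the proposal of datetime operations concrete, here is a minimal sketch of re-expressing time points against a new reference date with plain date arithmetic. It is not the ESMValTool implementation: the helper name `rebase_days` and the example time points are hypothetical, and the sketch only shifts the reference date within the proleptic Gregorian calendar (the one Python's `datetime` module implements). It does not convert between calendars.

```python
from datetime import date


def rebase_days(points, old_ref, new_ref):
    """Re-express 'days since old_ref' values as 'days since new_ref'.

    Both reference dates are interpreted in the proleptic Gregorian
    calendar, which is what Python's datetime module implements.
    """
    shift = (new_ref - old_ref).days
    return [point - shift for point in points]


old_ref = date(1, 1, 1)      # days since 0001-01-01
new_ref = date(1950, 1, 1)   # days since 1950-01-01

# Hypothetical time points: noon on 16 January of years 386 and 387.
points = [
    (date(386, 1, 16) - old_ref).days + 0.5,
    (date(387, 1, 16) - old_ref).days + 0.5,
]

rebased = rebase_days(points, old_ref, new_ref)
print(rebased)  # negative values, since these dates lie before 1950
```

Because only a constant offset is subtracted, the spacing between time points is preserved and dates before the new reference simply become negative.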

ValerioLembo commented 5 years ago

Hi @valeriupredoi and @jvegasbsc, has there been any progress with the issue of preprocessing ACCESS1-3 and IPSL-CM5A-LR?

valeriupredoi commented 5 years ago

Unfortunately not on my side, was hoping Javier will write a fix file, the issue is the one I reported in my last post

ValerioLembo commented 5 years ago

Ok then let us wait for @jvegasbsc

bouweandela commented 5 years ago

Hi Valerio, I just had a look at some of your code. To avoid disappointment and a lot of additional work later, I think it would be a really good idea if you could read the PEP8 Python style guide, because that is the kind of coding style we try to adhere to in the esmvaltool project. For a start, I recommend chopping up your code into smaller functions. You can run the prospector tool locally to check how well you are adhering to the standards (just run pip install prospector if you do not have it available yet, see also README.md). Another option is to open a pull request early on the public repository (it is no problem to do that long before you are ready, you can just keep working and indicate in the text when you are ready); that way Codacy will analyse your code.

ValerioLembo commented 5 years ago

Hi @bouweandela Thanks for looking into the code. I already had in mind to clean the code a little bit, because it is a bit messy. I would gladly send the pull request in the version2_development branch, if that is not a problem for anyone. The nested CDO Python bindings are meant to reduce the number of lines. These I would not touch, but all the rest can be made shorter, for sure.

bouweandela commented 5 years ago

@ValerioLembo Now that your pull request has been merged, can this issue be closed?

ValerioLembo commented 5 years ago

I do not know if @valeriupredoi and @jvegasbsc have finally addressed that problem with ACCESS1-3 and IPSL-CM5A-LR. It completely fell off my radar... Anyway, it is not necessarily related to my diagnostic tool, it is more an issue of the preprocessor...

bouweandela commented 5 years ago

I've changed the issue title so it is more clear what needs to be fixed.

valeriupredoi commented 4 years ago

@ValerioLembo does this need revisiting or has it been solved for you in the meantime? :beer:

ValerioLembo commented 4 years ago

@valeriupredoi I made a quick test and it seems to me that the problem with ACCESS1-3 is solved, whereas the one with IPSL-CM5A-LR still persists...

zklaus commented 4 years ago

This issue deals with many problems. I am not sure if a dataset error is still one of them (one problem might be that 'months' is not an acceptable unit in the CF sense). @valeriupredoi could you please figure out the situation, probably with @ValerioLembo and file one or two issues using our dataset problem template if this is still relevant?
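For readers wondering what a dataset fix for a 'months' time unit might involve, the following is a minimal sketch of turning whole-month offsets into day offsets. It is an assumption-laden illustration, not an ESMValTool fix: the helper `months_to_days` is hypothetical, it assumes the offsets are whole months placed on the first day of each month, and it uses Python's proleptic Gregorian calendar, which may differ from the calendar of the model (for example a no-leap calendar).

```python
from datetime import date


def months_to_days(month_offsets, ref):
    """Convert whole 'months since ref' offsets to 'days since ref'.

    Each offset is mapped to the first day of the corresponding month.
    """
    days = []
    for offset in month_offsets:
        total = (ref.month - 1) + int(offset)
        year = ref.year + total // 12
        month = total % 12 + 1
        days.append((date(year, month, 1) - ref).days)
    return days


# Hypothetical reference date and offsets, for illustration only.
ref = date(1850, 1, 1)
print(months_to_days([0, 1, 2, 12], ref))
```

The point of the sketch is that month lengths vary, so a 'months since' axis has no fixed conversion factor to days and has to be rebuilt date by date.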

valeriupredoi commented 4 years ago

Yep, will be on it today :beer:

valeriupredoi commented 4 years ago

@ValerioLembo - I ran this toy recipe:

datasets:
  - {dataset: IPSL-CM5A-LR, project: CMIP5, exp: piControl, ensemble: r1i1p1, start_year: 1990, end_year: 2000}

diagnostics:
  Thermodyn_Diag:
    description: Thermodynamics diagnostics
    variables:
      ta:
        mip: day
      sftlf:
        mip: fx
      ua:
        mip: day
      va:
        mip: day
      wap:
        mip: day
    scripts:
      Thermodyn_Diag:
        script: thermodyn_diagtool/thermodyn_diagnostics.py
        wat: true
        lec: true
        entr: true
        met: 3
        lsm: true

and noticed no problem in the preprocessor, there was a problem in the diagnostic but not related to data structures - can you pls tell me which data spec for IPSL-CM5A-LR is still problematic? :beer:

ValerioLembo commented 4 years ago

@valeriupredoi The issue is with the monthly mean data, e.g. hfls...

valeriupredoi commented 4 years ago

like this? This one went fine too, preprocessor-wise:

datasets:
  - {dataset: IPSL-CM5A-LR, project: CMIP5, exp: piControl, ensemble: r1i1p1, start_year: 2370, end_year: 2390}

diagnostics:
  Thermodyn_Diag:
    description: Thermodynamics diagnostics
    variables:
      hfls:
        mip: Amon
      hfss:
        mip: Amon
    scripts:
      Thermodyn_Diag:
        script: thermodyn_diagtool/thermodyn_diagnostics.py
        wat: true
        lec: true
        entr: true
        met: 3
        lsm: true

valeriupredoi commented 4 years ago

...and this one has no preprocessing issues either - @ValerioLembo can you please post a recipe snippet that is failing for you at preprocessor stage, and also add the full trace please :beer:

datasets:
  - {dataset: IPSL-CM5A-LR, project: CMIP5, exp: piControl, ensemble: r1i1p1, start_year: 1995, end_year: 2000}

diagnostics:
  Thermodyn_Diag:
    description: Thermodynamics diagnostics
    variables:
      hfls:
        mip: day
      hfss:
        mip: day
    scripts:
      Thermodyn_Diag:
        script: thermodyn_diagtool/thermodyn_diagnostics.py
        wat: true
        lec: true
        entr: true
        met: 3
        lsm: true

ValerioLembo commented 4 years ago

Hi there,

the problem with IPSL-CM5A-LR was related to an old version of the hfls dataset I had on my local repository. I changed that and it seems to work fine.

Unfortunately I cannot test the whole diagnostics, because there are some missing datasets on the DKRZ repository for piControl...

valeriupredoi commented 4 years ago

cool, cheers for checking @ValerioLembo - guess it's okay to close this for now and open another one if need be, if another dataset is broken :beer: