ESMValGroup / ESMValTool

ESMValTool: A community diagnostic and performance metrics tool for routine evaluation of Earth system models in CMIP
https://www.esmvaltool.org
Apache License 2.0

recipe_perfmetrices crashes due to standard name error #1561

Closed mattiarighi closed 3 years ago

mattiarighi commented 4 years ago
Traceback (most recent call last):
  File "/pf/b/b309057/SOFTWARE/miniconda3/envs/esmvaltool/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/mnt/lustre01/pf/b/b309057/ESMValTool/core/esmvalcore/_task.py", line 683, in _run_task
    output_files = task.run()
  File "/mnt/lustre01/pf/b/b309057/ESMValTool/core/esmvalcore/_task.py", line 238, in run
    self.output_files = self._run(input_files)
  File "/mnt/lustre01/pf/b/b309057/ESMValTool/core/esmvalcore/preprocessor/__init__.py", line 426, in _run
    product.apply(step, self.debug)
  File "/mnt/lustre01/pf/b/b309057/ESMValTool/core/esmvalcore/preprocessor/__init__.py", line 294, in apply
    self.cubes = preprocess(self.cubes, step, **self.settings[step])
  File "/mnt/lustre01/pf/b/b309057/ESMValTool/core/esmvalcore/preprocessor/__init__.py", line 236, in preprocess
    result.append(_run_preproc_function(function, items, settings))
  File "/mnt/lustre01/pf/b/b309057/ESMValTool/core/esmvalcore/preprocessor/__init__.py", line 222, in _run_preproc_function
    return function(items, **kwargs)
  File "/mnt/lustre01/pf/b/b309057/ESMValTool/core/esmvalcore/cmor/fix.py", line 140, in fix_metadata
    cube = checker(cube).check_metadata()
  File "/mnt/lustre01/pf/b/b309057/ESMValTool/core/esmvalcore/cmor/check.py", line 101, in check_metadata
    self._check_var_metadata()
  File "/mnt/lustre01/pf/b/b309057/ESMValTool/core/esmvalcore/cmor/check.py", line 207, in _check_var_metadata
    self._cube.standard_name = self._cmor_var.standard_name
  File "/pf/b/b309057/SOFTWARE/miniconda3/envs/esmvaltool/lib/python3.7/site-packages/iris/_cube_coord_common.py", line 248, in standard_name
    self._standard_name = get_valid_standard_name(name)
  File "/pf/b/b309057/SOFTWARE/miniconda3/envs/esmvaltool/lib/python3.7/site-packages/iris/_cube_coord_common.py", line 84, in get_valid_standard_name
    name))
ValueError: 'atmosphere_optical_thickness_due_to_pm1_ambient_aerosol' is not a valid standard_name

Not sure what is causing the problem. The last successful test with this recipe was on Feb 18.

valeriupredoi commented 4 years ago

perfmetrics is a monster and I would really rather not run it - can you pls tell us which dataset is the offending one (full specs, so I can look it up on Jasmin too; it might be that the actual data file changed in the meantime) :beer:

mattiarighi commented 4 years ago

Investigating... :mag:

The variable should be od550lt1aer.

jvegreg commented 4 years ago

It is a legal standard_name, so it looks like an Iris problem. Which version are you using?

From the latest CF table:

atmosphere_optical_thickness_due_to_pm1_ambient_aerosol_particles
alias: atmosphere_optical_thickness_due_to_pm1_ambient_aerosol

mattiarighi commented 4 years ago

iris                      2.4.0                    py37_0    conda-forge

mattiarighi commented 4 years ago

Another hint:

2020-03-03 11:25:05,987 UTC [10236] ERROR   Failed to run fix_metadata([<iris 'Cube' of Length of average period / (days) (time: 60)>, <iris 'Cube' of Start time for average period / (days since 1860-01-01 00:00:00) (time: 60)>, <iris 'Cube' of Ambient Fine Aerosol Opitical Thickness at 550 nm / (1) (time: 60; latitude: 90; longitude: 144)>, <iris 'Cube' of End time for average period / (days since 1860-01-01 00:00:00) (time: 60)>], {'project': 'CMIP5', 'dataset': 'GFDL-CM3', 'short_name': 'od550lt1aer', 'mip': 'aero', 'frequency': 'mon'})

jvegreg commented 4 years ago

iris 2.4.0 py37_0 conda-forge

It has the alias in its standard_names XML... Iris should accept it

mattiarighi commented 4 years ago

Apparently it is failing to apply a fix to GFDL-CM3

jvegreg commented 4 years ago

It is the automatic one correcting the standard_name when it does not match the CMOR tables. But Iris is complaining it is not legal for some reason

valeriupredoi commented 4 years ago

right, so that damn file has no standard_name and the only way to load it is via long_name - see here (standard_name in the commented-out bit):

import iris
c = iris.load("/badc/cmip5/data/cmip5/output1/NOAA-GFDL/GFDL-CM3/historical/mon/aerosol/aero/r1i1p1/latest/od550lt1aer/od550lt1aer_aero_GFDL-CM3_historical_r1i1p1_200001-200412.nc", "Ambient Fine Aerosol Opitical Thickness at 550 nm")  #, "atmosphere_optical_thickness_due_to_pm1_ambient_aerosol")
print(c[0].standard_name)
None

it has, however, an

invalid_standard_name: atmosphere_optical_thickness_due_to_pm1_ambient_aerosol

and c[0].var_name is indeed od550lt1aer. Data issue, not iris issue :beer:

valeriupredoi commented 4 years ago

BTW where is the entry for od550lt1aer in perfmetrics recipe? Checked out latest master and it ain't there

mattiarighi commented 4 years ago

Line 1847

valeriupredoi commented 4 years ago

Line 1847

nevermind I'm blind :see_no_evil:

jvegreg commented 4 years ago

Data issue, not iris issue

Iris issue because it is a legal standard_name and should not be treated as invalid.

valeriupredoi commented 4 years ago

we battling in legalities @jvegasbsc :grin: is that invalid_standard_name attr set by iris or by the data constructor? Coz if it's the second then it's the constructor (cmorizer/ESGF) at fault

mattiarighi commented 4 years ago

But it used to work, were there changes in the fix files?

valeriupredoi commented 4 years ago

I am actually using iris=2.3 for this case so am saying - not iris the problemo :grin: Playing devil's advocate @bjlittle owes me a :beer:

mattiarighi commented 4 years ago

This means you could reproduce the issue?

valeriupredoi commented 4 years ago

yes, outside esmvaltool, see https://github.com/ESMValGroup/ESMValTool/issues/1561#issuecomment-593936795

jvegreg commented 4 years ago

we battling in legalities @jvegasbsc :grin: is that invalid_standard_name attr set by iris or by the data constructor? Coz if it's the second then it's the constructor (cmorizer/ESGF) at fault

Iris sets that attribute when it detects a standard_name that is not in its valid list, so it is Iris, not the CMORizer, that is setting it. In our case this is wrong, because it is a valid standard_name; search for it here: http://cfconventions.org/Data/cf-standard-names/71/build/cf-standard-name-table.html.

We can also reproduce the error without any file:

>>> import numpy as np
>>> import iris.cube
>>> cube = iris.cube.Cube(np.arange(10))
>>> cube.standard_name = 'atmosphere_optical_thickness_due_to_pm1_ambient_aerosol'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/Earth/jvegas/.conda/envs/esmvaltool/lib/python3.6/site-packages/iris/_cube_coord_common.py", line 206, in standard_name
    self._standard_name = get_valid_standard_name(name)
  File "/home/Earth/jvegas/.conda/envs/esmvaltool/lib/python3.6/site-packages/iris/_cube_coord_common.py", line 58, in get_valid_standard_name
    name))
ValueError: 'atmosphere_optical_thickness_due_to_pm1_ambient_aerosol' is not a valid standard_name
>>> cube.standard_name = 'atmosphere_optical_thickness_due_to_pm1_ambient_aerosol_particles'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/Earth/jvegas/.conda/envs/esmvaltool/lib/python3.6/site-packages/iris/_cube_coord_common.py", line 206, in standard_name
    self._standard_name = get_valid_standard_name(name)
  File "/home/Earth/jvegas/.conda/envs/esmvaltool/lib/python3.6/site-packages/iris/_cube_coord_common.py", line 58, in get_valid_standard_name
    name))
ValueError: 'atmosphere_optical_thickness_due_to_pm1_ambient_aerosol_particles' is not a valid standard_name
>>> cube.standard_name = 'latitude'
>>>

valeriupredoi commented 4 years ago

yep, same here, assigning it to the GFDL-CM3 cube:

Traceback (most recent call last):
  File "test_aer.py", line 7, in <module>
    cc.standard_name = "atmosphere_optical_thickness_due_to_pm1_ambient_aerosol"
  File "/home/users/valeriu/anaconda3R/envs/esmvaltool/lib/python3.7/site-packages/iris/_cube_coord_common.py", line 206, in standard_name
    self._standard_name = get_valid_standard_name(name)
  File "/home/users/valeriu/anaconda3R/envs/esmvaltool/lib/python3.7/site-packages/iris/_cube_coord_common.py", line 58, in get_valid_standard_name
    name))
ValueError: 'atmosphere_optical_thickness_due_to_pm1_ambient_aerosol' is not a valid standard_name

@jvegasbsc is correct and I have to retract my support for iris in this matter :beer:

valeriupredoi commented 4 years ago

funny thing is that name is actually present in STD_NAMES dict in /anaconda3R/envs/esmvaltool/lib/python3.7/site-packages/iris/std_names.py - also, why is that a fixed object (STD_NAMES) and not a collection from the CF library, @bjlittle ? :beer:

valeriupredoi commented 4 years ago

ok, the problem is in iris/_cube_coord_common.py, line 47:

        valid_name_pattern = re.compile(r'''^([a-zA-Z_]+)( *)([a-zA-Z_]*)$''')
        name_groups = valid_name_pattern.match(name)

name_groups comes back as None for valid_name_pattern.match("atmosphere_optical_thickness_due_to_pm1_ambient_aerosol"), because the pattern only allows letters and underscores while the name contains a digit (pm1). So even though it is a valid name, it's discarded right away.
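
A minimal sketch of the failure, using only the standard re module. OLD_PATTERN is copied from the snippet above; RELAXED_PATTERN, KNOWN_NAMES, NAME and is_valid are hypothetical names for illustration. The relaxed pattern is an assumption about one possible fix, not necessarily the change made in Iris.

```python
import re

# Pattern copied from the iris/_cube_coord_common.py snippet quoted above.
OLD_PATTERN = re.compile(r'''^([a-zA-Z_]+)( *)([a-zA-Z_]*)$''')

# Hypothetical relaxed pattern (assumption): also allow digits in the name.
RELAXED_PATTERN = re.compile(r'''^([a-zA-Z_0-9]+)( *)([a-zA-Z_]*)$''')

# Tiny stand-in for the STD_NAMES table; the real one is much larger.
KNOWN_NAMES = {
    'latitude',
    'atmosphere_optical_thickness_due_to_pm1_ambient_aerosol',
}

NAME = 'atmosphere_optical_thickness_due_to_pm1_ambient_aerosol'


def is_valid(name, pattern):
    """Return True if name passes the pattern and is a known name."""
    groups = pattern.match(name)
    if groups is None:
        # Rejected by the regex before the table is ever consulted.
        return False
    return groups.group(1) in KNOWN_NAMES


print(is_valid('latitude', OLD_PATTERN))   # True
print(is_valid(NAME, OLD_PATTERN))         # False
print(is_valid(NAME, RELAXED_PATTERN))     # True
```

The digit in pm1 makes OLD_PATTERN fail before the table lookup happens, which would explain why the name being present in STD_NAMES does not help.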

valeriupredoi commented 4 years ago

https://github.com/SciTools/iris/issues/3677

valeriupredoi commented 4 years ago

@jvegasbsc have a look at https://github.com/SciTools/iris/issues/3677#issuecomment-595177247 and then my cheeky comment https://github.com/SciTools/iris/issues/3677#issuecomment-595180142 - do you mind taking care of it? :beer:

jvegreg commented 4 years ago

@jvegasbsc have a look at SciTools/iris#3677 (comment) and then my cheeky comment SciTools/iris#3677 (comment) - do you mind taking care of it?

Setting an Iris development environment right now...

mattiarighi commented 4 years ago

Any news here?

bjlittle commented 4 years ago

@mattiarighi Just merged @jvegasbsc's PR https://github.com/SciTools/iris/pull/3679 on iris to resolve this issue.

Thanks Javier :beers:

This change will be included in the forthcoming iris v3.0.0 release, which is Python3 only.

We're aiming to get v3.0.0 out this summer... when I know firmer dates I'll be sure to let you guys know. Alternatively just ping me (I don't mind) and ask if you need an ETA.

Cheers :smile:

mattiarighi commented 4 years ago

Thanks @bjlittle

mattiarighi commented 4 years ago

Removed from the milestone, since we will not get Iris 3 before the ESMValTool v2.0 release.

bouweandela commented 3 years ago

I think this problem has been fixed by now. Is any work on this still needed?

valeriupredoi commented 3 years ago

@remi-kazeroni have you seen this happen in @bouweandela's last run of all the recipes from yesterday?

remi-kazeroni commented 3 years ago

The recipe_perfmetrics_CMIP5.yml was not fully run in #2354 since the automatic download didn't work due to the unavailable ESGF node (see comment).

I have made a few attempts to run the recipe enabling the automatic download feature:

So I guess we could uncomment the "Needs Iris3.0" bit to close this issue. But I'm not sure what to do with the error related to the mmm.

zklaus commented 3 years ago

@remi-kazeroni, ESMValGroup/ESMValCore#1372 has been resolved. Could you try again? Don't forget to re-create the environment to get the second release candidate of the core.

remi-kazeroni commented 3 years ago

@remi-kazeroni, ESMValGroup/ESMValCore#1372 has been resolved. Could you try again? Don't forget to re-create the environment to get the second release candidate of the core.

I have just recreated an environment with ESMValCore 2.4.0rc2 and rerun the od550lt1aer bit of the recipe_perfmetrics_CMIP5.yml successfully. I can make a PR to uncomment the "Needs Iris3" section of the recipe_perfmetrics_CMIP5.yml in order to close this issue. In this PR, we could also uncomment this line of the examples/recipe_check_obs.yml related to od550lt1aer. I have just tested that successfully with the latest environment.

zklaus commented 3 years ago

Sounds good!