ESMValGroup / ESMValTool

ESMValTool: A community diagnostic and performance metrics tool for routine evaluation of Earth system models in CMIP
https://www.esmvaltool.org
Apache License 2.0

Cmorizer for the ESACCI-OC dataset is broken #2323

Closed remi-kazeroni closed 2 years ago

remi-kazeroni commented 3 years ago

Describe the bug

While testing the cmorizers in #1657, I realized that the ESACCI-OC one is broken because the time coordinate is non-monotonic:

2021-09-28 14:24:24,718 UTC [36665] WARNING /work/bd0854/b309192/soft/miniconda3/envs/tool/lib/python3.9/site-packages/iris/fileformats/_pyke_rules/compiled_krb/fc_rules_cf_fc.py:2383: UserWarning: Failed to create 'time' dimension coordinate: The 'time' DimCoord points array must be strictly monotonic.
Gracefully creating 'time' auxiliary coordinate instead.
  warnings.warn(msg.format(name=str(cf_coord_var.cf_name),

2021-09-28 14:24:24,723 UTC [36665] INFO    Fixing latitude...
2021-09-28 14:24:24,794 UTC [36665] INFO    Fixing longitude...
Traceback (most recent call last):
  File "/work/bd0854/b309192/soft/miniconda3/envs/tool/bin/cmorize_obs", line 33, in <module>
    sys.exit(load_entry_point('ESMValTool', 'console_scripts', 'cmorize_obs')())
  File "/mnt/lustre01/pf/b/b309192/ESMValTool/esmvaltool/cmorizers/obs/cmorize_obs.py", line 365, in main
    _cmor_reformat(config_user, obs_list)
  File "/mnt/lustre01/pf/b/b309192/ESMValTool/esmvaltool/cmorizers/obs/cmorize_obs.py", line 424, in _cmor_reformat
    _run_pyt_script(in_data_dir, out_data_dir, dataset, config)
  File "/mnt/lustre01/pf/b/b309192/ESMValTool/esmvaltool/cmorizers/obs/cmorize_obs.py", line 314, in _run_pyt_script
    module.cmorization(in_dir, out_dir, cmor_cfg, user_cfg)
  File "/mnt/lustre01/pf/b/b309192/ESMValTool/esmvaltool/cmorizers/obs/cmorize_obs_esacci_oc.py", line 170, in cmorization
    extract_variable(var_info, raw_info, out_dir, glob_attrs)
  File "/mnt/lustre01/pf/b/b309192/ESMValTool/esmvaltool/cmorizers/obs/cmorize_obs_esacci_oc.py", line 70, in extract_variable
    fix_coords(cube)
  File "/mnt/lustre01/pf/b/b309192/ESMValTool/esmvaltool/cmorizers/obs/utilities.py", line 178, in fix_coords
    cube_coord.points = \
  File "/work/bd0854/b309192/soft/miniconda3/envs/tool/lib/python3.9/site-packages/iris/coords.py", line 1438, in points
    self._values = points
  File "/work/bd0854/b309192/soft/miniconda3/envs/tool/lib/python3.9/site-packages/iris/coords.py", line 2547, in _values
    self._new_points_requirements(points)
  File "/work/bd0854/b309192/soft/miniconda3/envs/tool/lib/python3.9/site-packages/iris/coords.py", line 2535, in _new_points_requirements
    raise ValueError(emsg.format(self.name(), self.__class__.__name__))
ValueError: The 'longitude' DimCoord points array must be strictly monotonic.

I guess this may not have been an issue at the time this cmorizer was released. @tomaslovato: could you please have a look since you authored this cmorizer? Thanks!


tomaslovato commented 3 years ago

@remi-kazeroni I'll have a look at this (and #2322) in the next few days, but I would definitely prefer to do it only after we complete and close PR #1812

valeriupredoi commented 3 years ago

it's funny how iris gracefully creates time as aux coord even if points are not monotonic, but then belches out at longitude for the same reason. This is something that should probably be raised with the ESACCI guys, maybe they'll thank us for it :grin: Can you post the time and lon points for us to have a look and see how bad it is maybe? :beer:
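The asymmetry described here can be illustrated without iris. The following is a minimal pure-Python sketch, not iris's actual implementation: `load_coord` and `set_points` are hypothetical names that only mimic the two code paths visible in the log, where loading falls back to an auxiliary coordinate while assigning points to an existing dimension coordinate raises. The point values are made up for illustration.

```python
def is_strictly_monotonic(points):
    """Return True if points are strictly increasing or strictly decreasing."""
    pairs = list(zip(points, points[1:]))
    increasing = all(a < b for a, b in pairs)
    decreasing = all(a > b for a, b in pairs)
    return increasing or decreasing


def load_coord(name, points):
    """Hypothetical stand-in for the load-time path: fall back, do not fail."""
    kind = "DimCoord" if is_strictly_monotonic(points) else "AuxCoord"
    return {"name": name, "kind": kind, "points": list(points)}


def set_points(coord, points):
    """Hypothetical stand-in for assigning points to an existing coordinate."""
    if coord["kind"] == "DimCoord" and not is_strictly_monotonic(points):
        raise ValueError(
            f"The '{coord['name']}' {coord['kind']} points array "
            "must be strictly monotonic."
        )
    coord["points"] = list(points)


# Illustrative values only: duplicated time points load as an auxiliary coordinate.
time = load_coord("time", [0, 0, 27, 27, 58, 58])

# A well-formed longitude axis loads as a dimension coordinate...
lon = load_coord("longitude", [-128.875, -128.625, -128.375])
try:
    # ...but overwriting it with duplicated points raises.
    set_points(lon, [-128.875, -128.875, -128.625])
    error = None
except ValueError as exc:
    error = str(exc)
```

In this sketch the fallback exists only at creation time, which is why the same defect produces a warning in one place and an exception in another.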

zklaus commented 3 years ago

Please note that we have picked this up in #2055.

valeriupredoi commented 3 years ago

good pointer @zklaus - that PR is dusty, I can give it a go next week if still needed :+1:

remi-kazeroni commented 3 years ago

> Please note that we have picked this up in #2055.

Thanks for the info. Hopefully the updated cmorizer from that PR won't suffer from this problem. Let's keep the issue open for now until #2055 is merged.

> it's funny how iris gracefully creates time as aux coord even if points are not monotonic, but then belches out at longitude for the same reason. This is something that should probably be raised with the ESACCI guys, maybe they'll thank us for it 😁 Can you post the time and lon points for us to have a look and see how bad it is maybe? 🍺

Both the time and longitude contain duplicated points:

mass_concentration_of_chlorophyll_a_in_sea_water / (milligram m-3) (-- : 530; latitude: 1280; longitude: 2561)

time (first few points):

AuxCoord([1997-09-04 00:00:00, 1997-09-04 00:00:00, 1997-10-01 00:00:00,
       1997-10-01 00:00:00, 1997-11-01 00:00:00, 1997-11-01 00:00:00,
       1997-12-01 00:00:00, 1997-12-01 00:00:00, 1998-01-01 00:00:00,
       1998-01-01 00:00:00, 1998-02-01 00:00:00, 1998-02-01 00:00:00,

longitude (a few points):

-128.875      -128.875      -128.625      -128.625      -128.375
 -128.375      -128.125      -128.125      -127.875      -127.87499237
 -127.625      -127.62499237 -127.375      -127.37499237 -127.125
 -127.12499237 -126.875      -126.87499237 -126.625      -126.62499237
 -126.375      -126.37499237 -126.125      -126.12499237 -125.875
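
One way to quantify such duplicates is to compare consecutive points against a tolerance. This is a rough sketch in plain Python using the first ten longitude values printed above; `find_near_duplicates` and the tolerance of 1e-4 degrees are assumptions chosen for illustration and are not part of the cmorizer.

```python
def find_near_duplicates(points, tol=1e-4):
    """Return indices i where points[i] is within tol of points[i - 1]."""
    return [
        i for i in range(1, len(points))
        if abs(points[i] - points[i - 1]) <= tol
    ]


# First ten longitude values from the listing above.
lon = [
    -128.875, -128.875, -128.625, -128.625, -128.375,
    -128.375, -128.125, -128.125, -127.875, -127.87499237,
]

dup_idx = find_near_duplicates(lon)

# Keep only the first point of every near-duplicate pair.
dropped = set(dup_idx)
deduplicated = [p for i, p in enumerate(lon) if i not in dropped]
```

In this sample every second point is flagged, including the pair that differs by only about 7.6e-6 degrees.
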

tomaslovato commented 3 years ago

@remi-kazeroni Are these inconsistencies in time and longitude present in the final cmorized file?

Actually, I made an update to this cmorizer to use ESACCI-OC version 5, but I was waiting to close all the other pending activities before opening a new issue/PR/branch ...

remi-kazeroni commented 3 years ago

> @remi-kazeroni Are these inconsistencies in time and longitude present in the final cmorized file?

Yes, that is right. This is with the cmorizer from the main branch applied to version fv3.1 of the data.

> Actually, I made an update to this cmorizer to use ESACCI-OC version 5, but I was waiting to close all the other pending activities before opening a new issue/PR/branch ...

Sure, I understand that (I'm taking a look at the other PRs now), but opening issues about planned or ongoing work may help prevent duplicated effort (see #2055 mentioned in this comment). You may want to take a look at that PR whenever time permits.

tomaslovato commented 3 years ago

@remi-kazeroni @zklaus I was not aware of #2055, and I updated this dataset to V5 only recently as I needed it. Actually, #2055 has not advanced since last May, so I'm not sure whether we want to update the ESACCI-OC cmorizer in there or in a new PR?

remi-kazeroni commented 2 years ago

Closing this since #2055 is merged. The ESACCI-OC cmorizer has been updated (fv5.0) and the problem mentioned here is not relevant anymore.