Closed fserva closed 3 years ago
Could you please post the complete stack trace? Or attach the file run/main_log_debug.txt?
Here it is, thanks. main_log_debug.txt
I tried to reproduce this with hourly zg and it looks like the crash is due to the axis
attribute being set to the empty string, because loading the variable definition from the CMOR table, it picks up zg
with an alevel
generic coordinate instead of the usual vertical pressure coordinate. This happens because the variable is not present in the E1hr table, so then it searches the other mip tables and ends up picking the wrong one.
@jvegasbsc Could you look into this? We would need some way to select the correct variable from the CMOR table, i.e. one that has e.g. "dimensions": "longitude latitude plev19 time"
instead of "dimensions": "longitude latitude alevel time"
.
Here is an example recipe:
datasets:
- {dataset: ERA5, project: native6, tier: 3, type: reanaly, version: 1}
diagnostics:
timeseries:
description: zg
variables:
zg:
mip: E1hr
frequency: 1hrPt
start_year: 1990
end_year: 1990
scripts: null
This is described in #1029 and a possible solution is proposed in #1032.
That looks like a different but related problem. In this case, zg
occurs in the CMIP6 CMOR tables with both a generic and a non-generic vertical coordinate, depending on the table. To solve the problem here, it would be sufficient to be able to just specify that you want the version with the non-generic coordinate.
I realized that this is actually already possible, but it's a bit counter intuitive. The following recipe seems to work for me with ERA5 hourly geopotential height on pressure levels. Could you give it a try @fserva? Make sure to replace the frequency with the frequency that you actually need.
datasets:
- {dataset: ERA5, project: native6, tier: 3, type: reanaly, version: 1}
diagnostics:
timeseries:
description: test zg
variables:
zg:
mip: Amon
frequency: 1hrPt
start_year: 1990
end_year: 1990
scripts: null
Thanks @bouweandela, I've tried your recipe and I needed to do a change adding a documentation
section, otherwise it stops immediately.
When using the data I have, I got an error on the time axis ValueError: invalid day number provided in cftime.DatetimeGregorian(2000, 6, 31, 0, 0, 0, 0)
, which is not present when using cdo (but it makes sense since 31 June does not exist). This seems caused by _fix_monthly_time_coord
.
Suggestions or other issues I should consider? Thanks
I've tried various mip
and frequency
(in my case probably 6hrPt
is the correct one?), but that did not help
It looks like that happens because we were only aware of hourly and monthly ERA5 data when we wrote the fix file. It sees that the timestep is more than an hour, so it wrongly assumes that it is monthly. How can I download the six hourly data?
in my case probably 6hrPt is the correct one?
Yes, I think so.
Yes, good point. In fact the highest frequency you can get is hourly, but through the CDS API and the web form one can select specific hours. In this case I took 6-hourly to save some space, but of course I can start from the hourly.
Maybe not a problem, but I see that above you mentioned plev19
, while ERA5 provides 37 levels as far as I recall (and also in this case, I sub-selected some levels of interest).
That the pressure levels are different than in the CMOR tables is not a problem, this is almost always the case for non-CMIP data, so long as the coordinate is sufficiently similar it should work.
If you are working on some institutional machine (e.g. Jasmin), you might be able to use ERA5 data that is already available instead of downloading it yourself, see also ESMValGroup/ESMValTool#2183.
Thanks Bouwe, sorry for the long time to respond.
Now I have access also to the space esmeval
on JASMIN; would you suggest to use the default install (https://docs.esmvaltool.org/projects/esmvalcore/en/latest/quickstart/install.html#pre-installed-versions-on-hpc-clusters) or would it be better to make a new local one (since I understand there are changes in the core packages https://github.com/ESMValGroup/ESMValTool/issues/2198)?
Hello @bouweandela from what I can see in /gws/nopw/j04/esmeval/obsdata-v2/Tier3/ERA5/
there is only precipitation.
I do not have access to Mistral, so I guess I will simply download the files in their native format for testing.
Apparently there is not enough space on the group workspace on Jasmin, but there is ERA5 data available in another directory, see here for some recent discussion on the topic: https://github.com/ESMValGroup/ESMValCore/issues/1246.
Ok, I see.
In the Jasmin disks there are surface variables and some model level variables under /badc/ecmwf-era5/data/oper/an_ml/
, but the geopotential there is the surface orography short z(time, latitude, longitude)
. If I well recall it is possible to derive geopotential on pressure levels but that is maybe be too complex.
I have some hourly data for geopotential on pressure levels and I am planning to use those.
It may also be possible to augment the data available on jasmin by asking the right people there. For that, it will be very useful to consult the ERA5 documentation to specify exactly which data one is interested in.
I made some test with https://github.com/ESMValGroup/ESMValCore/issues/1136#issuecomment-850216172
The run finished correctly, even if no file was produced (I guess it is normal). There were some warnings
WARNING There were warnings in variable zg:
plev: does not contain 100000.0 Pa
plev: does not contain 70000.0 Pa
plev: does not contain 25000.0 Pa
related to the fact that units of data downloaded from the CDS are hPa. Maybe they can be changed manually in advance, or there is a better procedure?
In the original files I have level = 10, 30, 50, 100, 500, 850 ;
Files are produced, but deleted after a successful run. You can set remove_preproc_dir: false
if you want to keep the preprocessed files.
The warning is not about the units, but it tells you that some pressure levels that are mandatory for the zg variable's plev coordinate in the CMOR table are not present, this is only to be expected because this is not CMIP data.
Thanks Bouwe, your tip worked and the file is produced; levels are converted to Pa as for CMOR standard. I was trying to obtain the daily means, so I took some definition from the cmorizer:
datasets:
- {dataset: ERA5, project: native6, tier: 3, type: reanaly, version: 1}
preprocessors:
daily_mean:
daily_statistics:
operator: mean
diagnostics:
timeseries:
description: test zg
variables:
zg:
preprocessor: daily_mean
mip: day
frequency: 1hrPt
start_year: 2006
end_year: 2006
scripts: null
Update: This worked (note I did not include add_one_day
!
Note that mip: day
, using mip: E1hr
(table for hourly extra variables) failed.
Note that mip: day, using mip: E1hr (table for hourly extra variables) failed.
It probably doesn't matter too much what mip you choose, but to signify your intent I think mip: E1hr
would make most sense, because the variable specifies the input data. What was the error message?
Yes, I also think so. It's iris complaining about the coordinates (not new likely)...
2021-08-06 14:24:56,052 UTC [499434] ERROR Failed to run fix_metadata([<iris 'Cube' of geopotential / (m**2 s**-2) (time: 8760; air_pressure: 6; latitude: 181; longitude: 360)>], {'preprocessor': 'daily_mean', 'mip': 'E1hr', 'frequency': '1hrPt', 'start_year': 2006, 'end_year': 2006, 'variable_group': 'zg', 'short_name': 'zg', 'diagnostic': 'timeseries', 'dataset': 'ERA5', 'project': 'native6', 'tier': 3, 'type': 'reanaly', 'version': 1, 'recipe_dataset_index': 0, 'alias': 'ERA5', 'original_short_name': 'zg', 'standard_name': 'geopotential_height', 'long_name': 'Geopotential Height', 'units': 'm', 'modeling_realm': ['atmos'], 'filename': '/work/users/serva/esmvaltool_output/recipe_cmor_era5_20210806_142455/preproc/timeseries/zg/native6_ERA5_reanaly_1_E1hr_zg_2006-2006.nc', 'check_level': <CheckLevels.DEFAULT: 3>})
2021-08-06 14:24:56,980 UTC [499434] INFO Maximum memory used (estimate): 0.0 GB
2021-08-06 14:24:56,981 UTC [499434] INFO Sampled every second. It may be inaccurate if short but high spikes in memory consumption occur.
2021-08-06 14:24:56,981 UTC [499434] ERROR Program terminated abnormally, see stack trace below for more information:
Traceback (most recent call last):
File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/_main.py", line 433, in run
fire.Fire(ESMValTool())
File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/fire/core.py", line 466, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/_main.py", line 410, in run
process_recipe(recipe_file=recipe, config_user=cfg)
File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/_main.py", line 104, in process_recipe
recipe.run()
File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/_recipe.py", line 1434, in run
self.tasks.run(max_parallel_tasks=self._cfg['max_parallel_tasks'])
File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/_task.py", line 674, in run
self._run_sequential()
File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/_task.py", line 685, in _run_sequential
task.run()
File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/_task.py", line 252, in run
self.output_files = self._run(input_files)
File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/preprocessor/__init__.py", line 482, in _run
product.apply(step, self.debug)
File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/preprocessor/__init__.py", line 351, in apply
self.cubes = preprocess(self.cubes, step, **self.settings[step])
File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/preprocessor/__init__.py", line 295, in preprocess
result.append(_run_preproc_function(function, items, settings))
File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/preprocessor/__init__.py", line 281, in _run_preproc_function
return function(items, **kwargs)
File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/cmor/fix.py", line 114, in fix_metadata
cube_list = fix.fix_metadata(cube_list)
File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/cmor/_fixes/native6/era5.py", line 377, in fix_metadata
cube = self._fix_coordinates(cube)
File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/esmvalcore/cmor/_fixes/native6/era5.py", line 329, in _fix_coordinates
coord = cube.coord(axis=axis)
File "/work/users/serva/miniconda3/envs/esmval202107/lib/python3.9/site-packages/iris/cube.py", line 1829, in coord
raise iris.exceptions.CoordinateNotFoundError(msg)
iris.exceptions.CoordinateNotFoundError: 'Expected to find exactly 1 coordinate, but found none.'
Ok, yes, I forgot about that. This is the problem from the top post and it can indeed be solved by picking a table that does contain zg
variable with the right coordinates https://github.com/ESMValGroup/ESMValCore/issues/1136#issuecomment-850216172.
So the mip: day
is a good choice in this context. For my needs the problem is then solved, I'll let you close the issue in case. Thanks all for the tips!
Hello all, I would need to process (six) hourly ERA5 data in order to prepare input for the zmnam recipe, which requires daily means geopotential height.
I've tried to process it by using https://github.com/ESMValGroup/ESMValTool/blob/master/esmvaltool/recipes/cmorizers/recipe_era5.yml, however an error is raised when building the cube:
ERROR [15512] Failed to run fix_metadata([<iris 'Cube' of geopotential / (m**2 s**-2) (time: 124; pressure_level: 19; latitude: 721; longitude: 1440)>, <iris 'Cube' of geopotential / (m**2 s**-2) (time: 124; pressure_level: 19; latitude: 721; longitude: 1440)>, [...], <iris 'Cube' of geopotential / (m**2 s**-2) (time: 124; pressure_level: 19; latitude: 721; longitude: 1440)>], {'project': 'native6', 'dataset': 'ERA5', 'short_name': 'zg', 'mip': '6hrPlev', 'frequency': '6hr', 'check_level': <CheckLevels.DEFAULT: 3>})
, likely due to iris This seems to be an iris problemiris.exceptions.CoordinateNotFoundError: 'Expected to find exactly 1 coordinate, but found none.'
I believe this may be due to the missing
standard_names
, see the header of the raw files:Maybe assigning default attributes would be enough? See e.g. https://groups.google.com/g/scitools-iris/c/9IwC2Rr7xm
Please note the discussion started in: https://github.com/ESMValGroup/ESMValTool/issues/1909
If anyone succeeded cmorizing ERA5 pressure level data, any hint would be useful.
FYI @remi-kazeroni @zklaus @bouweandela