EC-Earth / ece2cmor3

Post-processing and cmorization of ec-earth output
Apache License 2.0

Mixed spectral and grid point messages in filtered files for u/v (131/132) #533

Closed uwefladrich closed 4 years ago

uwefladrich commented 4 years ago

Trying to cmorise model-level data for CORDEX, I have encountered the following problem:

I use a json cordex-cmip data request file for my ece2cmor run (just 28 variables in four tables) and I get most of the variables, except some stuff from 6hrLev (which is, of course, the interesting part). Particularly, I have problems with 6hrLev/ua and 6hrLev/va (ta and hus from 6hrLev work).

The first indication of a problem is an error message:

ERROR:ece2cmor3.ifs2cmor: Cannot read other grids then regular gaussian grids, \
current grid type read from file [...]/ua_6hrLev.nc was generic

and the same for va_6hrLev.nc.

Later, ece2cmor hangs after having processed 6hrLev/ta, presumably while trying to process ua/va.

I've had a look at the temp directory with the split and converted files. In particular, I have compared the temporary files related to the t, u and v variables (grib codes 130, 131, 132). All files are present; specifically, I find the files

130.128.109.6
130.128.210.6
131.128.109.6
131.128.210.6
132.128.109.6
132.128.210.6

Looking at the *.109.* variants and comparing 130 with 131/132, I find that while cdo sinfo reports spectral coordinates and 91 levels for 130.128.109.6, it says that 13[12].128.109.6 are reduced Gaussian with just one level.

However, all 13[012].128.109.6 files had about the same size, which would be strange if some are just one level on Gaussian grid.

It turns out that the files 13[12].128.109.6 contain a mixture of grib messages in spectral and grid point coordinates. In fact, the files seem to contain the (correct?) messages for the spectral values, but additionally data on the reduced Gaussian grid. This seems to throw ece2cmor/cdo off track.

Could it be that the splitting process does something wrong for the u/v variables, at least in the context of model levels?

It seems possible that this is related to #496.

uwefladrich commented 4 years ago

Some more info: The additional grid point grib messages in the 131/132 files seem to belong to level 100, indicating that there could indeed be a connection to #496, where things are about 100m winds.
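A minimal sketch of the check described above, in Python. It assumes the message headers have already been read into plain dictionaries (for example from `grib_ls` output); the record layout and the helper names `grid_type_counts` and `is_mixed` are hypothetical, and the sample records only mirror the situation reported here (spectral messages on levels 1-91 plus a grid point message on level 100).

```python
from collections import Counter

# Hypothetical header records; the values imitate the report above
# and are not read from a real GRIB file.
headers = (
    [{"paramId": 131, "typeOfLevel": "hybrid", "level": lev, "gridType": "sh"}
     for lev in range(1, 92)]
    + [{"paramId": 131, "typeOfLevel": "hybrid", "level": 100,
        "gridType": "reduced_gg"}]
)


def grid_type_counts(records):
    """Count how many messages use each gridType."""
    return Counter(rec["gridType"] for rec in records)


def is_mixed(records):
    """Return True if the records use more than one gridType."""
    return len(grid_type_counts(records)) > 1


counts = grid_type_counts(headers)
print(dict(counts))
print("mixed grid types:", is_mixed(headers))
```

A filtered file that is safe to hand to cdo would be expected to report a single grid type here.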

uwefladrich commented 4 years ago

It appears that the grib_filter phase of ece2cmor considers both the ICM*GG and ICM*SH files when trying to filter for u and v. In ICM*SH, it finds u and v on hybrid levels 1-91, which is correct. But it also finds u and v on hybrid level=100 in the ICM*GG file, and adds it to the filtered intermediate file.

oloapinivad commented 4 years ago

Hi Uwe,

yes it seems similar to #496.

(One of) the issues is that the grib format makes no difference between height levels and model levels (see https://dev.ec-earth.org/issues/659#note-19 and the links therein). This means that we can try to avoid this specific issue by selecting variables from the SH and GG files, but we always need to be cautious when using both height and model level types.

goord commented 4 years ago

Hi Uwe, the problem is indeed the incorrect level type indicator for height in the grib files produced by ec-earth. This is a problem that recurs every time I change something in the filtering...

I am actually surprised that u100 and v100 are in the GG file... do they originate from grib codes 256/246 somehow?

uwefladrich commented 4 years ago

Disclaimer: Despite a couple of years in this business, I'm still an amateur when it comes to grib files. Forgive me any naive or stupid ideas ;-)

I understand now (from the links above) that 100m wind (u/v) should not be on hybrid levels in the output. No idea, though, which grib codes they come from.

Still, the 100m winds can be distinguished from spectral u/v values on model levels by the gridType. So the tuple (typeOfLevel, gridType) should be distinct for u/v on model levels, pressure levels and 100m wind, shouldn't it? Even if one doesn't take into account which file the variables come from (GG vs. SH). Or is that not how the filtering works?

goord commented 4 years ago

Hi Uwe, that is how the filtering works: it actually translates all message headers into unique tuples (level, level type, grib code, grid type), but it gets complicated with the 100m winds: they can be either in the output as 2D fields in the GG files or fullpos-processed fields in the SH files.

Can you grib_ls in the original output how the 100m u/v are saved by EC-Earth, with which grib codes, level grid type?

uwefladrich commented 4 years ago

> grib_ls -w shortName=u,level=100 \
  -p paramId,shortName,dataDate,dataTime,typeOfLevel,level,packingType,gridType \
  ICMGGt613+195404 | head
ICMGGt613+195404
paramId      shortName    dataDate     dataTime     typeOfLevel  level        packingType  gridType     
131          u            19540401     600          hybrid       100          grid_simple  reduced_gg  
131          u            19540401     1200         hybrid       100          grid_simple  reduced_gg  
131          u            19540401     1800         hybrid       100          grid_simple  reduced_gg  
131          u            19540402     0            hybrid       100          grid_simple  reduced_gg  
131          u            19540402     600          hybrid       100          grid_simple  reduced_gg  
131          u            19540402     1200         hybrid       100          grid_simple  reduced_gg  
131          u            19540402     1800         hybrid       100          grid_simple  reduced_gg  
131          u            19540403     0            hybrid       100          grid_simple  reduced_gg  

Like this?

uwefladrich commented 4 years ago

... and for the SH file:

> grib_ls -w shortName=u,level=100 \
  -p paramId,shortName,dataDate,dataTime,typeOfLevel,level,packingType,gridType \
  ICMSHt613+195404 | head
ICMSHt613+195404
paramId      shortName    dataDate     dataTime     typeOfLevel  level        packingType  gridType     
131          u            19540401     600          isobaricInhPa  100          spectral_complex  sh          
131          u            19540401     1200         isobaricInhPa  100          spectral_complex  sh          
131          u            19540401     1800         isobaricInhPa  100          spectral_complex  sh          
131          u            19540402     0            isobaricInhPa  100          spectral_complex  sh          
131          u            19540402     600          isobaricInhPa  100          spectral_complex  sh          
131          u            19540402     1200         isobaricInhPa  100          spectral_complex  sh          
131          u            19540402     1800         isobaricInhPa  100          spectral_complex  sh          
131          u            19540403     0            isobaricInhPa  100          spectral_complex  sh

uwefladrich commented 4 years ago

The latter fields are of course not 100m winds but u at 100hPa level. I just thought the problem was somehow to distinguish different level=100 fields.
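The tuple-based keying that goord describes can be sketched in Python as follows. The three records take their values from the two `grib_ls` listings above, plus one spectral model-level message as described earlier in the thread; the dictionary layout and the function name `record_key` are assumptions for illustration and not the actual ece2cmor3 code.

```python
from collections import defaultdict

# Header values taken from the listings above; the dict layout is hypothetical.
headers = [
    {"paramId": 131, "typeOfLevel": "hybrid", "level": 1, "gridType": "sh"},
    {"paramId": 131, "typeOfLevel": "hybrid", "level": 100,
     "gridType": "reduced_gg"},
    {"paramId": 131, "typeOfLevel": "isobaricInhPa", "level": 100,
     "gridType": "sh"},
]


def record_key(header):
    """Build a (level, level type, grib code, grid type) tuple for one message."""
    return (header["level"], header["typeOfLevel"],
            header["paramId"], header["gridType"])


keys = [record_key(h) for h in headers]

# Group the levels by (typeOfLevel, gridType), the pair suggested above.
levels_by_kind = defaultdict(list)
for h in headers:
    levels_by_kind[(h["typeOfLevel"], h["gridType"])].append(h["level"])

for kind, levels in sorted(levels_by_kind.items()):
    print(kind, levels)
```

In this sketch the two level=100 messages end up under different keys because their level type and grid type differ, even though level and grib code are identical.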

goord commented 4 years ago

The issue is that the 'hybrid' typeoflevel (which should be height) in the GG file is picked up as a model level by the filtering. I can fix this by always ignoring u/v in the GG files unless height levels are requested... a bit shaky though...
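The rule proposed above could look roughly like this in Python. This is only a sketch of the idea, not the change that went into ece2cmor3; the function name `keep_message`, its arguments, and the `"GG"`/`"SH"` labels are hypothetical.

```python
WIND_CODES = {131, 132}  # grib codes for u and v, as used in this thread


def keep_message(header, source_file_kind, height_levels_requested):
    """Decide whether a message goes into the filtered file (illustrative rule).

    u/v messages from the grid point (GG) file are dropped unless the
    data request asks for height levels.
    """
    is_gg_wind = source_file_kind == "GG" and header["paramId"] in WIND_CODES
    if is_gg_wind and not height_levels_requested:
        return False
    return True


u100 = {"paramId": 131, "typeOfLevel": "hybrid", "level": 100}
print(keep_message(u100, "GG", height_levels_requested=False))
print(keep_message(u100, "GG", height_levels_requested=True))
print(keep_message(u100, "SH", height_levels_requested=False))
```

The rule depends on which file a message came from rather than on the message header itself, which is what makes it "a bit shaky".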

uwefladrich commented 4 years ago

Do we have any variable on any height/altitude level other than 100? I found no indication of that in the output I checked (but that test was limited). If not, the obvious dirty hack would be to consider only hybrid levels 1-91 as model levels and treat anything else as a height level. We would get away with that for our current EC-Earth version, until we use 137-level grids in the future.
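The hack is small enough to sketch in Python. The function name `classify_hybrid_level` and the constant are hypothetical; the value 91 is the number of model levels mentioned in this thread.

```python
N_MODEL_LEVELS = 91  # number of hybrid model levels in the current setup


def classify_hybrid_level(level, n_model_levels=N_MODEL_LEVELS):
    """Return 'model' for levels 1..n_model_levels and 'height' otherwise."""
    return "model" if 1 <= level <= n_model_levels else "height"


print(classify_hybrid_level(50))
print(classify_hybrid_level(100))
# With a 137-level grid, level 100 would be taken for a model level again:
print(classify_hybrid_level(100, n_model_levels=137))
```

The last call shows why this only works as long as the height value (100) lies outside the model level range.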

uwefladrich commented 4 years ago

My own little check says that 100 is the only hybrid level used in GG and SH files outside the 1..91 range and that all levels 1-91 have the same number of entries, indicating that none of them is double-used as height level. This is what I used:

grib_ls -w typeOfLevel=hybrid -p level <FILE> | sort -n | uniq -c

goord commented 4 years ago

Hi Uwe can you test again with the latest version of the master?

uwefladrich commented 4 years ago

Hi Gijs, thanks a lot for working on this! A test is running, but I still seem to get mixed grid types in the filtered files:

> grib_ls 131.128.109.6 -p [...] | head -n20
131.128.109.6
paramId      shortName    typeOfLevel  level        dataDate     dataTime     packingType  gridType     
131          u            hybrid       100          19540101     0            grid_simple  reduced_gg  
131          u            hybrid       100          19540101     600          grid_simple  reduced_gg  
131          u            hybrid       100          19540101     1200         grid_simple  reduced_gg  
131          u            hybrid       100          19540101     1800         grid_simple  reduced_gg  
131          u            hybrid       100          19540102     0            grid_simple  reduced_gg  
131          u            hybrid       100          19540102     600          grid_simple  reduced_gg  
131          u            hybrid       100          19540102     1200         grid_simple  reduced_gg  
131          u            hybrid       100          19540102     1800         grid_simple  reduced_gg  
131          u            hybrid       1            19540101     0            spectral_complex  sh          
131          u            hybrid       2            19540101     0            spectral_complex  sh          
131          u            hybrid       3            19540101     0            spectral_complex  sh          
131          u            hybrid       4            19540101     0            spectral_complex  sh          
131          u            hybrid       5            19540101     0            spectral_complex  sh          
131          u            hybrid       6            19540101     0            spectral_complex  sh          
131          u            hybrid       7            19540101     0            spectral_complex  sh          
131          u            hybrid       8            19540101     0            spectral_complex  sh          
131          u            hybrid       9            19540101     0            spectral_complex  sh          
131          u            hybrid       10           19540101     0            spectral_complex  sh

goord commented 4 years ago

Hi @uwefladrich, apparently the issue applies to all height-level spectral fields, and previous fixes have broken the correct filtering of model levels. I hope I have a definitive fix in the branch height_levs_filter, can you test that one on your output data?

uwefladrich commented 4 years ago

Okay, I tested the branch. It gives me:

Traceback (most recent call last):
  File "[...]/bin/ece2cmor", line 11, in <module>
    load_entry_point('ece2cmor3==1.2.1', 'console_scripts', 'ece2cmor')()
  File "[...]/lib/python2.7/[...]/ece2cmor3/ece2cmor.py", line 141, in main
    cdothreads=args.ncdo)
  File "[...]/lib/python2.7/[...]/ece2cmor3/ece2cmorlib.py", line 182, in perform_ifs_tasks
    tempdir=tempdir, autofilter=auto_filter)):
  File "[...]/lib/python2.7/[...]/ece2cmor3/ifs2cmor.py", line 136, in initialize
    grib_filter.initialize(ifs_gridpoint_files_, ifs_spectral_files_, temp_dir_)
  File "[...]/lib/python2.7/[...]g/ece2cmor3/grib_filter.py", line 50, in initialize
    varsfreq.update(inspect_day(grib_file.create_grib_file(gpf), grid=cmor_source.ifs_grid.point))
  File "[...]/lib/python2.7/[...]/ece2cmor3/grib_filter.py", line 92, in inspect_day
    key = get_record_key(gribfile, grid) + (grid,)
TypeError: get_record_key() takes exactly 1 argument (2 given)

goord commented 4 years ago

I'm sorry, that was a leftover that I forgot to clean up, I fixed that problem, you can test again

uwefladrich commented 4 years ago

No problem, I'm happy to test!

Still, we're not there yet:

Traceback (most recent call last):
  File "[...]/bin/ece2cmor", line 11, in <module>
    load_entry_point('ece2cmor3==1.2.1', 'console_scripts', 'ece2cmor')()
  File "[...]3/lib/python2.7/[...]/ece2cmor3/ece2cmor.py", line 141, in main
    cdothreads=args.ncdo)
  File "[...]/lib/python2.7/[...]/ece2cmor3/ece2cmorlib.py", line 182, in perform_ifs_tasks
    tempdir=tempdir, autofilter=auto_filter)):
  File "[...]/lib/python2.7/[...]/ece2cmor3/ifs2cmor.py", line 136, in initialize
    grib_filter.initialize(ifs_gridpoint_files_, ifs_spectral_files_, temp_dir_)
  File "[...]/lib/python2.7/[...]/ece2cmor3/grib_filter.py", line 50, in initialize
    varsfreq.update(inspect_day(grib_file.create_grib_file(gpf), grid=cmor_source.ifs_grid.point))
  File "[...]/lib/python2.7/[...]/ece2cmor3/grib_filter.py", line 92, in inspect_day
    key = get_record_key(gribfile) + (grid,)
  File "[...]/lib/python2.7/[...]/ece2cmor3/grib_filter.py", line 123, in get_record_key
    gridtype = gribfile.get_field(grib_file.grid_key)
  File "[...]/lib/python2.7/[...]/ece2cmor3/grib_file.py", line 78, in get_field
    return gribapi.grib_get_long(self.record, name)
  File "[...]/lib/python2.7/[...]/gribapi/gribapi.py", line 88, in modified
    return _func_(**kw)
  File "[...]/lib/python2.7/[...]/gribapi/gribapi.py", line 782, in grib_get_long
    GRIB_CHECK(err)
  File "[...]/lib/python2.7/[...]/gribapi/gribapi.py", line 88, in modified
    return _func_(**kw)
  File "[...]/lib/python2.7/[...]/gribapi/gribapi.py", line 136, in GRIB_CHECK
    errors.raise_grib_error(errid)
  File "[...]/lib/python2.7/[...]/gribapi/errors.py", line 235, in raise_grib_error
    raise ERROR_MAP[errid](errid)
gribapi.errors.KeyValueNotFoundError: Key/value not found

goord commented 4 years ago

Hmm apparently gridType is not a valid gribapi key for the EC-Earth output files (?). I used a workaround in the newest commit by looking at the file name, you may test once again Uwe.

uwefladrich commented 4 years ago

It took me a while to get the next test done, now I've managed to do it. Unfortunately, I get a new error:

Traceback (most recent call last):
  File "[...]/ece2cmor3/bin/ece2cmor", line 11, in <module>
    load_entry_point('ece2cmor3==1.2.1', 'console_scripts', 'ece2cmor')()
  File "[...]/ece2cmor3/lib/[...]/ece2cmor.py", line 141, in main
    cdothreads=args.ncdo)
  File "[...]/ece2cmor3/lib/[...]/ece2cmorlib.py", line 186, in perform_ifs_tasks
    ifs2cmor.execute(ifs_tasks, nthreads=taskthreads)
  File "[...]/ece2cmor3/lib/[...]/ifs2cmor.py", line 194, in execute
    pool.map(cmor_worker, tasks)
  File "[...]/ece2cmor3/lib/python2.7/multiprocessing/pool.py", line 253, in map
    return self.map_async(func, iterable, chunksize).get()
  File "[...]/ece2cmor3/lib/python2.7/multiprocessing/pool.py", line 572, in get
    raise self._value
multiprocessing.pool.MaybeEncodingError: \
    Error sending result: 'CMORError("Problem with 'cmor.load_table'. \
    Please check the logfile (if defined).",)'. \
    Reason: 'PicklingError("Can't pickle <class '_cmor.CMORError'>: \
        import of module _cmor failed",)'
('caught signal', <cdo.Cdo object at 0x2b16ddff29d0>, 15, <frame object at 0x2b16e406ea00>)
('caught signal', <cdo.Cdo object at 0x2b16ddff3250>, 15, <frame object at 0x2b16e406ea00>)
[...]

goord commented 4 years ago

This is very strange, it seems as if some cmor table cannot be read, you are using an unmodified version of the tables Uwe?

uwefladrich commented 4 years ago

Do you mean ece2cmor3/resources/cmip6-cmor-tables? That's unchanged in my git clone. I do have a small change in ece2cmor3/resources/ifspar.json if that makes any difference (adding 172.128 --> sftlf).

goord commented 4 years ago

Hi Uwe can you run once single-threaded (with --npp 1)? Hopefully this gives a better stacktrace...
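The MaybeEncodingError above hides the original failure because the exception raised in the worker cannot be pickled on its way back to the parent process. Besides running single-threaded, one generic way around this is to catch exceptions inside the worker and return the traceback as text, since strings always pickle. The sketch below is written for Python 3 and is not part of ece2cmor3; `run_safely`, `failing_worker` and `UnpicklableError` are hypothetical names.

```python
import traceback


class UnpicklableError(Exception):
    """Stand-in for an exception that cannot cross a process boundary."""


def failing_worker(task):
    """Hypothetical worker that always fails, to exercise the wrapper."""
    raise UnpicklableError("problem while processing %r" % (task,))


def run_safely(worker, task):
    """Call worker(task); return (True, result) or (False, traceback text)."""
    try:
        return True, worker(task)
    except Exception:
        # The formatted traceback is a plain string, so it can be sent
        # back from a pool worker without pickling the exception object.
        return False, traceback.format_exc()


ok, info = run_safely(failing_worker, "6hrLev/ua")
print(ok)
print(info)
```

With a process pool, the wrapper would be mapped over the tasks in place of the worker itself, for example via `functools.partial(run_safely, worker)`, provided `worker` is a module-level function.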

uwefladrich commented 4 years ago

I didn't get a better stacktrace (in fact, I didn't get any at all), but I found that the cmor library is complaining:

Error: Table  is defined for cmor_version 3.500000, this library version is: 3.4.0, 3.400000

I just didn't see this in my previous tests. There are more errors following this, but I guess I need to fix this first? How? Can/should I somehow update ece2cmor3/resources/cmip6-cmor-tables?

uwefladrich commented 4 years ago

... or rather: Can/should I update my cmor lib? EDIT: I realise (i) that it says cmor-3.5.0 in the requirement list of ece2cmor3 and (ii) that I mixed cmor installations in my conda envs. Please hang on, I'm updating my stuff... (sorry for the noise)
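A mismatch like the one reported above can be spotted before starting a run by comparing the version a table asks for with the version of the installed library. The following Python sketch assumes that the table file is JSON with a `Header` section containing a `cmor_version` entry, and that a library at least as new as that version is acceptable; both points are assumptions here, and the function names are hypothetical. The installed library version is passed in as a string rather than queried from cmor.

```python
import json


def parse_version(text):
    """Turn '3.5' or '3.4.0' into a tuple of ints such as (3, 5) or (3, 4, 0)."""
    return tuple(int(part) for part in text.strip().split("."))


def table_is_supported(table_json, library_version):
    """Return True if library_version is at least the table's cmor_version."""
    header = json.loads(table_json)["Header"]
    required = parse_version(header["cmor_version"])
    return parse_version(library_version) >= required


# Minimal stand-in for a table file; a real table contains much more.
table = '{"Header": {"table_id": "Table 6hrLev", "cmor_version": "3.5"}}'

print(table_is_supported(table, "3.4.0"))
print(table_is_supported(table, "3.5.0"))
```

The version strings 3.5 and 3.4.0 are the ones from the error message above.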

treerink commented 4 years ago

Hi Uwe,

You are working in the height_levs_filter branch I guess, which is fully up to date with the master (I checked). But I think you need to run the following from the ece2cmor root dir:

git submodule update --init --recursive

to update your tables.

It could also be that you need to rebuild your ece2cmor3 environment if you have not done so since the update to cmor 3.5 last month.

uwefladrich commented 4 years ago

Hi Thomas, Yes, I am working in the height_levs_filter. Right now my problem was that the cmor package in my ece2cmor3 environment was not up-to-date. I have a separate cmor environment, which was updated, which is why I didn't realise I was using an old cmor lib. My tables are up-to-date.

uwefladrich commented 4 years ago

I have updated cmor and re-run the test, but now it seems I'm back to square one. Running in parallel (I tested 12 and 32 threads) I get a multiprocessing error:

Traceback (most recent call last):
  File "[...]/ece2cmor3/bin/ece2cmor", line 11, in <module>
    load_entry_point('ece2cmor3==1.2.1', 'console_scripts', 'ece2cmor')()
  File "[...]/ece2cmor3/lib/[...]/ece2cmor.py", line 141, in main
    cdothreads=args.ncdo)
  File "[...]/ece2cmor3/lib/[...]/ece2cmorlib.py", line 186, in perform_ifs_tasks
    ifs2cmor.execute(ifs_tasks, nthreads=taskthreads)
  File "[...]/ece2cmor3/lib/[...]/ifs2cmor.py", line 194, in execute
    pool.map(cmor_worker, tasks)
  File "[...]/ece2cmor3/lib/[...]/multiprocessing/pool.py", line 253, in map
    return self.map_async(func, iterable, chunksize).get()
  File "[...]/ece2cmor3/lib/[...]/multiprocessing/pool.py", line 572, in get
    raise self._value
TypeError: long() argument must be a string or a number, not 'NoneType'

Running with only one thread, i.e. --npp 1, ece2cmor completes, but I am still left with mixed grid types in 131.128.109.6 and 132.128.109.6.

goord commented 4 years ago

Uwe do you have an ftp server available? I think it's best if I have your data for debugging. We could also use the ECMWF for this...

uwefladrich commented 4 years ago

Yes, I can upload files to our publisher. What/which files would you need exactly?

goord commented 4 years ago

Just a month of sh+gg grib files to run the filtering+cmorization myself

uwefladrich commented 4 years ago

Hi, Sorry, it has taken me a while. Here I have uploaded one month of model output including model levels:

http://exporter.nsc.liu.se/12d934d851a14f1a9ccecb78c7c6477e
rsync://exporter.nsc.liu.se/12d934d851a14f1a9ccecb78c7c6477e

The links will be valid for 30 days.

goord commented 4 years ago

Hi Uwe, I fixed the problem in the height_levs_filter branch and was able to cmorize 3D fields using your grib files, can you test your full cmorization with this revision?

treerink commented 4 years ago

Note that #496 is closed, because different grib codes are used for ua100m & va100m. This might have implications for this issue when run with the current latest r7144-control-output-files branch, right?

uwefladrich commented 4 years ago

@goord : Sorry again for responding a bit slow, too many different things ongoing right now...

Anyway, with your last fixes I was able to cmorise the test data without error! A first check confirms that level data is actually ending up in the netcdf files.

Great job, thanks a lot!

I guess we'll double check that everything is fine with the data before concluding the issue.

@treerink : I have to admit I do not understand the implications right away. Do you mean that when model data is produced with recent output control files the model-level data cmorisation could again be a problem?

treerink commented 4 years ago

Hi Uwe, I think the first problem in cmorising the model levels was related to https://github.com/EC-Earth/ece2cmor3/issues/533#issuecomment-543048139. On that part things have changed and should actually work more robustly now. But more has been solved in this issue that is independent of that. Anyway, yes, it would be good to check out the latest r7144-control-output-files branch, run one year of EC-Earth3 again, and then test this branch once more on that output, just to be sure everything works fine.

klauswyser commented 4 years ago

Sorry @treerink, we already have many runs with saved model levels that are waiting to be processed, and we don't plan to make any further experiments with model levels. From that perspective it doesn't help us to check out the latest output control files and redo an experiment to test that everything works fine; for us, ece2cmor has to work with the existing data.

treerink commented 4 years ago

Still, it would be helpful if you could just do this 1 year test, because then we know whether the solution is robust for both situations. I can test it as well, but I don't know whether it currently should work out of the box?

klauswyser commented 4 years ago

I see. OK, I'll set up a short new experiment with the runtime env from r7144-control-output-files and then Uwe can test the cmorisation of the model level output.

treerink commented 4 years ago

Ok thanks Klaus.

I realise that to do the proper cmorisation the master actually needs to be merged into the height_levs_filter branch first because of the ua-va-100m #540 merge, but @goord prefers to do this merge himself because of the recently merged fx_task branch in #538, which could complicate the merge.

As you at SMHI need to have this branch in its current stage, I log its git version here: 0accb180ac4ea6b32b0613d444c1c3b78f14d412, but for you it might be easier if we do the merge in a duplicate branch: height_levs_filter_update_with_master.

However the r7144-control-output-files branch is ready to run EC-Earth3 already for this.

goord commented 4 years ago

I have completed the merge of the height_levs_filter branch, so closing this one

uwefladrich commented 4 years ago

I am still testing with the branch, previous runs have crashed but due to a problem on our side. I'll come back if there are more problems.

treerink commented 4 years ago

I deleted the branch, because the master has been merged already in the height_levs_filter branch including the #540 merge.

So for the old pre #540 merge case best is to checkout git version: 0accb180ac4ea6b32b0613d444c1c3b78f14d412

uwefladrich commented 4 years ago

I am indeed testing at 0accb18 right now. But I am a bit unsure about the implications to switch to the master head at this stage. Is it supposed to work with data produced with old output control files?

treerink commented 4 years ago

Hi Uwe, you should use for your data (which is produced by control output files which are based on a genecec pre-#540-merge) version 0accb18 to cmorise that model level output.

The master can be used for cmorising model levels which are based on the control output files for the next release (now available in the r7144-control-output branch).

uwefladrich commented 4 years ago

So that means that ece2cmor is not backward compatible? This has quite some implications, hasn't it?

treerink commented 4 years ago

Indeed ece2cmor3 is not backward compatible in all aspects. For that reason we did not merge #540 in v1.2.0, because people had made their runs and there was no previous matching EC-Earth3 - ece2cmor3 release combination. The #540 merge has always been planned for the next compatible release, in fact to overcome the trouble with these grib codes and model level output.

uwefladrich commented 4 years ago

Okay, so the incompatibility comes with version number changes. That's reasonable, I guess.