PCMDI / pcmdi_metrics

Open-source Python package for Systematic Evaluation of Climate and Earth System Models
http://pcmdi.github.io/pcmdi_metrics/
BSD 3-Clause "New" or "Revised" License
98 stars 37 forks source link

Vertical level extraction issue #975

Closed lee1043 closed 1 year ago

lee1043 commented 1 year ago

Vertical level extraction does not work properly when given vertical coordinate is hybrid coord (often named as lev), instead of pressure lev (plev). The mean climate annual cycle calculation code extract a layer that is incorrect, which influence statistics, making the erroneous model as outlier. Rather, such models should be excluded (short-term) or corrected (in long-term) for the summary statistics.

mean_clim_parallel_coordinate_plot_4seasons_cmip56

In the above plot, for ta-850: HadGEM3-GC31-LL, HadGEM3-GC31-MM, and UKESM1-0-LL are marked at the top of the axis, showing rms_xyt > 90, which amplifies distribution of statistics from CMIP6 models.

Below is a part of ncdump result for UKESM1-0-LL nc file.

    double lev(lev) ;
        lev:bounds = "lev_bnds" ;
        lev:units = "m" ;
        lev:axis = "Z" ;
        lev:positive = "up" ;
        lev:long_name = "hybrid height coordinate" ;
        lev:standard_name = "atmosphere_hybrid_height_coordinate" ;
        lev:formula = "z = a + b*orog" ;
        lev:formula_terms = "a: lev b: b orog: orog" ;

    float ta(time, lev, lat, lon) ;
        ta:standard_name = "air_temperature" ;
        ta:long_name = "Air Temperature" ;
        ta:comment = "Air Temperature" ;
        ta:units = "K" ;
        ta:original_name = "mo: (stash: m01s30i111, lbproc: 128)" ;
        ta:cell_methods = "area: time: mean" ;
        ta:cell_measures = "area: areacella" ;
        ta:missing_value = 1.e+20f ;
        ta:_FillValue = 1.e+20f ;

While other models:

    double plev(plev) ;
        plev:units = "Pa" ;
        plev:axis = "Z" ;
        plev:positive = "down" ;
        plev:long_name = "pressure" ;
        plev:standard_name = "air_pressure" ;

    float ta(time, plev, lat, lon) ;
        ta:standard_name = "air_temperature" ;
        ta:long_name = "Air Temperature" ;
        ta:comment = "Air Temperature" ;
        ta:units = "K" ;
        ta:original_name = "mo: (stash: m01s30i294, blev: [1000.0, 925.0, 850.0, 700.0, 600.0, 500.0, 400.0, 300.0, 250.0, 200.0, 15
0.0, 100.0, 70.0, 50.0, 30.0, 20.0, 10.0, 5.0, 1.0], lbproc: 128) / (stash: m01s30i304, blev: [1000.0, 925.0, 850.0, 700.0, 600.0, 500.0, 40
0.0, 300.0, 250.0, 200.0, 150.0, 100.0, 70.0, 50.0, 30.0, 20.0, 10.0, 5.0, 1.0], lbproc: 128)" ;
        ta:cell_methods = "time: mean" ;
        ta:cell_measures = "area: areacella" ;
        ta:missing_value = 1.e+20f ;
        ta:_FillValue = 1.e+20f ;
        ta:history = "2022-05-12T23:53:56Z altered by CMOR: Inverted axis: plev." ;

Action item

Short term fix: Exclude those models

Long term goal: apply vertical interpolation if needed

lee1043 commented 1 year ago

Separate but related issue:

When plev is like below,

 plev = 100000.00000001, 92500.00000001, 85000.00000001, 70000.00000001, 
    60000.00000001, 50000.00000001, 40000.00000001, 30000.00000001, 
    25000.00000001, 20000.00000001, 15000.00000001, 10000.00000001, 
    7000.00000001, 5000.00000001, 3000.00000001, 2000.00000001, 
    1000.00000001, 500.00000001, 100.00000001 ;
}

ds.sel(plev=85000) from load_and_regrid.py

returns the following error:

Traceback (most recent call last):
  File "/home/lee1043/.conda/envs/pcmdi_metrics_dev_20230822/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3653, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 147, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 176, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1698, in pandas._libs.hashtable.Float64HashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1722, in pandas._libs.hashtable.Float64HashTable.get_item
KeyError: 85000.0

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/lee1043/.conda/envs/pcmdi_metrics_dev_20230822/lib/python3.10/site-packages/xarray/core/indexes.py", line 769, in sel
    indexer = self.index.get_loc(label_value)
  File "/home/lee1043/.conda/envs/pcmdi_metrics_dev_20230822/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3655, in get_loc
    raise KeyError(key) from err
KeyError: 85000.0

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/lee1043/.conda/envs/pcmdi_metrics_dev_20230822/bin/mean_climate_driver.py", line 237, in <module>
    ds_test = load_and_regrid(data_path=test_data_full_path, varname=varname, varname_in_file=varname_testdata, level=level, t_grid=t_grid, decode_times=True, regrid_tool=regrid_tool, debug=debug)
  File "/home/lee1043/.conda/envs/pcmdi_metrics_dev_20230822/lib/python3.10/site-packages/pcmdi_metrics/mean_climate/lib/load_and_regrid.py", line 61, in load_and_regrid
    ds = ds.sel(plev=level)
  File "/home/lee1043/.conda/envs/pcmdi_metrics_dev_20230822/lib/python3.10/site-packages/xarray/core/dataset.py", line 3020, in sel
    query_results = map_index_queries(
  File "/home/lee1043/.conda/envs/pcmdi_metrics_dev_20230822/lib/python3.10/site-packages/xarray/core/indexing.py", line 190, in map_index_queries
    results.append(index.sel(labels, **options))
  File "/home/lee1043/.conda/envs/pcmdi_metrics_dev_20230822/lib/python3.10/site-packages/xarray/core/indexes.py", line 771, in sel
    raise KeyError(
KeyError: "not all values found in index 'plev'. Try setting the `method` keyword argument (example: method='nearest')."

Cause of error: given key (85000) is not exactly matching with key in the file (85000.00000001)

Workaround: Add method='nearest' to ds.sel(lev=85000). However this can cause error if plev does not have layer for 850 hPa and nearest layer is far below or above (e.g., 700 or 500 hPa).