E3SM-Project / e3sm_diags

E3SM Diagnostics package
https://e3sm-project.github.io/e3sm_diags
BSD 3-Clause "New" or "Revised" License
37 stars 30 forks source link

[Bug]: Regridding time-series variable's climatology from hybrid to pressure levels breaks due to misaligned axes #721

Open tomvothecoder opened 10 months ago

tomvothecoder commented 10 months ago

In PR #677, I am doing a complete run of ex1.py to make sure all of the metrics and plots are successfully generated. The last remaining issue is an xgcm error that is raised with the variables "U", "V", "OMEGA", and "Z3".

Traceback (most recent call last):
   File "<string>", line 1, in <module>
   File "/global/homes/v/vo13/mambaforge/envs/e3sm_diags_dev_658/lib/python3.10/site-packages/xcdat/regridder/accessor.py", line 389, in vertical
     output_ds = regridder.vertical(data_var, self._ds)
   File "/global/homes/v/vo13/mambaforge/envs/e3sm_diags_dev_658/lib/python3.10/site-packages/xcdat/regridder/xgcm.py", line 204, in vertical
     output_da = grid.transform(
   File "/global/homes/v/vo13/mambaforge/envs/e3sm_diags_dev_658/lib/python3.10/site-packages/xgcm/grid.py", line 2692, in transform
     return ax.transform(da, target, **kwargs)
   File "/global/homes/v/vo13/mambaforge/envs/e3sm_diags_dev_658/lib/python3.10/site-packages/xgcm/grid.py", line 1077, in transform
     target, target_dim, target_data = _parse_target(
   File "/global/homes/v/vo13/mambaforge/envs/e3sm_diags_dev_658/lib/python3.10/site-packages/xgcm/grid.py", line 1072, in _parse_target
     _check_other_dims(target_data)
   File "/global/homes/v/vo13/mambaforge/envs/e3sm_diags_dev_658/lib/python3.10/site-packages/xgcm/grid.py", line 1038, in _check_other_dims
     raise ValueError(
 ValueError: Found additional dimensions [{'time'}]in `target_data` not found in `da`. This could mean that the target array is not on the same position along other axes. If the additional dimensions are associated witha staggered axis, use grid.interp() to move values to other grid position. If additional dimensions are not related to the grid (e.g. climate model ensemble members or similar), use xr.broadcast() before using transform.

Overview

I switched back to the main branch to see how the original code was handling this situation and noticed it breaks there too.

The lat_lon_driver.py tries to regrid the variables "U", "V", "OMEGA", and "Z3" using pressure levels reconstructed from hybrid. The issue is that these variables have already been regridded to 19 pressure levels (source) and no longer have a time axis. Since the variable and pressure level variable (has time axis) do not have matching dimensions, regridding does not work on the variable.

This error is raised: cdms2.error.CDMSError: axis length 129 does not match corresponding dimension 36.

Related code: https://github.com/E3SM-Project/e3sm_diags/blob/6aaa7c27dcf65393dd07e258ef2e7da21a660e87/e3sm_diags/driver/utils/general.py#L132-L146

Possible Solutions

From this comment:

We need the axes for the climatology variable and the hybrid variables to align. This means we should calculate the climatologies for the hybrid variables beforehand.

All variables fed into convert_to_pressure_levels(), hybrid_to_plevs() and pressure_to_plevs() should not have a time axis.

Steps to Reproduce (NERSC machine)

  1. Checkout the main branch
  2. Create and activate the dev environment
     mamba env create -f conda/dev.yml
     mamba activate e3sm_diags_dev
  3. Update lat_lon_model_vs_model.cfg to only run one of related variables (e.g., "U")
    [#]
    sets = ["lat_lon"]
    case_id = "model_vs_model"
    variables = ["U"]
    seasons = ["ANN", "DJF", "MAM", "JJA", "SON"]
    plevs = [850.0]
    test_colormap = "PiYG_r"
    reference_colormap = "PiYG_r"
    contour_levels = [-20, -15, -10, -8, -5, -3, -1, 1, 3, 5, 8, 10, 15, 20]
    diff_levels = [-15, -10, -8, -6, -4, -2, -1, 1, 2, 4, 6, 8, 10, 15]
    regrid_method = "bilinear"
  4. Install e3sm_diags into the dev environment
     python -m pip install .
  5. Update result paths and run ex1.py (attached below for convenience)

    import os
    
    from e3sm_diags.parameter.core_parameter import CoreParameter
    from e3sm_diags.run import runner
    
    param = CoreParameter()
    # Location of the data.
    param.test_data_path = "/global/cfs/cdirs/e3sm/e3sm_diags/test_model_data_for_acme_diags/time-series/E3SM_v1"
    param.reference_data_path = "/global/cfs/cdirs/e3sm/e3sm_diags/test_model_data_for_acme_diags/time-series/E3SM_v1"
    
    # Set this parameter to True.
    # By default, e3sm_diags expects the test data to be climo data.
    param.test_timeseries_input = True
    # Years to slice the test data, base this off the years in the filenames.
    param.test_start_yr = "2011"
    param.test_end_yr = "2013"
    
    # Set this parameter to True.
    # By default, e3sm_diags expects the ref data to be climo data.
    param.ref_timeseries_input = True
    # Years to slice the ref data, base this off the years in the filenames.
    param.ref_start_yr = "1850"
    param.ref_end_yr = "1852"
    
    # When running with time-series data, you don't need to specify the name of the data.
    # But you should, otherwise nothing is displayed when the test/ref name is needed.
    param.short_test_name = "historical_H1"
    param.short_ref_name = "historical_H1"
    
    # This parameter modifies the software to accommodate model vs model runs.
    # The default setting for run_type is 'model_vs_obs'.
    param.run_type = "model_vs_model"
    # Name of the folder where the results are stored.
    # Change `prefix` to use your directory.
    prefix = "/global/cfs/cdirs/e3sm/www/vo13/examples"
    param.results_dir = os.path.join(prefix, "ex1_modTS_vs_modTS_3years")
    
    # Below are more optional arguments.
    
    # What plotsets to run the diags on.rm -rf
    # If not defined, then all available sets are used.
    param.sets = ["lat_lon"]
    # What seasons to run the diags on.
    # If not defined, diags are run on ['ANN', 'DJF', 'MAM', 'JJA', 'SON'].
    param.seasons = ["ANN"]
    # Title of the difference plots.
    param.diff_title = "Model (2011-2013) - Model (1850-1852)"
    # For running with multiprocessing.
    param.multiprocessing = False
    # param.num_workers = 32
    
    runner.sets_to_run = ["lat_lon"]
    runner.run_diags([param])

Traceback:

2023-08-24 10:36:27,981 [INFO]: e3sm_diags_driver.py(_save_env_yml:59) >> Saved environment yml file to: [/global/cfs/cdirs/e3sm/www/vo13/examples/ex1_modTS_vs_modTS_3years/prov/environment.yml](https://vscode-remote+ssh-002dremote-002bperlmutter.vscode-resource.vscode-cdn.net/global/cfs/cdirs/e3sm/www/vo13/examples/ex1_modTS_vs_modTS_3years/prov/environment.yml)
2023-08-24 10:36:27,983 [INFO]: e3sm_diags_driver.py(_save_parameter_files:70) >> Saved command used to: [/global/cfs/cdirs/e3sm/www/vo13/examples/ex1_modTS_vs_modTS_3years/prov/cmd_used.txt](https://vscode-remote+ssh-002dremote-002bperlmutter.vscode-resource.vscode-cdn.net/global/cfs/cdirs/e3sm/www/vo13/examples/ex1_modTS_vs_modTS_3years/prov/cmd_used.txt)
2023-08-24 10:36:27,985 [INFO]: e3sm_diags_driver.py(_save_python_script:134) >> Saved Python script to: [/global/cfs/cdirs/e3sm/www/vo13/examples/ex1_modTS_vs_modTS_3years/prov/ipykernel_launcher.py](https://vscode-remote+ssh-002dremote-002bperlmutter.vscode-resource.vscode-cdn.net/global/cfs/cdirs/e3sm/www/vo13/examples/ex1_modTS_vs_modTS_3years/prov/ipykernel_launcher.py)
2023-08-24 10:36:28,350 [INFO]: lat_lon_driver.py(run_diag:146) >> Variable: U
2023-08-24 10:36:52,883 [INFO]: lat_lon_driver.py(run_diag:167) >> Selected pressure level: [850.0]
2023-08-24 10:37:14,668 [ERROR]: e3sm_diags_driver.py(run_diag:296) >> Error in e3sm_diags.driver.lat_lon_driver
Traceback (most recent call last):
  File "/global/u2/v/vo13/E3SM-Project/e3sm_diags/e3sm_diags/e3sm_diags_driver.py", line 293, in run_diag
    single_result = module.run_diag(parameter)
  File "/global/u2/v/vo13/E3SM-Project/e3sm_diags/e3sm_diags/driver/lat_lon_driver.py", line 169, in run_diag
    mv1_p = utils.general.convert_to_pressure_levels(
  File "/global/u2/v/vo13/E3SM-Project/e3sm_diags/e3sm_diags/driver/utils/general.py", line 117, in convert_to_pressure_levels
    mv_p = hybrid_to_plevs(mv, hyam, hybm, ps, plevs)
  File "/global/u2/v/vo13/E3SM-Project/e3sm_diags/e3sm_diags/driver/utils/general.py", line 142, in hybrid_to_plevs
    var_p = cdutil.vertical.logLinearInterpolation(
  File "/global/homes/v/vo13/.conda/envs/e3sm_diags_dev/lib/python3.10/site-packages/cdutil-8.2.1-py3.11.egg/cdutil/vertical.py", line 278, in logLinearInterpolation
    t.setAxisList(ax)
  File "/global/homes/v/vo13/.conda/envs/e3sm_diags_dev/lib/python3.10/site-packages/cdms2/tvariable.py", line 527, in setAxisList
    self.setAxis(i, axislist[i])
  File "/global/homes/v/vo13/.conda/envs/e3sm_diags_dev/lib/python3.10/site-packages/cdms2/tvariable.py", line 517, in setAxis
    raise CDMSError(
cdms2.error.CDMSError: axis length 129 does not match corresponding dimension 36
tomvothecoder commented 10 months ago

I think what is happening is that the variable's climatology is being calculated with "ANN", which collapses the time dimension.

In hybrid_to_plevs(), there is no call to genutil.grower() to align the axes:

In line 158 of pressure_to_plevs(), there is a call to genutil.grower() to align the axes: https://github.com/E3SM-Project/e3sm_diags/blob/6aaa7c27dcf65393dd07e258ef2e7da21a660e87/e3sm_diags/driver/utils/general.py#L149-L171

The fix is to update hybrid_to_plevs() with a call to genutil.grower() in the main branch and xr.broadcast() (or something similar) in #677. I tested this and the hybrid_to_plevs() function now works.

UPDATE: Root cause and solution below in this comment.


Call stack:

  1. Get the variables: "ANN" climatology -- collapses time dimension

  2. Convert either hybrid or pressure to pressure levels

  3. Convert hybrid to pressure levels:

  4. Raises cdms2.error.CDMSError: axis length 129 does not match corresponding dimension 36

tomvothecoder commented 10 months ago

Additional Findings

Root Cause

Using time series files like in ex1.py results in this issue because:

  1. Calculating the climatology variable modifies the time axis (collapse or reduce number of points based on type of climo)
  2. The pressure variable uses the hybrid variables, which still use the original time axis
  3. The climatology variable and the pressure variable have misaligned axes, resulting in the cdms2 regridding error.

Possible Solutions

We need the axes for the climatology variable and the hybrid variables to align. This means we should calculate the climatologies for the hybrid variables beforehand.

All variables fed into convert_to_pressure_levels(), hybrid_to_plevs() and pressure_to_plevs() should not have a time axis.