NOAA-GFDL / MDTF-diagnostics

Analysis framework and collection of process-oriented diagnostics for weather and climate simulations
https://mdtf-diagnostics.readthedocs.io/en/main/

Unable to run multicase example #708

Open csyhuang opened 18 hours ago

csyhuang commented 18 hours ago

I am trying to run the multicase example following Sections 2.3-2.6 of the documentation page, but I ran into the error below.

I had to make several adjustments to get the code to run (up to the point of the error). I think it would be good to include them in the documentation so that new users can follow along more easily, so I have listed them in the section Steps To Reproduce.

Please let me know if I missed anything. Thanks!


Bug Severity

Describe the bug

After making the path adjustments, I ran the multicase example with the command

./mdtf -f diagnostics/example_multicase/multirun_config_template.jsonc 

This failed with the error AttributeError: 'DataArray' object has no attribute 'variables'. Did you mean: 'variable'? The full output is as follows:

Preprocessing data for example_multicase
Querying /home/clare/Dropbox/GitHub/mdtf/MDTF-diagnostics/diagnostics/example_multicase/esm_catalog_CMIP_synthetic_r1i1p1f1_gr1.json for variable tas for case CMIP_Synthetic_r1i1p1f1_gr1_19800101-19841231.
WARNING: /home/clare/miniconda3/envs/_MDTF_python3_base/lib/python3.12/site-packages/intake_esm/_search.py:50: UserWarning: This pattern is interpreted as a regular expression, and has match groups. To actually get the groups, use str.extract.
  mask = df[column].str.contains(value, regex=True, case=True, flags=0)

Variable <tas> data ends at hour 0
Querying /home/clare/Dropbox/GitHub/mdtf/MDTF-diagnostics/diagnostics/example_multicase/esm_catalog_CMIP_synthetic_r1i1p1f1_gr1.json for variable tas for case CMIP_Synthetic_r1i1p1f1_gr1_19850101-19891231.
WARNING: /home/clare/miniconda3/envs/_MDTF_python3_base/lib/python3.12/site-packages/intake_esm/_search.py:50: UserWarning: This pattern is interpreted as a regular expression, and has match groups. To actually get the groups, use str.extract.
  mask = df[column].str.contains(value, regex=True, case=True, flags=0)

Variable <tas> data ends at hour 0
Units for 'time' on var 'tas' found in dataset; setting to 'days since 1980-01-01'.
No calendar for 'time' found in dataset; setting to 'sentinel.NotSet'.
Converted units on <#None:example_multicase.tas>.
WARNING: /home/clare/Dropbox/GitHub/mdtf/MDTF-diagnostics/user_scripts/example_pp_script.py:98: FutureWarning: the `pandas.MultiIndex` object(s) passed as 'time' coordinate(s) or data variable(s) will no longer be implicitly promoted and wrapped into multiple indexed coordinates in the future (i.e., one coordinate for each multi-index level + one dimension coordinate). If you want to keep this behavior, you need to first wrap it explicitly using `mindex_coords = xarray.Coordinates.from_pandas_multiindex(mindex_obj, 'dim')` and pass it as coordinates, e.g., `xarray.Dataset(coords=mindex_coords)`, `dataset.assign_coords(mindex_coords)` or `dataarray.assign_coords(mindex_coords)`.
  xr_dupe.coords['time'] = ind

Units for 'time' on var 'tas' found in dataset; setting to 'days since 1985-01-01'.
No calendar for 'time' found in dataset; setting to 'sentinel.NotSet'.
Converted units on <#None:example_multicase.tas>.
WARNING: /home/clare/Dropbox/GitHub/mdtf/MDTF-diagnostics/user_scripts/example_pp_script.py:98: FutureWarning: the `pandas.MultiIndex` object(s) passed as 'time' coordinate(s) or data variable(s) will no longer be implicitly promoted and wrapped into multiple indexed coordinates in the future (i.e., one coordinate for each multi-index level + one dimension coordinate). If you want to keep this behavior, you need to first wrap it explicitly using `mindex_coords = xarray.Coordinates.from_pandas_multiindex(mindex_obj, 'dim')` and pass it as coordinates, e.g., `xarray.Dataset(coords=mindex_coords)`, `dataset.assign_coords(mindex_coords)` or `dataarray.assign_coords(mindex_coords)`.
  xr_dupe.coords['time'] = ind

CRITICAL: **********************************************************************
Uncaught exception:
Traceback (most recent call last):
  File "/home/clare/Dropbox/GitHub/mdtf/MDTF-diagnostics/src/preprocessor.py", line 1350, in write_ds
    ds = self.clean_output_attrs(var, ds)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/clare/Dropbox/GitHub/mdtf/MDTF-diagnostics/src/preprocessor.py", line 1281, in clean_output_attrs
    for vv in ds.variables.values():
              ^^^^^^^^^^^^
  File "/home/clare/miniconda3/envs/_MDTF_python3_base/lib/python3.12/site-packages/xarray/core/common.py", line 280, in __getattr__
    raise AttributeError(
AttributeError: 'DataArray' object has no attribute 'variables'. Did you mean: 'variable'?

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/clare/Dropbox/GitHub/mdtf/MDTF-diagnostics/mdtf_framework.py", line 243, in <module>
    exit_code = main(prog_name='MDTF-diagnostics')
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/clare/miniconda3/envs/_MDTF_python3_base/lib/python3.12/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/clare/miniconda3/envs/_MDTF_python3_base/lib/python3.12/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/home/clare/miniconda3/envs/_MDTF_python3_base/lib/python3.12/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/clare/miniconda3/envs/_MDTF_python3_base/lib/python3.12/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/clare/miniconda3/envs/_MDTF_python3_base/lib/python3.12/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/clare/Dropbox/GitHub/mdtf/MDTF-diagnostics/mdtf_framework.py", line 201, in main
    data_pp.write_ds(cases, cat_subset, pod_runtime_reqs)
  File "/home/clare/Dropbox/GitHub/mdtf/MDTF-diagnostics/src/preprocessor.py", line 1353, in write_ds
    raise util.chain_exc(exc, (f"cleaning attributes to "
  File "/home/clare/Dropbox/GitHub/mdtf/MDTF-diagnostics/src/util/exceptions.py", line 58, in chain_exc
    raise new_exc_class(new_msg) from exc
src.util.exceptions.DataPreprocessEvent: Caught exception while cleaning attributes to write data for <#None:example_multicase.tas>: AttributeError("'DataArray' object has no attribute 'variables'").

When I check the dataset using ncdump -h ~/Dropbox/GitHub/mdtf/inputdata/mdtf_test_data/CMIP_Synthetic_r1i1p1f1_gr1_19850101-19891231/day/CMIP_Synthetic_r1i1p1f1_gr1_19850101-19891231.tas.day.nc, I can see that the time dimension has size 1825, not 0.
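
For reference, the AttributeError in the traceback can be reproduced outside the framework. The sketch below uses a small synthetic array (not the real test data) to show that an xarray DataArray exposes .variable but not .variables, while a Dataset does have .variables. The helper variable_names is hypothetical and is not part of MDTF. Why a DataArray reaches clean_output_attrs in the first place is not established in this thread, so this illustrates the type mismatch only and is not a proposed fix.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the preprocessed 'tas' variable (not the real test data).
da = xr.DataArray(
    np.zeros((3, 2, 2)),
    dims=("time", "lat", "lon"),
    name="tas",
)


def variable_names(obj):
    """Hypothetical helper: list variable names for a Dataset or a DataArray."""
    if isinstance(obj, xr.DataArray):
        # A DataArray has .variable (singular); only a Dataset has .variables,
        # so promote the array to a one-variable Dataset first.
        obj = obj.to_dataset(name=obj.name or "data")
    return sorted(obj.variables)


error_message = None
try:
    da.variables  # same attribute access that fails in the traceback
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)

print(variable_names(da))
```

Promoting with to_dataset is only one way to obtain an object that has .variables; whether that is appropriate inside the preprocessor depends on what the rest of write_ds expects.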

Steps To Reproduce

In addition to following the instructions in Sections 2.3-2.6 of the documentation, I made the following changes to get the code running:

  1. I created the synthetic data as instructed in Section 2.3 and they are stored in /home/clare/Dropbox/GitHub/mdtf/inputdata/mdtf_test_data/.

  2. In user_scripts/example_pp_script.py, I changed the config_file to:

config_file = "/home/clare/Dropbox/GitHub/mdtf/MDTF-diagnostics/templates/runtime_config.jsonc"

  3. I changed the paths in diagnostics/example_multicase/esm_catalog_CMIP_synthetic_r1i1p1f1_gr1.csv and diagnostics/example_multicase/esm_catalog_CMIP_synthetic_r1i1p1f1_gr1.csv to point to the dataset on my machine.

  4. In diagnostics/example_multicase/esm_catalog_CMIP_synthetic_r1i1p1f1_gr1.json, I changed "catalog_file" to point to the correct .csv path.

  5. I installed the latest environment from src/conda/env_python3_base.yml. Running mdtf afterwards gave me an error like ModuleNotFoundError: No module named 'cfunits', which I fixed by installing the missing packages:

python3 -m pip install cfunits
conda install conda-forge::udunits
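
A quick way to confirm that a module such as cfunits is visible in the active environment before launching mdtf is sketched below. It uses only the standard library; the function name missing_modules and the contents of the required list are illustrative assumptions, not something the framework provides.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# 'cfunits' comes from the ModuleNotFoundError above; extend the list as needed.
required = ["cfunits"]
missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules can be located.")
```

Note that find_spec only checks that the module can be located. It does not import it, so a package whose native dependency is absent could still fail at import time.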

These are the steps I've executed before running ./mdtf -f diagnostics/example_multicase/multirun_config_template.jsonc.
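
The catalog edits in the steps above (rewriting the data paths in the .csv and updating "catalog_file" in the .json) could be scripted roughly as follows. This is a sketch under assumptions: the function name repoint_catalog is hypothetical, the column holding file paths is assumed to be called "path", and "catalog_file" is written as a plain absolute path. Check the actual column name and the expected form of "catalog_file" in your own catalog files before using it.

```python
import csv
import json
from pathlib import Path


def repoint_catalog(csv_path, json_path, old_root, new_root, path_column="path"):
    """Rewrite data paths in a catalog CSV and point the catalog JSON at that CSV.

    Returns the number of rows whose path was changed.
    """
    csv_path = Path(csv_path)
    json_path = Path(json_path)

    # Read every row so the file can be rewritten in place afterwards.
    with csv_path.open(newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        rows = list(reader)

    changed = 0
    for row in rows:
        old_value = row[path_column]
        if old_value.startswith(old_root):
            # Replace only the leading directory; keep the rest of the path.
            row[path_column] = new_root + old_value[len(old_root):]
            changed += 1

    with csv_path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

    # Update the catalog description so it references the edited CSV.
    catalog = json.loads(json_path.read_text())
    catalog["catalog_file"] = str(csv_path.resolve())
    json_path.write_text(json.dumps(catalog, indent=2))
    return changed
```

It would be called with the two catalog files from the steps above, the directory prefix currently stored in the .csv as old_root, and the local data directory (here /home/clare/Dropbox/GitHub/mdtf/inputdata/mdtf_test_data) as new_root.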

Environment

I am on the main branch with all the existing commits pulled.

Executing cat /etc/os-release gives

NAME="Ubuntu"
VERSION="18.04.6 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.6 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic

Executing conda info gives

     active environment : _MDTF_python3_base
    active env location : /home/clare/miniconda3/envs/_MDTF_python3_base
            shell level : 2
       user config file : /home/clare/.condarc
 populated config files : 
          conda version : 4.10.3
    conda-build version : not installed
         python version : 3.7.6.final.0
       virtual packages : __cuda=12.0=0
                          __linux=4.15.0=0
                          __glibc=2.27=0
                          __unix=0=0
                          __archspec=1=x86_64
       base environment : /home/clare/miniconda3  (writable)
      conda av data dir : /home/clare/miniconda3/etc/conda
  conda av metadata url : None
           channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/r/linux-64
                          https://repo.anaconda.com/pkgs/r/noarch
          package cache : /home/clare/miniconda3/pkgs
                          /home/clare/.conda/pkgs
       envs directories : /home/clare/miniconda3/envs
                          /home/clare/.conda/envs
               platform : linux-64
             user-agent : conda/4.10.3 requests/2.28.1 CPython/3.7.6 Linux/4.15.0-214-generic ubuntu/18.04.6 glibc/2.27
                UID:GID : 1001:1001
             netrc file : None
           offline mode : False

csyhuang commented 15 hours ago

In addition, when I tried setting up MDTF environments using

./src/conda/conda_env_setup.sh --all --conda_root /home/clare/miniconda3

I encountered the following error:

info     libmamba ****************** Backtrace Start ******************
debug    libmamba Loading configuration
trace    libmamba Compute configurable 'create_base'
trace    libmamba Compute configurable 'no_env'
trace    libmamba Compute configurable 'no_rc'
trace    libmamba Compute configurable 'rc_files'
trace    libmamba Compute configurable 'root_prefix'
trace    libmamba Get RC files configuration from locations up to HomeDir
trace    libmamba Configuration not found at '/home/clare/.mambarc'
trace    libmamba Configuration not found at '/home/clare/.mamba/mambarc.d'
trace    libmamba Configuration not found at '/home/clare/.mamba/mambarc'
trace    libmamba Configuration not found at '/home/clare/.mamba/.mambarc'
trace    libmamba Configuration not found at '/home/clare/.config/mamba/mambarc.d'
trace    libmamba Configuration not found at '/home/clare/.config/mamba/mambarc'
trace    libmamba Configuration not found at '/home/clare/.config/mamba/.mambarc'
trace    libmamba Configuration not found at '/home/clare/.condarc'
trace    libmamba Configuration not found at '/home/clare/.conda/condarc.d'
trace    libmamba Configuration not found at '/home/clare/.conda/condarc'
trace    libmamba Configuration not found at '/home/clare/.conda/.condarc'
trace    libmamba Configuration not found at '/home/clare/.config/conda/condarc.d'
trace    libmamba Configuration not found at '/home/clare/.config/conda/condarc'
trace    libmamba Configuration not found at '/home/clare/.config/conda/.condarc'
trace    libmamba Configuration not found at '/home/clare/miniconda3/envs/_MDTF_install_temp/.mambarc'
trace    libmamba Configuration not found at '/home/clare/miniconda3/envs/_MDTF_install_temp/condarc.d'
trace    libmamba Configuration not found at '/home/clare/miniconda3/envs/_MDTF_install_temp/condarc'
trace    libmamba Configuration not found at '/home/clare/miniconda3/envs/_MDTF_install_temp/.condarc'
trace    libmamba Configuration not found at '/var/lib/conda/.mambarc'
trace    libmamba Configuration not found at '/var/lib/conda/condarc.d/'
trace    libmamba Configuration not found at '/var/lib/conda/condarc'
trace    libmamba Configuration not found at '/var/lib/conda/.condarc'
trace    libmamba Configuration not found at '/etc/conda/.mambarc'
trace    libmamba Configuration not found at '/etc/conda/condarc.d/'
trace    libmamba Configuration not found at '/etc/conda/condarc'
trace    libmamba Configuration not found at '/etc/conda/.condarc'
trace    libmamba Update configurable 'no_env'
trace    libmamba Compute configurable 'envs_dirs'
trace    libmamba Compute configurable 'file_specs'
error    libmamba YAML spec file '=/home/clare/Dropbox/GitHub/mdtf/MDTF-diagnostics/src/conda/env_base_micromamba.yml' not found
critical libmamba File not found. Aborting.
info     libmamba ****************** Backtrace End ********************

However, I checked that the file /home/clare/Dropbox/GitHub/mdtf/MDTF-diagnostics/src/conda/env_base_micromamba.yml does exist.

wrongkindofdoctor commented 41 minutes ago

@csyhuang For the miniconda issue, you need to specify --env_dir in the call (e.g., --env_dir /home/clare/miniconda3/envs). If the install still fails, try installing one environment at a time with the -e parameter instead of --all (e.g., -e python3_base). I'll try out the example_multicase on my machine and see if I can replicate the issue.
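
Putting the suggestion above together, the installer call can be assembled as in the sketch below. Only flags that appear in this thread are used (--all, -e, --conda_root, --env_dir). The helper build_env_setup_command is hypothetical, and the order in which the flags are passed is an assumption.

```python
import shlex


def build_env_setup_command(conda_root, env_dir, env_name=None):
    """Build the conda_env_setup.sh call; install one environment if env_name is given."""
    cmd = ["./src/conda/conda_env_setup.sh"]
    # -e <name> installs a single environment; --all installs every environment.
    cmd += ["-e", env_name] if env_name else ["--all"]
    cmd += ["--conda_root", conda_root, "--env_dir", env_dir]
    return cmd


cmd = build_env_setup_command(
    "/home/clare/miniconda3",
    "/home/clare/miniconda3/envs",
    env_name="python3_base",
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run from the repository root, or the printed line can be pasted into a shell.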