E3SM-Project / e3sm_to_cmip

Tools to CMORize E3SM output
https://e3sm-to-cmip.readthedocs.io/en/latest/
MIT License
[Bug]: e2c info mode fails on table day variables #248

Closed TonyB9000 closed 5 months ago

TonyB9000 commented 8 months ago

What happened?

Attempting to run "e3sm_to_cmip --info" for table "day" generates the error "cannot import name 'get_levgrnd_bnds' from 'e3sm_to_cmip.util'".

EDIT: Same error for atmos variables - on v2.1 datasets.

Environment: Datasm (conda-env/prod.yml) with latest e2c "pip install" (version 1.11.2rc2).

What did you expect to happen? Are there any possible answers you came across?

The process should return a temp-named, YAML-formatted file with info for the selected variable.

Minimal Complete Verifiable Example (MCVE)

The script:

/p/user_pub/e3sm/bartoletti1/Operations/5_DatasetGeneration/Ops1/run_test_script.sh

contains the single command line:

e3sm_to_cmip --info --map none -i /p/user_pub/e3sm/warehouse/E3SM/2_1/1pctCO2/LR/atmos/native/model-output/day/ens1/v0 -o <any_dir> -u /p/user_pub/e3sm/bartoletti1/Operations/5_DatasetGeneration/Ops1/slurm_scripts/1pctCO2_r1i1p1f1.json --freq day -v huss -t /p/user_pub/e3sm/staging/resource/cmor/cmip6-cmor-tables/Tables/ --info-out <any_temp_file> --realm atm

Running it will reproduce the error, given the environmental constraints.

Relevant log output

Traceback (most recent call last):
  File "/home/bartoletti1/mambaforge/envs/dsm_prod_local_e2c_0304/bin/e3sm_to_cmip", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/bartoletti1/mambaforge/envs/dsm_prod_local_e2c_0304/lib/python3.12/site-packages/e3sm_to_cmip/__main__.py", line 964, in main
    app = E3SMtoCMIP(args)
          ^^^^^^^^^^^^^^^^
  File "/home/bartoletti1/mambaforge/envs/dsm_prod_local_e2c_0304/lib/python3.12/site-packages/e3sm_to_cmip/__main__.py", line 154, in __init__
    self.handlers = self._get_handlers()
                    ^^^^^^^^^^^^^^^^^^^^
  File "/home/bartoletti1/mambaforge/envs/dsm_prod_local_e2c_0304/lib/python3.12/site-packages/e3sm_to_cmip/__main__.py", line 204, in _get_handlers
    handlers = load_all_handlers(self.realm, self.var_list)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bartoletti1/mambaforge/envs/dsm_prod_local_e2c_0304/lib/python3.12/site-packages/e3sm_to_cmip/cmor_handlers/utils.py", line 58, in load_all_handlers
    handlers_by_var: Dict[str, List[Dict[str, Any]]] = _get_handlers_by_var()
                                                       ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bartoletti1/mambaforge/envs/dsm_prod_local_e2c_0304/lib/python3.12/site-packages/e3sm_to_cmip/cmor_handlers/utils.py", line 269, in _get_handlers_by_var
    handlers_from_modules = _get_handlers_from_modules(LEGACY_HANDLER_DIR_PATH)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bartoletti1/mambaforge/envs/dsm_prod_local_e2c_0304/lib/python3.12/site-packages/e3sm_to_cmip/cmor_handlers/utils.py", line 365, in _get_handlers_from_modules
    module = _get_handler_module(var, filepath)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bartoletti1/mambaforge/envs/dsm_prod_local_e2c_0304/lib/python3.12/site-packages/e3sm_to_cmip/cmor_handlers/utils.py", line 404, in _get_handler_module
    module = SourceFileLoader(module_name, module_path).load_module()
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap_external>", line 649, in _check_name_wrapper
  File "<frozen importlib._bootstrap_external>", line 1176, in load_module
  File "<frozen importlib._bootstrap_external>", line 1000, in load_module
  File "<frozen importlib._bootstrap>", line 537, in _load_module_shim
  File "<frozen importlib._bootstrap>", line 966, in _load
  File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 995, in exec_module
  File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
  File "/home/bartoletti1/mambaforge/envs/dsm_prod_local_e2c_0304/lib/python3.12/site-packages/e3sm_to_cmip/cmor_handlers/vars/clcalipso.py", line 8, in <module>
    from e3sm_to_cmip.lib import handle_variables
  File "/home/bartoletti1/mambaforge/envs/dsm_prod_local_e2c_0304/lib/python3.12/site-packages/e3sm_to_cmip/lib.py", line 23, in <module>
    from e3sm_to_cmip.util import (
ImportError: cannot import name 'get_levgrnd_bnds' from 'e3sm_to_cmip.util' (/home/bartoletti1/mambaforge/envs/dsm_prod_local_e2c_0304/lib/python3.12/site-packages/e3sm_to_cmip/util.py)

Anything else we need to know?

No response

Environment

    active environment : dsm_prod_local_e2c_0304
   active env location : /home/bartoletti1/mambaforge/envs/dsm_prod_local_e2c_0304
           shell level : 2
      user config file : /home/bartoletti1/.condarc
populated config files : /home/bartoletti1/mambaforge/.condarc
         conda version : 22.9.0
   conda-build version : not installed
        python version : 3.10.6.final.0
      virtual packages : linux=3.10.0=0
                         glibc=2.17=0
                         unix=0=0
                         archspec=1=x86_64
      base environment : /home/bartoletti1/mambaforge (writable)
     conda av data dir : /home/bartoletti1/mambaforge/etc/conda
 conda av metadata url : None
          channel URLs : https://conda.anaconda.org/conda-forge/linux-64
                         https://conda.anaconda.org/conda-forge/noarch
         package cache : /home/bartoletti1/mambaforge/pkgs
                         /home/bartoletti1/.conda/pkgs
      envs directories : /home/bartoletti1/mambaforge/envs
                         /home/bartoletti1/.conda/envs
              platform : linux-64
            user-agent : conda/22.9.0 requests/2.28.1 CPython/3.10.6 Linux/3.10.0-1160.108.1.el7.x86_64 rhel/7.9 glibc/2.17
               UID:GID : 61843:4061
            netrc file : None
          offline mode : False

tomvothecoder commented 8 months ago

It looks like an old version of e3sm_to_cmip is still installed and is trying to reference modules (e.g., clcalipso.py and lib.py) that were removed in e3sm_to_cmip>=1.11.0. If you install a stable version with conda and then overwrite it with pip install ., it might cause issues (not 100% sure).

  File "/home/bartoletti1/mambaforge/envs/dsm_prod_local_e2c_0304/lib/python3.12/site-packages/e3sm_to_cmip/cmor_handlers/vars/clcalipso.py", line 8, in <module>
    from e3sm_to_cmip.lib import handle_variables
  File "/home/bartoletti1/mambaforge/envs/dsm_prod_local_e2c_0304/lib/python3.12/site-packages/e3sm_to_cmip/lib.py", line 23, in <module>
    from e3sm_to_cmip.util import (
ImportError: cannot import name 'get_levgrnd_bnds' from 'e3sm_to_cmip.util' (/home/bartoletti1/mambaforge/envs/dsm_prod_local_e2c_0304/lib/python3.12/site-packages/e3sm_to_cmip/util.py)
  1. Completely remove e3sm_to_cmip from your datasm env
  2. Use conda-forge and the e3sm_to_cmip_dev channel to install e3sm_to_cmip=1.11.2.rc2
    • mamba install -c conda-forge/label/e3sm_to_cmip_dev e3sm_to_cmip=1.11.2.rc2
  3. Activate environment and open python
  4. Run import e3sm_to_cmip and then e3sm_to_cmip.__version__.

TonyB9000 commented 8 months ago

Thanks Tom. I am following your instructions, beginning with a fresh (datasm) environment, then removing the e3sm_to_cmip before mamba-installing the version-specific e2c. When you say (3) "open python", I assume you mean an interactive python shell. If I then "import e3sm_to_cmip" and "e3sm_to_cmip.__version__", is this just to test the version? These won't persist after I exit the python shell.

Side Question on Conda Wisdom: I've learned one can "clone" a conda environment, by capturing its content with conda env export > spec.yml and then conda env create -n newname -f spec.yml. Ordinarily, when testing changes to datasm, I simply (re) "pip install" the changed datasm app, over and over. Would it be wiser to "clone" the raw datasm (-f datasm/conda-env/prod.yml) prior to ANY "pip install " essentially starting from scratch for each code change being tested?

(Also, I want to test whether "cloning" a known-clean environment is faster, and avoids the "Solving environment:" that can take an hour ...)

TonyB9000 commented 8 months ago

BTW, I could not use "mamba install" (I may need to reinstall mamba; I keep getting the same "libmamba Could not set lock" error). And when I use conda install -c conda-forge/label/e3sm_to_cmip_dev e3sm_to_cmip=1.11.2.rc2 I get "not found":

Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.

PackagesNotFoundError: The following packages are not available from current channels:

  - e3sm_to_cmip=1.11.2.rc2

Current channels:

  - https://conda.anaconda.org/conda-forge/label/e3sm_to_cmip_dev/linux-64
  - https://conda.anaconda.org/conda-forge/label/e3sm_to_cmip_dev/noarch
  - https://conda.anaconda.org/conda-forge/linux-64
  - https://conda.anaconda.org/conda-forge/noarch

Is install ... <package>=<version> strictly a "mamba" thing?

tomvothecoder commented 8 months ago

When you say (3) "open python", I assume you mean an interactive python shell.

Correct, run python in a terminal.

If I then "import e3sm_to_cmip" and "e3sm_to_cmip.__version__", is this just to test the version? These won't persist after I exit the python

Correct, and also to check if the import error no longer happens with the e3sm_to_cmip=1.11.2rc2 installed.
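
That check can also be scripted. The following is only a minimal sketch, assuming the diagnosis above is right that leftover files from an older install are the cause. The helper names find_stale_files and check_install are hypothetical and not part of e3sm_to_cmip; the two legacy paths are the ones shown in the traceback.

```python
import importlib.util
from importlib import metadata
from pathlib import Path

# Files named in the traceback that, per the diagnosis in this thread, were
# removed in e3sm_to_cmip>=1.11.0. Finding them on disk hints at a stale install.
LEGACY_RELATIVE_PATHS = ("lib.py", "cmor_handlers/vars/clcalipso.py")


def find_stale_files(package_dir, relative_paths=LEGACY_RELATIVE_PATHS):
    """Return the legacy files that still exist under package_dir."""
    root = Path(package_dir)
    return [str(root / rel) for rel in relative_paths if (root / rel).is_file()]


def check_install(package="e3sm_to_cmip", expected_version="1.11.2rc2"):
    """Report the installed version and any leftover legacy modules."""
    # find_spec locates a top-level package without importing it.
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return {"installed": False, "version": None,
                "version_ok": False, "stale_files": []}
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # Package files exist but no distribution metadata was found.
        version = None
    package_dir = list(spec.submodule_search_locations)[0]
    return {
        "installed": True,
        "version": version,
        "version_ok": version == expected_version,
        "stale_files": find_stale_files(package_dir),
    }


if __name__ == "__main__":
    print(check_install())
```

Run it inside the activated environment. A non-empty stale_files list would suggest removing the package completely and reinstalling before retrying the --info command.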

Side Question on Conda Wisdom: I've learned one can "clone" a conda environment, by capturing its content with conda env export > spec.yml and then conda env create -n newname -f spec.yml.

I don't recommend using conda env export > spec.yml because it captures ALL dependencies (including the millions of sub-dependencies) with exact pinned versions. The datasm/conda-env/prod.yml captures only the direct dependencies, and we let conda/mamba decide which versions of dependencies (and sub-dependencies) to install. We usually define some loose constraints in .yml files (e.g., python >= 3.10).

Ordinarily, when testing changes to datasm, I simply (re) "pip install" the changed datasm app, over and over. Would it be wiser to "clone" the raw datasm (-f datasm/conda-env/prod.yml) prior to ANY "pip install " essentially starting from scratch for each code change being tested?

(Also, I want to test whether "cloning" a known-clean environment is faster, and avoids the "Solving environment:" that can take an hour ...)

If you are working on datasm then pip install . is recommended to get the latest changes into your prod.yml environment. No need to recreate the environment every time to try out development changes or run production changes.

conda install -c conda-forge/label/e3sm_to_cmip_dev e3sm_to_cmip=1.11.2.rc2 I get "not found":

There is a typo here: e3sm_to_cmip=1.11.2.rc2. It should be e3sm_to_cmip=1.11.2rc2.

Try conda install -c conda-forge/label/e3sm_to_cmip_dev e3sm_to_cmip=1.11.2rc2 (or with mamba if you get it working).

Is install ... <package>=<version> strictly a "mamba" thing?

conda and mamba are interoperable and the commands are aligned.

TonyB9000 commented 8 months ago

"There is a typo here: e3sm_to_cmip=1.11.2.rc2. It should be e3sm_to_cmip=1.11.2rc2."

OK! That I can deal with :)

BTW, I was able (finally) to overcome the "libmamba/cannot_lock_resource" issue by doing a conda update of mamba. Now I can "mamba env create", which is WAY faster than "conda env create".

TonyB9000 commented 8 months ago

Confirmation:

>>> import e3sm_to_cmip
>>> e3sm_to_cmip.__version__
'1.11.2rc2'

Now, on to full dataset generation testing!

TonyB9000 commented 5 months ago

Having long since updated to v1.11.2rc2 and successfully generated all v2_1 CMIP6 "day" datasets, I will deem this issue resolved and will close.