esm-tools / esm_tools

Simple Infrastructure for Earth System Simulations
https://esm-tools.github.io/
GNU General Public License v2.0

PISM couple in job fails on Ollie #758

Closed ackerlar closed 1 year ago

ackerlar commented 2 years ago

Describe the bug

The "couple in" job for PISM fails on Ollie with this error:

Output written by slurm:
Module for cmake version 3.22.0 loaded
Module for udunits version 2.2.26 loaded
Module for grib_api version 1.28.0 loaded.
Module for Intel Compiler and Libraries, ifort, icc, icpc; mkl, ipp, tbb,... version 2018.5.274 loaded.
Module for hdf5 version 1.10.2_gnu loaded
Module for centoslibs version 7.5 loaded.
Module for cdo version 2.0.5 loaded
Module for nco version 4.7.7 loaded
Please add >module load centoslibs<, if you are on a compute node to get all libraries.
Module for netcdf version 4.6.2_intel_18.0.5 loaded
Module for automake version 1.16.2 loaded
Module for Python 3.7 from Intel Parallel Studio 2020.2.902 loaded
Module for git, git version 2.13.1 loaded.
Currently Loaded Modulefiles:
 1) cmake/3.22.0     3) gribapi/1.28.0                   5) hdf5/1.10.2_gnu           7) cdo/2.0.5(default)   9) netcdf/4.6.2_intel  11) python3/3.7.7_intel2020u2(default)  
 2) udunits/2.2.26   4) intel.compiler/18.0.5(default)   6) centoslibs/7.5(default)   8) nco/4.7.7(default)  10) automake/1.16.2     12) git/2.13.1                          
Module for impi,  version 2018.4.274 loaded.
ERROR: Directory '~dbarbi/modulefiles' not found
Module for Intel Compiler and Libraries, ifort, icc, icpc; mkl, ipp, tbb,... version 2017.1.132 loaded.
Module for impi,  version 2017 loaded.
Module for Python 2.7 from Intel Parallel Studio 2017 loaded
For osgeo, please add: module load gdal/2.1.1
Module for pism_externals version 0.7.x_intel_impi loaded
Module for proj version 5.1.0 loaded
Fatal Python error: initfsencoding: Unable to get the locale encoding
ModuleNotFoundError: No module named 'encodings'

Current thread 0x00002b3bec91c540 (most recent call first):
./115.25ka_3koffset_couple_in_506400101-506491231.sad: line 108: 189357 Aborted                 (core dumped) esm_runscripts awiesm_pism.yaml -e 115.25ka_3koffset -t observe_couple_in -p ${process} -s 506400101 -r 103 -v --open-run
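
The log above announces several packages twice with different versions (Intel compiler 2018.5.274 then 2017.1.132, impi 2018.4.274 then 2017, Python 3.7 then 2.7). A `No module named 'encodings'` failure at interpreter startup is commonly seen when a Python 3 interpreter picks up settings that belong to a different Python installation, so the second round of module loads may be worth a look. Below is a minimal sketch that lists such repeated loads from a Slurm log. It assumes the `Module for <name> version <x> loaded` line format seen here; the function name `find_reloaded_modules` is hypothetical and not part of esm_tools.

```python
import re
from collections import defaultdict

# Matches e.g. "Module for impi,  version 2017 loaded."
VERSIONED = re.compile(
    r"^Module for (?P<name>.+?),?\s+version\s+(?P<version>\S+) loaded\.?$"
)
# Matches e.g. "Module for Python 2.7 from Intel Parallel Studio 2017 loaded"
PYTHON = re.compile(
    r"^Module for (?P<name>Python) (?P<version>\d+(?:\.\d+)*) from .* loaded\.?$"
)


def find_reloaded_modules(log_text):
    """Return {name: [versions]} for packages announced with more than one version."""
    seen = defaultdict(list)
    for line in log_text.splitlines():
        line = line.strip()
        match = VERSIONED.match(line) or PYTHON.match(line)
        if match is None:
            continue  # not a module announcement (e.g. the ERROR line)
        versions = seen[match.group("name")]
        if match.group("version") not in versions:
            versions.append(match.group("version"))
    return {name: versions for name, versions in seen.items() if len(versions) > 1}


# Excerpt copied from the Slurm output in this issue.
sample_log = """\
Module for cmake version 3.22.0 loaded
Module for Intel Compiler and Libraries, ifort, icc, icpc; mkl, ipp, tbb,... version 2018.5.274 loaded.
Module for Python 3.7 from Intel Parallel Studio 2020.2.902 loaded
Module for impi,  version 2018.4.274 loaded.
ERROR: Directory '~dbarbi/modulefiles' not found
Module for Intel Compiler and Libraries, ifort, icc, icpc; mkl, ipp, tbb,... version 2017.1.132 loaded.
Module for impi,  version 2017 loaded.
Module for Python 2.7 from Intel Parallel Studio 2017 loaded
"""

reloaded = find_reloaded_modules(sample_log)
for name, versions in reloaded.items():
    print(f"{name}: {' -> '.join(versions)}")
```

This only reports what the log says was loaded. It does not show which interpreter or environment variables `esm_runscripts` actually ended up with.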

I am not sure when "~dbarbi/modulefiles" stopped being available. Maybe the failure is related to this?
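
One quick way to test that suspicion is to check which module directories still exist. The sketch below assumes the directory is reached through the `MODULEPATH` search path of Environment Modules; the log does not show whether it is set there or through an explicit `module use` call. The helper name `missing_module_dirs` is hypothetical.

```python
import os


def missing_module_dirs(paths):
    """Return the entries that do not resolve to an existing directory."""
    missing = []
    for path in paths:
        # expanduser resolves "~user/..." when that user exists; otherwise
        # the path is returned unchanged and the isdir check fails.
        expanded = os.path.expanduser(path)
        if not os.path.isdir(expanded):
            missing.append(path)
    return missing


# MODULEPATH is the colon-separated search path used by Environment Modules.
search_path = [p for p in os.environ.get("MODULEPATH", "").split(os.pathsep) if p]

# The directory named in the error message is checked explicitly as well.
print(missing_module_dirs(search_path + ["~dbarbi/modulefiles"]))
```

Running this in the same environment as the job (for example from the `.sad` script) would show whether other entries are missing too.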

I have run this setup for several hundred model years without changing anything in esm_tools or the runscripts.

System (please complete the following information):

(base) [lackerma@ollie1:~]$ esm_versions check
+---------------------+-----------+---------------------------------------------------------+----------------------+----------------------------+
| package_name        | version   | file                                                    | branch               | tags                       |
|---------------------+-----------+---------------------------------------------------------+----------------------+----------------------------|
| esm_calendar        | 5.0.0     | /home/ollie/lackerma/.local/lib/python3.9/site-packages |                      |                            |
| esm_database        | 5.0.0     | /home/ollie/lackerma/.local/lib/python3.9/site-packages |                      |                            |
| esm_environment     | 5.1.3     | /home/ollie/lackerma/.local/lib/python3.9/site-packages |                      |                            |
| esm_master          | 5.1.6     | /home/ollie/lackerma/.local/lib/python3.9/site-packages |                      |                            |
| esm_motd            | 5.0.2     | /home/ollie/lackerma/.local/lib/python3.9/site-packages |                      |                            |
| esm_parser          | 5.1.12    | /home/ollie/lackerma/.local/lib/python3.9/site-packages |                      |                            |
| esm_plugin_manager  | 5.0.1     | /home/ollie/lackerma/.local/lib/python3.9/site-packages |                      |                            |
| esm_profile         | 5.0.0     | /home/ollie/lackerma/.local/lib/python3.9/site-packages |                      |                            |
| esm_rcfile          | 5.1.0     | /home/ollie/lackerma/.local/lib/python3.9/site-packages |                      |                            |
| esm_runscripts      | 5.1.37    | /home/ollie/lackerma/.local/lib/python3.9/site-packages |                      |                            |
| esm_tools           | 5.1.17    | /home/ollie/lackerma/SOFTWARE/esm-tools/esm_tools       | feat/awiesm-2.2_dual | v5.1.17-142-g7989d2b-dirty |
| esm_version_checker | 5.1.13    | /home/ollie/lackerma/.local/lib/python3.9/site-packages |                      |                            |
+---------------------+-----------+---------------------------------------------------------+----------------------+----------------------------+

Here is the path to the experiment: /work/ollie/lackerma/PalModII/experiments/115.25ka_3koffset

EDIT: PISM standalone runs, but it stops after each year for no reason that is clear to me: /work/ollie/lackerma/PalModII/experiments/pism_standalone_125ka/log/pism_standalone_125ka_pism_compute_422000101-422091231.log

nwieters commented 1 year ago

@ackerlar is this still an issue? Have you tried to run such an experiment on another machine?

ackerlar commented 1 year ago

It can be closed.