esm-tools / esm_tools

Simple Infrastructure for Earth System Simulations
https://esm-tools.github.io/
GNU General Public License v2.0

crash when trying to find vcs control information #782

Closed seb-wahl closed 2 years ago

seb-wahl commented 2 years ago

Describe the bug

When running FOCI, I get the following error at the end of the first year:

================================================================================
::: Executing the step:  add_vcs_info    (step [13/16] of the job:  prepcompute)
================================================================================

===================================================================================================
::: Executing the step:  check_vcs_info_against_last_run    (step [14/16] of the job:  prepcompute)
===================================================================================================
Traceback (most recent call last):
  File "/gxfs_home/geomar/smomw235/.local/bin/esm_runscripts", line 33, in <module>
    sys.exit(load_entry_point('esm-tools', 'console_scripts', 'esm_runscripts')())
  File "/gxfs_home/geomar/smomw235/esm/esm_tools/src/esm_runscripts/cli.py", line 278, in main
    Setup()
  File "/gxfs_home/geomar/smomw235/esm/esm_tools/src/esm_runscripts/sim_objects.py", line 90, in __call__
    resubmit.maybe_resubmit(self.config)
  File "/gxfs_home/geomar/smomw235/esm/esm_tools/src/esm_runscripts/resubmit.py", line 148, in maybe_resubmit
    nextrun = resubmit_recursively(config, jobtype=jobtype)
  File "/gxfs_home/geomar/smomw235/esm/esm_tools/src/esm_runscripts/resubmit.py", line 189, in resubmit_recursively
    resubmit_SimulationSetup(config, cluster)
  File "/gxfs_home/geomar/smomw235/esm/esm_tools/src/esm_runscripts/resubmit.py", line 66, in resubmit_SimulationSetup
    config["general"]["experiment_over"] = cluster_obj(kill_after_submit=False)
  File "/gxfs_home/geomar/smomw235/esm/esm_tools/src/esm_runscripts/sim_objects.py", line 90, in __call__
    resubmit.maybe_resubmit(self.config)
  File "/gxfs_home/geomar/smomw235/esm/esm_tools/src/esm_runscripts/resubmit.py", line 164, in maybe_resubmit
    nextrun = resubmit_recursively(
  File "/gxfs_home/geomar/smomw235/esm/esm_tools/src/esm_runscripts/resubmit.py", line 189, in resubmit_recursively
    resubmit_SimulationSetup(config, cluster)
  File "/gxfs_home/geomar/smomw235/esm/esm_tools/src/esm_runscripts/resubmit.py", line 66, in resubmit_SimulationSetup
    config["general"]["experiment_over"] = cluster_obj(kill_after_submit=False)
  File "/gxfs_home/geomar/smomw235/esm/esm_tools/src/esm_runscripts/sim_objects.py", line 69, in __call__
    self.prepcompute()
  File "/gxfs_home/geomar/smomw235/esm/esm_tools/src/esm_runscripts/sim_objects.py", line 161, in prepcompute
    self.config = prepcompute.run_job(self.config)
  File "/gxfs_home/geomar/smomw235/esm/esm_tools/src/esm_runscripts/prepcompute.py", line 32, in run_job
    config = evaluate(config, "prepcompute", "prepcompute_recipe")
  File "/gxfs_home/geomar/smomw235/esm/esm_tools/src/esm_runscripts/helpers.py", line 69, in evaluate
    config = esm_plugin_manager.work_through_recipe(
  File "/gxfs_home/geomar/smomw235/esm/esm_tools/src/esm_plugin_manager/esm_plugin_manager.py", line 141, in work_through_recipe
    config = getattr(submodule, workitem)(config)
  File "/gxfs_home/geomar/smomw235/esm/esm_tools/src/esm_runscripts/prepare.py", line 755, in check_vcs_info_against_last_run
    with open(last_exp_vcs_info_file, "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: "{'NONE_YET': {}}/FOCI3.0-SW004_vcs_info.yaml"

To Reproduce

esm_master install-foci-default # if you haven't done this already
cd runscripts/foci/
esm_runscripts -e quicktest foci-initial-piCtl_daily_restart_lowcpu.yaml

Expected behavior

Well, no crash.

System (please complete the following information):

Additional context

I think this commit broke it:

commit 3853721ecd110d955722ea2e2332c9d606fb487f
Author: Paul Gierz <pgierz@awi.de>
Date:   Wed Jul 13 13:09:23 2022 +0200

    fix: ensure detached head mode works for awicm3

commit d862c2645f517c74f4f79ab2e3326a569a96366c
Author: mandresm <miguel.andres-martinez@awi.de>
Date:   Wed Jul 13 10:24:04 2022 +0200

    commenting this time all vcs steps

because after this commit the vcs steps are activated again in configs/esm_software/esm_runscripts/esm_runscripts.yaml. To me it looks like the vcs yaml handling has a bug.
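The path in the error message suggests that a placeholder dict was formatted into the file name instead of a directory string. Below is a minimal sketch of how such a path can arise, and of a guard that would skip the comparison when there is no previous run. The names `vcs_info_path`, `last_run_dir` and `expid` are hypothetical and only for illustration; this is not the actual esm_runscripts code.

```python
import os


def vcs_info_path(last_run_dir, expid):
    """Return the previous run's vcs info file path, or None if there is no previous run.

    Hypothetical helper: the name and arguments are assumptions for illustration.
    """
    # Anything that is not a directory string (e.g. a placeholder dict)
    # is treated as "no previous run", so the caller can skip the check.
    if not isinstance(last_run_dir, str):
        return None
    return os.path.join(last_run_dir, f"{expid}_vcs_info.yaml")


# Placeholder value as it appears in the error message above
placeholder = {"NONE_YET": {}}

# Formatting the dict straight into a string reproduces the bogus path
naive = f"{placeholder}/FOCI3.0-SW004_vcs_info.yaml"
print(naive)

# With the guard, the first run yields None instead of a bogus path
print(vcs_info_path(placeholder, "FOCI3.0-SW004"))
```

A caller could then skip opening the file whenever the helper returns `None`, which would avoid the `FileNotFoundError` when no earlier run exists.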

Workaround

Comment out the lines

    - "add_vcs_info"
    - "check_vcs_info_against_last_run"

in "configs/esm_software/esm_runscripts/esm_runscripts.yaml".

fernandadialzira commented 2 years ago

I also commented out these lines (I opened an issue/discussion on the same topic, #778), but it still complains about the vcs files and the simulations crash.

pgierz commented 2 years ago

Hi @seb-wahl, I'm on support today, plus it's my feature that seems to be causing all the headaches :-( Sorry for that.

I don't have access to /gxfs_home/ (probably HLRN?), so I need a copy of your runscript to understand what is happening. Two things to help me clarify: