Open penguian opened 8 months ago
I then tried Martin Dix's fix as per https://github.com/coecms/access-esm/commit/0f769ae4338005f8ed3c6ce6478e004842d4a598
[pcl851@gadi-login-01 access-esm]$ cp -a ../../MartinDix/access-esm/config.yaml .
[pcl851@gadi-login-01 access-esm]$ payu sweep
laboratory path: /scratch/tm70/pcl851/access-esm
binary path: /scratch/tm70/pcl851/access-esm/bin
input path: /scratch/tm70/pcl851/access-esm/input
work path: /scratch/tm70/pcl851/access-esm/work
archive path: /scratch/tm70/pcl851/access-esm/archive
Moving log pre-industrial.e108246795
Moving log pre-industrial.o108246795
Removing work path /scratch/tm70/pcl851/access-esm/work/access-esm
Removing symlink /g/data/tm70/pcl851/src/coecms/access-esm/work
[pcl851@gadi-login-01 access-esm]$ payu setup
laboratory path: /scratch/tm70/pcl851/access-esm
binary path: /scratch/tm70/pcl851/access-esm/bin
input path: /scratch/tm70/pcl851/access-esm/input
work path: /scratch/tm70/pcl851/access-esm/work
archive path: /scratch/tm70/pcl851/access-esm/archive
Loading input manifest: manifests/input.yaml
Loading restart manifest: manifests/restart.yaml
Loading exe manifest: manifests/exe.yaml
Setting up atmosphere
Setting up ocean
Setting up ice
Setting up coupler
Checking exe and input manifests
Updating full hashes for 3 files in manifests/exe.yaml
File no longer in input directory: work/atmosphere/INPUT/pre-industrial.astart removing from manifest
Creating restart manifest
Updating full hashes for 51 files in manifests/restart.yaml
Writing manifests/input.yaml
Writing manifests/restart.yaml
Writing manifests/exe.yaml
[...]
[pcl851@gadi-login-01 access-esm]$ git diff
diff --git a/config.yaml b/config.yaml
index 85b1fb8..6d3935f 100644
--- a/config.yaml
+++ b/config.yaml
@@ -13,7 +13,6 @@ submodels:
exe: /g/data/access/payu/access-esm/bin/coe/um7.3x
input:
- /g/data/access/payu/access-esm/input/pre-industrial/atmosphere
- - /g/data/access/payu/access-esm/input/pre-industrial/start_dump
- name: ocean
model: mom
@@ -41,7 +40,7 @@ collate:
restart: true
mem: 4GB
-restart: /g/data/access/payu/access-esm/restart/pre-industrial
+restart: /g/data/vk83/experiments/inputs/access-esm1p5/pre-industrial/restart/
calendar:
start:
diff --git a/manifests/input.yaml b/manifests/input.yaml
index aeb3cc5..83e15b7 100644
--- a/manifests/input.yaml
+++ b/manifests/input.yaml
[...]
diff --git a/manifests/restart.yaml b/manifests/restart.yaml
index 01dc344..bd9bc55 100644
--- a/manifests/restart.yaml
+++ b/manifests/restart.yaml
[...]
[pcl851@gadi-login-01 access-esm]$ diff -rqb ../../MartinDix/access-esm .|grep -v ".git"|grep differ
Files ../../MartinDix/access-esm/atmosphere/__pycache__/um_env.cpython-310.pyc and ./atmosphere/__pycache__/um_env.cpython-310.pyc differ
[pcl851@gadi-login-01 access-esm]$ payu run -f
Loading input manifest: manifests/input.yaml
Loading restart manifest: manifests/restart.yaml
Loading exe manifest: manifests/exe.yaml
payu: Found modules in /opt/Modules/v4.3.0
qsub -q normal -P tm70 -l walltime=11400 -l ncpus=384 -l mem=1536GB -N pre-industrial -l wd -j n -v PAYU_PATH=/g/data/hh5/public/apps/miniconda3/envs/analysis3-23.07/bin,PAYU_FORCE=True,MODULESHOME=/opt/Modules/v4.3.0,MODULES_CMD=/opt/Modules/v4.3.0/libexec/modulecmd.tcl,MODULEPATH=/g/data/access/modules:/g/data/hh5/public/modules:/etc/scl/modulefiles:/apps/Modules/restricted-modulefiles/matlab_anu:/opt/Modules/modulefiles:/opt/Modules/v4.3.0/modulefiles:/apps/Modules/modulefiles -W umask=027 -l storage=gdata/access+gdata/hh5+gdata/tm70+gdata/vk83 -- /g/data/hh5/public/apps/miniconda3/envs/analysis3-23.07/bin/python3.10 /g/data/hh5/public/apps/miniconda3/envs/analysis3-23.07/bin/payu-run
108250574.gadi-pbs
This job runs to completion.
See #11
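The only difference reported by the `diff -rqb` comparison above is a compiled bytecode file under `__pycache__`, which is not meaningful for the experiment configuration. A minimal sketch of a tighter comparison follows, assuming GNU `diff`; the two temporary directories are hypothetical stand-ins for the two checkouts. It uses `-x` to skip `.git` and `__pycache__` by name rather than filtering with `grep -v ".git"`, where the unescaped `.` matches any character.

```shell
# Hypothetical stand-ins for the two checkouts compared above.
ref=$(mktemp -d)
mine=$(mktemp -d)
mkdir -p "$ref/atmosphere/__pycache__" "$mine/atmosphere/__pycache__"
echo "restart: same" > "$ref/config.yaml"
echo "restart: same" > "$mine/config.yaml"
echo "bytecode A" > "$ref/atmosphere/__pycache__/um_env.cpython-310.pyc"
echo "bytecode B" > "$mine/atmosphere/__pycache__/um_env.cpython-310.pyc"

# -x skips any file or directory whose base name matches the pattern,
# so .git and __pycache__ are never descended into.
# diff exits non-zero when files differ, hence the "|| true".
differences=$(diff -rqb -x .git -x __pycache__ "$ref" "$mine" || true)
echo "differences: ${differences:-none}"
```

With the bytecode directory excluded, an empty result means the two trees match in every tracked file.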
After cloning this repository to /g/data/tm70/pcl851/src/coecms/access-esm, I ran the following commands, with the following output:
The submitted job 108246795 fails. Specifically, access.err indicates that a segfault occurred on 12 of the MPI ranks: