Closed josuemtzmo closed 4 years ago
Are those runs sharing the same config directory (i.e. the one containing config.yaml and your text inputs)? If so, then I would not expect this to work, and git failures would be the least of your problems.
If you need to do concurrent runs, then each run needs its own config directory with a unique name.
What @marshallward said ... but you can use git clone and git branch to keep all the configuration in a single repo, but clone to different directory names if that helps.
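As a rough sketch of the clone-per-run idea, something like the following could work. The function name clone_run, the run_ directory prefix and the branch naming are illustrative assumptions, not anything payu provides or requires.

```shell
#!/bin/bash
# Sketch only: one clone (and one branch) per run, so that no two runs
# share a control directory. clone_run and the run_ prefix are hypothetical.
clone_run() {
    local repo=$1      # path or URL of the configuration repository
    local run_id=$2    # zero-padded run index, e.g. 00007
    local dest="run_${run_id}"

    # Each run gets its own working copy of the configuration.
    git clone -q "$repo" "$dest" || return 1

    # A per-run branch keeps each run's history separate within its clone.
    git -C "$dest" checkout -q -b "run_${run_id}"
}
```

It would be called once per experiment, for example clone_run /path/to/config_repo "$(printf %05d "$tt")" inside the submission loop, where /path/to/config_repo is a placeholder for wherever the configuration repo lives.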
The runs share a similar config file:
ncpus: 1
mem: 50GB
walltime: 07:00:00
jobname: PR_LAV_{0}
project: x77
queue: express
qsub_flags: -lother=hyperthread -W umask=027 -l storage=gdata/v45+scratch/v45+gdata/hh5+gdata/x77
model: mitgcm
shortpath: /scratch/x77
exe: mitgcm_HR_satellite_P_release
input: global_particle_release/30d/30d_slice_chunk_{0}
collate: True
userscripts:
    archive: clear_archive.sh
However, each submission is executed in individual folders, with a unique config.yaml file (I'm replacing {0} with the corresponding experiment run) generated by my submission script:
#!/bin/bash
# Load modules and global variables.
module use /g/data3/hh5/public/modules
module load conda/analysis3-unstable
globalpath=$(pwd)
count=0
cc=0
n=25
particle_grid='flt_global_hex_032deg.bin'
# Input path.
input_path='/scratch/x77/jm5970/mitgcm/input/global_particle_release'
# Loop over every initialization of the particle release.
for tt in $(seq 0 100)
do
    # Zero-padded run index, e.g. 00007.
    run_id=$(printf %05d "$tt")
    # Create the folder for this experiment.
    folder="30d_LADV_part_release_${run_id}"
    mkdir "$folder"
    # Copy and modify the files needed to set up the experiment.
    cp ./input/* "$folder/."
    sed s-input_off-'.'-g input/data.off > "$folder/data.off"
    sed "s-flt_global_hex_10deg.bin-${particle_grid}-g" input/data.flt > "$folder/data.flt"
    sed "s-{0}-${run_id}-g" config_sed.yaml > "$folder/config.yaml"
    sed "s-{0}-30d_slice_chunk_${run_id}-g" input/clear_archive.sh > "$folder/clear_archive.sh"
    cd "$folder"
    ln -s "$input_path/${particle_grid}" "$input_path/30d/30d_slice_chunk_${run_id}/"
    # Run the experiment.
    payu run -i 0
    cd "$globalpath"
    count=$((count+1))
    # After each batch of submissions, sleep for 1 hour so the jobs can run without overfilling the PBS queue.
    if [ $cc -eq $n ]
    then
        cc=0
        echo "Sleep submission"
        sleep 1h
    else
        cc=$((cc+1))
    fi
done
So I think I'm following the expected workflow of payu.
I can't access your directory, but I'm guessing you're making control subdirectories within a directory that is itself a git repo.
So either make the control directories somewhere else, or add runlog: False to your config.yaml so that it doesn't do git stuff.
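For a submission script that generates each config.yaml from a template, the runlog: False suggestion could be applied at generation time. A minimal sketch follows; make_config is a hypothetical helper, and the {0} placeholder convention is taken from the script posted above.

```shell
#!/bin/bash
# Sketch only: write a per-run config.yaml from a template and make sure
# it carries runlog: False. make_config is a hypothetical helper name.
make_config() {
    local template=$1   # template containing {0} placeholders
    local run_id=$2     # zero-padded run index, e.g. 00007
    local out=$3        # destination config.yaml

    # Substitute the run index for every {0} in the template.
    sed "s-{0}-${run_id}-g" "$template" > "$out"

    # Append runlog: False unless the template already sets runlog.
    grep -q '^runlog:' "$out" || printf 'runlog: False\n' >> "$out"
}
```

In the loop above this would stand in for the sed line that writes config.yaml, e.g. make_config config_sed.yaml "$(printf %05d "$tt")" "$folder/config.yaml". Adding the line to the template itself would have the same effect.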
Yes, I'm creating control subdirectories within a git repo directory. I've also changed the group of the folder to 'v45', so perhaps you can access it now. I'm resubmitting the jobs with the runlog: False flag, hoping it will solve the issue.
Your home directory isn't group readable/executable.
I've changed the rights. Thanks for pointing this out!
I'm closing this issue as using the flag or moving the files solved it.
I'm submitting multiple runs (n=100) of the same overall configuration in MITgcm; however, the initial conditions change for each run. Of the 100 simulations that I tried to execute, only 5 ran as expected, while the other 95 crashed with git errors.
Is there a flag to stop payu from using git and adding files when it is executed for each experiment?