This example project shows one way to conduct experiments on Snellius of the following type: we have different (problem) instances on which we want to try different methods. In particular, for each instance we want to run each method and store the results. All experiment runs are independent of each other, so we can run them in parallel.

The idea is that this code can be used as a simple template for conducting experiments on Snellius. For reproducibility, it stores the used `.sh` script and the experiment settings in the results folder.
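For illustration, an `experiments_settings.json` could look as follows (the `instances` and `methods` field names are the ones the job script below reads with `jq`; the concrete values are made up):

```json
{
    "instances": ["instance 1", "instance 2"],
    "methods": ["amazing method 1", "amazing method 2"]
}
```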
The project structure is as follows:

`src\`
: Folder with project source code to be run on Snellius.

`experiments_settings.json`
: Contains the settings of the experiments.

`job_script_Snellius.sh`
: Contains the job script to run the experiments on Snellius.

`run_experiment.py`
: Contains the code to run each experiment in parallel.

`README.md`
: Contains the documentation of the project.

`Snellius setup\`
: Folder with optional scripts to construct `job_script_Snellius.sh` and `experiments_settings.json` and place them in the root.

The steps to run the experiments are as follows:
1. Create an `experiments_settings.json` file with the experiment settings in the project's root folder. To that end, one can use `Snellius setup\experiments_settings_constructor.py`, which will create `experiments_settings.json` in the project's root folder.
2. Create a `job_script_Snellius.sh` job script file in the project's root folder. To that end, one can use `Snellius setup\job_script_constructor.py`, which requires that an `experiments_settings.json` file exists in the root (so perform step 1 first).
3. Place the project source code in the `src` folder and ensure `run_experiment.py` has correct access to it.
4. Test `run_experiment.py` (just give some dummy arguments for testing). For example, you could run `run_experiment.py "instance 1" "amazing method 2" results\` to see whether it correctly creates a `.json` file in `results\instance 1\amazing method 2\results.json`.
5. Run `poetry install` in case the project does not have its own environment yet.
6. Run `job_script_Snellius.sh` on Snellius using the command `sbatch job_script_Snellius.sh` (ensure that you are in the root folder of the project).
7. After the job finishes, the results can be found in a folder `results` with a timestamp in the root folder.

Here's a detailed explanation of the provided shell script:
```bash
#!/bin/bash

# Set job requirements
#SBATCH --job-name=Snellius_example_project
#SBATCH --partition=rome
#SBATCH --nodes=1
#SBATCH --ntasks=128
#SBATCH --time=00:10:40
#SBATCH --mail-type=BEGIN,END
#SBATCH --mail-user=joost.berkhout@vu.nl
#SBATCH --output="slurm-%j.out"

# Create some variables
base_dir="$HOME/Snellius example project"
results_folder="$base_dir/$(date +"results %d-%m-%Y %H-%M-%S")"
experiments_settings="$base_dir/experiments_settings.json"

# Move to working directory and create results folder
cd "$base_dir"
mkdir -p "$results_folder"

# Extract the instances and methods from the settings file
instances=$(jq -r '.instances[]' "$experiments_settings")
methods=$(jq -r '."methods"[]' "$experiments_settings")

# Run each (instance, method) combination as a separate parallel task
while read -r instance; do
    while read -r method; do
        srun --ntasks=1 --nodes=1 --cpus-per-task=1 poetry run python "$base_dir/run_experiment.py" "$instance" "$method" "$results_folder" &
    done <<< "$methods"
done <<< "$instances"
wait
```
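The nested loops at the end of the script launch one independent task per combination of instance and method; in Python terms, the set of launched runs is the Cartesian product of the two lists (a sketch with made-up values):

```python
import itertools

# Stand-ins for the lists the job script extracts with jq
instances = ["instance 1", "instance 2"]
methods = ["amazing method 1", "amazing method 2"]

# The job script launches one srun task per (instance, method) pair
runs = list(itertools.product(instances, methods))
print(len(runs))  # 2 instances x 2 methods = 4 independent runs
```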
```bash
#!/bin/bash
# Set job requirements
#SBATCH --job-name=Snellius_example_project
```

The shebang states that the script must be run with bash, and the first `#SBATCH` directive sets the job name to `Snellius_example_project`.

```bash
#SBATCH --partition=rome
```

Requests the `rome` partition.

```bash
#SBATCH --nodes=1
#SBATCH --ntasks=128
#SBATCH --time=00:10:40
#SBATCH --mail-type=BEGIN,END
#SBATCH --mail-user=joost.berkhout@vu.nl
#SBATCH --output="slurm-%j.out"
```

Requests 1 node and 128 tasks, sets a wall-clock limit of 10 minutes and 40 seconds, sends an email at the beginning and end of the job, and writes the job output to `slurm-%j.out`, where `%j` is replaced by the job ID.

```bash
# Create some variables
base_dir="$HOME/Snellius example project"
results_folder="$base_dir/$(date +"results %d-%m-%Y %H-%M-%S")"
experiments_settings="$base_dir/experiments_settings.json"
```

Defines `base_dir` for the project's base directory, `results_folder` for the results directory with a timestamp, and `experiments_settings` for the path to the JSON settings file.

```bash
# Move to working directory and create results folder
cd "$base_dir"
mkdir -p "$results_folder"
```

Moves to `base_dir` and creates the `results_folder` if it doesn't exist.

```bash
instances=$(jq -r '.instances[]' "$experiments_settings")
methods=$(jq -r '."methods"[]' "$experiments_settings")
```

Uses `jq` to parse `experiments_settings.json` and extract the `instances` and `methods` arrays from the JSON file.

```bash
while read -r instance; do
    while read -r method; do
        srun --ntasks=1 --nodes=1 --cpus-per-task=1 poetry run python "$base_dir/run_experiment.py" "$instance" "$method" "$results_folder" &
    done <<< "$methods"
done <<< "$instances"
wait
```

Runs `run_experiment.py` with `srun` in parallel for each combination: the outer `while` loop iterates over each `instance`, and for each `instance` the inner `while` loop iterates over each `method`; for each combination of `instance` and `method`, it runs the experiment as a background task. The `wait` command ensures the script waits for all background tasks to complete before finishing, so that every combination of `instance` and `method` is processed.
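The `run_experiment.py` that the job script calls is project-specific and not spelled out above; a minimal sketch that matches the calling convention (instance, method, and results folder as command-line arguments) and the `results.json` layout from step 4 could look like this (the placeholder result dictionary is an assumption):

```python
import json
import sys
from pathlib import Path


def run_experiment(instance: str, method: str, results_folder: str) -> Path:
    """Run one (instance, method) experiment and store its results.

    The actual computation is a placeholder; only the storage layout
    (results folder / instance / method / results.json) is implemented.
    """
    # Placeholder "result"; a real experiment would compute something here
    results = {"instance": instance, "method": method, "objective": 0.0}

    out_dir = Path(results_folder) / instance / method
    out_dir.mkdir(parents=True, exist_ok=True)
    out_file = out_dir / "results.json"
    out_file.write_text(json.dumps(results, indent=4))
    return out_file


if __name__ == "__main__":
    # Called by the job script as:
    #   python run_experiment.py "<instance>" "<method>" "<results folder>"
    if len(sys.argv) == 4:
        run_experiment(sys.argv[1], sys.argv[2], sys.argv[3])
```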