aekiss / run_summary

Summarise ACCESS-OM2 runs
Apache License 2.0

run_summary.py

Creates an Excel-compatible .csv file summarising ACCESS-OM2 experiments, tabulating each run number and its dates, PBS job id, walltime, service units, timestep, file changes, git hashes, git commit messages, all namelist changes, and more. You can also easily customise what is output, or output everything to a yaml file.

ACCESS-OM2 run summaries generated by run_summary.py are in /g/data/hh5/tmp/cosima/access-om2-run-summaries on NCI. See this notebook for examples of what can be done with this data.

Usage

The simplest way to run it is to put run_summary.py and nmltab.py (from here) in the run control directory you want to summarise, then type ./run_summary.py. After some processing (which might take a few minutes) it will generate a .csv file summarising your runs which you can open in Excel or similar.
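Because the output is ordinary .csv, it can also be read programmatically instead of in a spreadsheet. The following is a minimal sketch using only Python's standard csv module. The helper name read_summary_rows and the sample data are made up for illustration, and since the layout of header and statistics rows is not described here, the sketch returns every row as raw strings and leaves interpretation to you.

```python
import csv
import io


def read_summary_rows(stream):
    """Return every row of a summary .csv as a list of strings."""
    return [row for row in csv.reader(stream)]


# Made-up two-row sample standing in for a real summary file.
sample = io.StringIO("1,2019-01-01,abc123\n2,2019-04-01,def456\n")
rows = read_summary_rows(sample)
print(len(rows), "rows; first row:", rows[0])

# With a real file (hypothetical name), open it with newline="" as the
# csv module documentation recommends:
#     with open("run_summary_example.csv", newline="") as f:
#         rows = read_summary_rows(f)
```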

Alternatively, put run_summary.py and nmltab.py anywhere in your search path and run run_summary.py my/control/dir/path to specify which ACCESS-OM2 control directory to summarise. Wildcards can be used to summarise multiple ACCESS-OM2 control directories at once.
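If you drive the script from Python rather than from a shell, wildcards are not expanded for you, so the list of control directories has to be built explicitly. The following is a minimal sketch and not part of run_summary.py: build_command and summarise are hypothetical helper names, and only the -f, -d and -o flags from the usage details are used. It assumes run_summary.py is executable and on your search path.

```python
import glob
import subprocess


def build_command(paths, outfile=None, show_fails=False, dump_all=False,
                  script="run_summary.py"):
    """Return the argument list for one run_summary.py invocation."""
    cmd = [script]
    if show_fails:
        cmd.append("-f")          # include failed runs
    if dump_all:
        cmd.append("-d")          # also dump all data to <outfile>.yaml
    if outfile is not None:
        cmd += ["-o", outfile]    # WARNING: output file will be overwritten
    cmd += list(paths)            # zero or more control directory paths
    return cmd


def summarise(pattern, **kwargs):
    """Expand a wildcard pattern and run run_summary.py on the matches."""
    paths = sorted(glob.glob(pattern))
    if not paths:
        raise FileNotFoundError(f"no control directories match {pattern!r}")
    return subprocess.run(build_command(paths, **kwargs), check=True)
```

For example, summarise("my/control/dirs/*", show_fails=True) would pass every matching directory to a single run_summary.py call. The pattern here is made up, so substitute your own.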

Usage details:

usage: run_summary.py [-h] [-f] [-l] [-d] [-o file] [--outfile_syncdir]
                      [--no_header] [--no_stats]
                      [path [path ...]]

positional arguments:
  path                  zero or more ACCESS-OM2 control directory paths;
                        default is current working directory

optional arguments:
  -h, --help            show this help message and exit
  -f, --show_fails      include failed runs (disables some output columns)
  -l, --list            list all data that could be tabulated by adding it to
                        output_format
  -d, --dump_all        also dump all data to <outfile>.yaml
  -o file, --outfile file
                        output file path; default is 'run_summary_<path>.csv';
                        overrides --outfile_syncdir if set. WARNING: output
                        file will be overwritten
  --outfile_syncdir     set output file path to 'run_summary_<sync dir
                        path>.csv' or 'run_summary_<path>.csv' if sync dir
                        path is invalid; ignored if '-o', '--outfile' is set.
                        WARNING: output file will be overwritten
  --no_header           don't write header rows in output .csv
  --no_stats            don't output summary statistics

Customising the .csv output

To customise what is output, simply edit output_format in run_summary.py. You can also change the summary statistics by editing stats.

run_summary.py collects much more data than it writes to the .csv file by default, so there is plenty more you can add. Run it with the --list option to see the data available for adding to output_format (you may need to edit some keys to ensure they are unique). Changes to any variable in any .nml file are output automatically.

Requirements