Closed douglatornell closed 3 years ago
@raishalovindeer I decided that I should write down what I have been thinking about on this in a way/place where we can discuss and reflect on it. If it turns out that you don't think this will add value to your workflow, it doesn't have to go any farther than discussion.
Here's a first cut at a run description YAML file layout with some design questions/alternatives noted in comments:
Edits in light of 24-Jun-2021 call w/ Javier:
- `polygons` item to copy `.bgm` file into tmp run dir where Atlantis expects it to be via its mention as a global attr in the `init.nc` file
- `initial conditions` item to copy `init.nc` file into tmp run dir instead of symlinking it because it is not that large (13M for Salish Sea model) and ultimately keeping it with run results (even if it changes rarely) seems like a good idea; copy vs. symlink is definitely open for discussion
run id: 25yr
paths:
  Atlantis code: /ocean/rlovindeer/Atlantis/atlantis-trunk/
  runs directory: /ocean/rlovindeer/Atlantis/runs/
polygons: /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_xy.bgm
initial conditions: /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_init.nc
forcing:
  # keys are the file/directory names that are used for the
  # symlinks created to the values of the `link to:` items
  # important design questions here!!!
  # This approach links an entire directory of forcing files into tmp run dir
  # and leaves the specification of which files from there are used to lines in forcing.prm
  inputs:
    link to: /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/inputs
  # This approach identifies the forcing files explicitly here
  # and potentially lets forcing.prm be generic;
  # i.e. no inputs/... just file names that match keys here.
  # This also keeps tmp run dir flat.
  SS_hydro.nc:
    link to: /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/inputs/SS_hydro.nc
  SS_temp.nc:
    link to: /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/inputs/SS_temp.nc
  SS_salt.nc:
    link to: /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/inputs/SS_salt.nc
parameters:
  groups: /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_grps.csv
  run: /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_run.prm
  forcing: /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_forcing.prm
  physics: /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_physics.prm
  biology: /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_biology.prm
  # Another design question!!
  # Alternative is to make these lists of files that are concatenated
  # to create the filename that is the key.
  # This is a little cumbersome because we still need to know what kind of
  # parameter file each is
  groups:
    # examples of single file lists
    SS_grps.csv:
      - /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_grps.csv
  run:
    SS_run.prm:
      - /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_run.prm
  forcing:
    SS_forcing.prm:
      - /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_forcing.prm
  physics:
    SS_physics.prm:
      - /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_physics.prm
  # very hypothetical example of breaking a big parameter file into several sections
  # I don't understand enough yet about the biology parameters files to know how
  # (or even if) it can be broken up and concatenated
  biology:
    SS_biology.prm:
      - /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_biology.prm
      - # other files
      - /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_migration.prm
      - # other files
      - /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_contaminants.prm
output filename base: outputSalishSea
vcs revisions:
  svn:
    # can probably make this automatic because we have it already in `paths: Atlantis code:`
    - /ocean/rlovindeer/Atlantis/atlantis-trunk/
  git:
    - /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/
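To make the copy vs. symlink design question concrete, here is a minimal Python sketch of how a flat temporary run directory might be populated from the explicit-files layout. It assumes the run description has already been parsed into a plain dict with `polygons`, `initial conditions`, and `forcing` as top-level keys; the function name `populate_tmp_run_dir` is hypothetical and not part of any existing tool.

```python
import shutil
from pathlib import Path


def populate_tmp_run_dir(run_desc, tmp_run_dir):
    """Copy small inputs and symlink forcing files into a flat tmp run dir.

    run_desc is assumed to be the parsed run description (a plain dict);
    this helper is an illustration, not an existing API.
    """
    tmp_run_dir = Path(tmp_run_dir)
    tmp_run_dir.mkdir(parents=True, exist_ok=True)

    # Copied (not linked) so that they end up stored with the run results.
    for key in ("polygons", "initial conditions"):
        src = Path(run_desc[key])
        shutil.copy2(src, tmp_run_dir / src.name)

    # Each forcing key is the name of the symlink; `link to` is its target.
    for link_name, item in run_desc["forcing"].items():
        (tmp_run_dir / link_name).symlink_to(Path(item["link to"]))

    return tmp_run_dir
```

The same loop handles the whole-directory alternative too: an `inputs` key whose `link to` value is a directory just produces a symlink to that directory.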
Thanks for this Doug. I'm really liking this approach and think it will add value because it makes the run transparent—easier for us to have a memory of what we did for each run. Your suggestions seem excellent but I'll look over this some more in detail tomorrow and see if I have any additional suggestions or strong opinions on the content of the run description YAML.
Great! I will try to add more notes tomorrow about structure and contents of tmp run dir that I am thinking about. Happy to do video call on Slack if/when you want to talk more synchronously about this.
I think I have a demo of the whole flow of a minimal `atlantis run` command set up now on `tyee`. It starts from this run description YAML file:
run id: 25yr
paths:
  Atlantis code: /ocean/dlatorne/Atlantis/atlantis-trunk/
  runs directory: /ocean/dlatorne/Atlantis/runs/
polygons: /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/SS_xy.bgm
initial conditions: /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/SS_init.nc
forcing:
  SS_hydro.nc:
    link to: /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/inputs/SS_hydro.nc
  SS_temp.nc:
    link to: /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/inputs/SS_temp.nc
  SS_salt.nc:
    link to: /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/inputs/SS_salt.nc
parameters:
  groups: /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/SS_grps.csv
  run: /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/SS_run.prm
  forcing: /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/SS_forcing.prm
  physics: /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/SS_physics.prm
  biology: /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/SS_biology.prm
output filename base: outputSalishSea
vcs revisions:
  git:
    - /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/
Assuming that file is called `25yr.yaml`, the command `atlantis run 25yr.yaml /ocean/dlatorne/Atlantis/runs/25yr/` would create a temporary run directory like `/ocean/dlatorne/Atlantis/runs/25yr_2021-06-30T111454.630340-0700/`. That directory presently exists on `tyee`. Its contents are:
lrwxrwxrwx 1 dlatorne sallen 76 Jun 25 14:44 atlantisMerged -> /ocean/dlatorne/Atlantis/atlantis-trunk/atlantis/atlantismain/atlantisMerged*
-rw-rw-r-- 1 dlatorne sallen 384229 Jun 25 14:52 SS_xy.bgm
-rw-rw-r-- 1 dlatorne sallen 13295166 Jun 25 14:53 SS_init.nc
lrwxrwxrwx 1 dlatorne sallen 69 Jun 25 15:01 SS_hydro.nc -> /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/inputs/SS_hydro.nc
lrwxrwxrwx 1 dlatorne sallen 68 Jun 25 15:01 SS_salt.nc -> /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/inputs/SS_salt.nc
lrwxrwxrwx 1 dlatorne sallen 68 Jun 25 15:01 SS_temp.nc -> /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/inputs/SS_temp.nc
-rw-rw-r-- 1 dlatorne sallen 5343 Jun 25 15:05 SS_grps.csv
-rw-rw-r-- 1 dlatorne sallen 13529 Jun 25 15:06 SS_physics.prm
-rw-rw-r-- 1 dlatorne sallen 766070 Jun 25 15:07 02SS_biology.prm
-rw-rw-r-- 1 dlatorne sallen 2099 Jun 25 15:09 SS_forcing.prm
-rw-rw-r-- 1 dlatorne sallen 190 Jun 25 15:23 salish-sea-atlantis-model_rev.txt
-rw-rw-r-- 1 dlatorne sallen 7370 Jun 30 16:11 SS_run.prm
-rwxrwxr-- 1 dlatorne sallen 1270 Jul 6 15:44 Atlantis.sh*
-rw-r--r-- 1 dlatorne sallen 1098 Jul 6 15:48 25yr.yml
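The temporary run directory name above looks like the run id joined to a timestamp with microseconds and a UTC offset. A minimal Python sketch of building such a name follows; the function name `tmp_run_dir_path` and the exact format string are assumptions inferred from the example path, not taken from the tool's code.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path


def tmp_run_dir_path(runs_dir, run_id, now=None):
    """Return runs_dir/<run id>_<timestamp with UTC offset> (hypothetical helper)."""
    if now is None:
        now = datetime.now().astimezone()  # local time, timezone-aware
    stamp = now.strftime("%Y-%m-%dT%H%M%S.%f%z")
    return Path(runs_dir) / f"{run_id}_{stamp}"


# Reproduce the example name with a fixed time at UTC-07:00.
pacific = timezone(timedelta(hours=-7))
example = tmp_run_dir_path(
    "/ocean/dlatorne/Atlantis/runs/",
    "25yr",
    datetime(2021, 6, 30, 11, 14, 54, 630340, tzinfo=pacific),
)
print(example.name)  # 25yr_2021-06-30T111454.630340-0700
```

Putting the timestamp in the name means repeated launches of the same run description never collide in the runs directory.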
The `Atlantis.sh` script gets generated by `atlantis run` and executed as the final step of `atlantis run` to launch `atlantisMerged` with the appropriate command-line options. The results of the run initially accumulate in the temporary run directory, so you can monitor progress there, and by looking at the `stdout` and `stderr` files in the results directory (`/ocean/dlatorne/Atlantis/runs/25yr/`). When the run finishes, all of the files (but not the symlinks) in the tmp run dir are moved to the results directory, and the tmp run dir is deleted.
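The end-of-run step described here (move files, skip symlinks, delete the tmp run dir) can be sketched in Python as below. In the demo this is presumably done by the generated shell script; `gather_results` is a hypothetical name used only for illustration.

```python
import shutil
from pathlib import Path


def gather_results(tmp_run_dir, results_dir):
    """Move regular files to results_dir, leave symlinks, remove tmp_run_dir."""
    tmp_run_dir, results_dir = Path(tmp_run_dir), Path(results_dir)
    results_dir.mkdir(parents=True, exist_ok=True)
    for item in tmp_run_dir.iterdir():
        if item.is_symlink():
            continue  # linked executable and forcing files are not results
        shutil.move(str(item), str(results_dir / item.name))
    # rmtree unlinks the remaining symlinks without touching their targets.
    shutil.rmtree(tmp_run_dir)
```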
The contents of the results directory, `/ocean/dlatorne/Atlantis/runs/25yr/`, at the end of the process are:
-rw-rw-r-- 1 dlatorne sallen 384229 Jun 25 14:52 SS_xy.bgm
-rw-rw-r-- 1 dlatorne sallen 13295166 Jun 25 14:53 SS_init.nc
-rw-rw-r-- 1 dlatorne sallen 5343 Jun 25 15:05 SS_grps.csv
-rw-rw-r-- 1 dlatorne sallen 13529 Jun 25 15:06 SS_physics.prm
-rw-rw-r-- 1 dlatorne sallen 766070 Jun 25 15:07 02SS_biology.prm
-rw-rw-r-- 1 dlatorne sallen 2099 Jun 25 15:09 SS_forcing.prm
-rw-rw-r-- 1 dlatorne sallen 190 Jun 25 15:23 salish-sea-atlantis-model_rev.txt
-rw-rw-r-- 1 dlatorne sallen 7370 Jun 30 16:11 SS_run.prm
-rwxrwxr-- 1 dlatorne sallen 1270 Jul 6 15:44 Atlantis.sh*
-rw-rw-r-- 1 dlatorne sallen 1098 Jul 6 15:48 25yr.yml
-rw-rw-r-- 1 dlatorne sallen 0 Jul 6 15:49 delete_to_halt_run
-rw-rw-r-- 1 dlatorne sallen 17396 Jul 6 15:49 SS_run.xml
-rw-rw-r-- 1 dlatorne sallen 111844 Jul 6 15:49 SS_grps.xml
-rw-rw-r-- 1 dlatorne sallen 1093304 Jul 6 15:49 02SS_biology.xml
-rw-rw-r-- 1 dlatorne sallen 13715452 Jul 6 15:49 outputSalishSea.nc
-rw-rw-r-- 1 dlatorne sallen 213780 Jul 6 15:49 outputSalishSeaTOT.nc
-rw-rw-r-- 1 dlatorne sallen 1504900 Jul 6 15:49 outputSalishSeaPROD.nc
-rw-rw-r-- 1 dlatorne sallen 8212818 Jul 6 15:49 log.txt
-rw-rw-r-- 1 dlatorne sallen 850 Jul 6 15:49 outputSalishSeaYOY.txt
-rw-rw-r-- 1 dlatorne sallen 871 Jul 6 15:49 outputSalishSeaSSB.txt
-rw-rw-r-- 1 dlatorne sallen 3322 Jul 6 15:49 outputSalishSeaBiomIndx.txt
-rw-rw-r-- 1 dlatorne sallen 22181 Jul 6 15:49 outputSalishSeaSpecificMort.txt
-rw-rw-r-- 1 dlatorne sallen 2184 Jul 6 15:49 outputSalishSeaMort.txt
-rw-rw-r-- 1 dlatorne sallen 48880 Jul 6 15:49 outputSalishSeaMortPerPred.txt
-rw-rw-r-- 1 dlatorne sallen 233475 Jul 6 15:49 outputSalishSeaSpecificPredMort.txt
-rw-rw-r-- 1 dlatorne sallen 227104 Jul 6 15:49 outputSalishSeaDietCheck.txt
-rw-rw-r-- 1 dlatorne sallen 234084 Jul 6 15:49 outputSalishSeaPredPropCheck.txt
-rw-rw-r-- 1 dlatorne sallen 2065 Jul 6 15:49 outputSalishSeaMigration.txt
-rw-rw-r-- 1 dlatorne sallen 859 Jul 6 15:49 outputSalishSeaVertSize.txt
-rw-rw-r-- 1 dlatorne sallen 200462 Jul 6 15:49 outputSalishSeaBoxBiomass.txt
-rw-rw-r-- 1 dlatorne sallen 41063 Jul 6 15:49 outputSalishSeaAnnualAgeBiomIndx.txt
-rw-rw-r-- 1 dlatorne sallen 9626 Jul 6 15:49 outputSalishSeaAgeBiomIndx.txt
-rw-rw-r-- 1 dlatorne sallen 2819 Jul 6 15:49 outputSalishSeaBoxLight.txt
-rw-rw-r-- 1 dlatorne sallen 159933 Jul 6 15:49 inputs.ts
-rw-rw-r-- 1 dlatorne sallen 162048 Jul 6 15:49 export.ts
-rw-rw-r-- 1 dlatorne sallen 55273 Jul 6 15:49 stderr
-rw-rw-r-- 1 dlatorne sallen 16802 Jul 6 15:49 outputSalishSeaMigrationArray.txt
-rw-rw-r-- 1 dlatorne sallen 22401 Jul 6 15:49 stdout
I shortened my testing run to 10 days, so despite its name, this isn't really a 25 year run :smile:
Some more notes on design issues that arose from the exercise above:
`svn` is very different to `git`. You can't do `svn log -l 1` to get information about the working copy rev without accessing the `svn` server. That requires authentication for `atlantis-trunk`, which would greatly complicate things. I need to think more about how to get a VCS record for `atlantis-trunk` similar to what is in `salish-sea-atlantis-model_rev.txt` for the `salish-sea-atlantis-model` Git repo clone.
The console output from `atlantisMerged` is a mixture of output to the `stdout` and `stderr` streams. I've captured them separately in my example above. They could alternatively be captured as an interleaved stream in `stdout` if that makes more sense. Note that `stdout` also gets messages from the `Atlantis.sh` script for things that happen before and after the actual run of `atlantisMerged`.
Thanks for this detailed trial run and description Doug.
First thing I notice—from your list of files that get saved in the results directory at the end, it appears we're also saving some of the original .prm files, which is excellent. Especially these:
-rw-rw-r-- 1 dlatorne sallen 13529 Jun 25 15:06 SS_physics.prm
-rw-rw-r-- 1 dlatorne sallen 766070 Jun 25 15:07 02SS_biology.prm
-rw-rw-r-- 1 dlatorne sallen 2099 Jun 25 15:09 SS_forcing.prm
-rw-rw-r-- 1 dlatorne sallen 7370 Jun 30 16:11 SS_run.prm
Most (if not all) of the differences between runs during an investigation will be reflected inside those files and not in the run code itself, and I was wondering how we were going to capture the small changes in the .prm files for each run. Saving the .prm files with the results, just as they were used for the run, is a great feature.
What do `stdout` and `stderr` stand for? My brain reads standard output and standard error, but in the context of Atlantis, I don't know what those streams represent.
re: the .prm files: The idea is to capture all of the run configuration details with the run output. The hope is to enable relatively easy reproducibility, and relatively easy diff-ing of configuration between runs after the fact.
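As a rough illustration of the after-the-fact diff-ing this is meant to enable, the Python sketch below compares the same parameter file saved in two results directories. The function name `diff_run_config` is hypothetical; it only assumes that each results directory holds a copy of the file under the same name.

```python
import difflib
from pathlib import Path


def diff_run_config(results_a, results_b, filename):
    """Return unified-diff lines for one config file saved with two runs."""
    a = Path(results_a) / filename
    b = Path(results_b) / filename
    return list(
        difflib.unified_diff(
            a.read_text().splitlines(keepends=True),
            b.read_text().splitlines(keepends=True),
            fromfile=str(a),
            tofile=str(b),
        )
    )
```

An empty list means the two runs used identical copies of that file.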
re: `stdout` and `stderr`:
They are exactly what you read. They are among the base features that Linux copied from Unix. Well-written code sends informational messages to `stdout` and errors, warnings, etc. to `stderr`. Things get a little murky when it comes to debugging output. Some devs and tools send it to `stderr`, others to `stdout`. So much for "standard" :smirk: The other murky thing is that the 2 streams are merged when a program's output goes to the terminal (as you see when you run Atlantis now). It is possible to capture them separately though when things are wrapped in a shell script like the `Atlantis.sh` that AtlantisCmd will generate. The idea with AtlantisCmd is that the run can be detached from terminal output and all of the stuff that would appear on the terminal is captured in file(s) in the results directory.
The question is whether to separate them in the results directory, or dump everything into `stdout` so that it reads the same as what you would see if you were reading it as it streams by on the terminal now.
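The two options can be sketched side by side. The generated `Atlantis.sh` would do this with shell redirection; the Python below expresses the same choice with `subprocess`, and the function name `run_captured` and its `separate` flag are hypothetical.

```python
import subprocess
from pathlib import Path


def run_captured(cmd, results_dir, separate=True):
    """Run cmd, capturing its streams in results_dir/stdout (and stderr).

    separate=True writes two files; separate=False merges stderr into the
    stdout file, closer to what would scroll by on a terminal.
    """
    results_dir = Path(results_dir)
    results_dir.mkdir(parents=True, exist_ok=True)
    with open(results_dir / "stdout", "w") as out:
        if separate:
            with open(results_dir / "stderr", "w") as err:
                return subprocess.run(cmd, stdout=out, stderr=err).returncode
        return subprocess.run(cmd, stdout=out, stderr=subprocess.STDOUT).returncode
```

One caveat with the merged option: programs often buffer `stdout` differently from `stderr` when writing to a file, so the interleaving in the file may not match the order seen on a terminal.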
I did a little more research on the issue of VCS recording with `svn` and found that, without having to authenticate to the server, `svn info` provides a bunch of useful stuff including:
`git` are the list of files that were touched in the most recent commit, and the commit message. I'm fuzzy on whether `svn` has the `git` concept of commit and push to server, or if `svn commit` communicates directly with the server. So, I'm not sure if there is a case where `svn info` would show something different for changes that have been committed but not pushed (if that's even a thing).
`svn diff` also works without authentication, so we can capture uncommitted changes.
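A minimal Python sketch of recording that information alongside a run follows. It runs `svn info` and `svn diff` on the working copy and writes their output to a `<working copy name>_rev.txt` file, by analogy with `salish-sea-atlantis-model_rev.txt`. The function name `record_svn_state`, the output file name, and the file layout are all assumptions; the `run` parameter exists only so the command runner can be swapped out.

```python
import subprocess
from pathlib import Path


def record_svn_state(working_copy, run_dir, run=subprocess.run):
    """Write `svn info` and `svn diff` output for working_copy into run_dir."""
    working_copy = Path(working_copy)
    rev_file = Path(run_dir) / f"{working_copy.name}_rev.txt"
    sections = []
    for subcommand in ("info", "diff"):
        proc = run(
            ["svn", subcommand, str(working_copy)],
            capture_output=True,
            text=True,
            check=True,
        )
        sections.append(proc.stdout)
    # Revision details first, then any uncommitted changes.
    rev_file.write_text("\n".join(sections))
    return rev_file
```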
Closing this because we now have an initial implementation. It's close to what is written above, though the structure of the YAML run description file did evolve some; see docs.
Design notes and discussion for a tool to manage runs of the Salish Sea Atlantis model.
The general idea is to create a new tool based on SalishSeaCast/NEMO-Cmd for running Atlantis. NEMO-Cmd is already the basis for tools for running various NEMO configurations (SalishSeaCast, GoMSS), WaveWatchIII, and FVCOM.
The goal is a command like:
atlantis run run_description.yaml results_directory_path/
e.g. `atlantis run 25yr.yaml /ocean/rlovindeer/MOAD/analysis-raisha/SSmodel_outputs/output-25yr/`
That command will:
Other ideas: