SS-Atlantis / AtlantisCmd

A command-line tool for doing various operations associated with managing runs of the Salish Sea Atlantis model. Based on https://github.com/SalishSeaCast/NEMO-Cmd/.
Apache License 2.0

Design Notes #1

Closed: douglatornell closed this issue 3 years ago

douglatornell commented 3 years ago

Design notes and discussion for a tool to manage runs of the Salish Sea Atlantis model.

The general idea is to create a new tool based on SalishSeaCast/NEMO-Cmd for running Atlantis. NEMO-Cmd is already the basis for tools for running various NEMO configurations (SalishSeaCast, GoMSS), WaveWatchIII, and FVCOM.

The goal is a command like:

atlantis run run_description.yaml results_directory_path/

e.g.

atlantis run 25yr.yaml /ocean/rlovindeer/MOAD/analysis-raisha/SSmodel_outputs/output-25yr/

That command will:

Other ideas:

douglatornell commented 3 years ago

@raishalovindeer I decided that I should write down what I have been thinking about on this in a way/place where we can discuss and reflect on it. If it turns out that you don't think this will add value to your workflow, it doesn't have to go any farther than discussion.

douglatornell commented 3 years ago

Here's a first cut at a run description YAML file layout with some design questions/alternatives noted in comments:

Edits in light of 24-Jun-2021 call w/ Javier:

run id: 25yr

paths:
  Atlantis code: /ocean/rlovindeer/Atlantis/atlantis-trunk/
  runs directory: /ocean/rlovindeer/Atlantis/runs/

polygons: /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_xy.bgm

initial conditions: /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_init.nc

forcing:
  # keys are the file/directory names that are used for the
  # symlinks created to the values of the `link to:` items

  # important design questions here!!!

  # This approach links an entire directory of forcing files into tmp run dir
  # and leaves the specification of which files from there are used to lines in forcing.prm
  inputs:
    link to: /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/inputs

  # This approach identifies the forcing files explicitly here
  # and potentially lets forcing.prm be generic;
  # i.e. no inputs/... just file names that match keys here.
  # This also keeps tmp run dir flat.
  SS_hydro.nc:
    link to: /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/inputs/SS_hydro.nc
  SS_temp.nc:
    link to: /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/inputs/SS_temp.nc
  SS_salt.nc:
    link to: /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/inputs/SS_salt.nc

parameters:
  groups: /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_grps.csv
  run: /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_run.prm
  forcing: /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_forcing.prm   
  physics: /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_physics.prm   
  biology: /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_biology.prm   

  # Another design question!!

  # Alternative is to make these lists of files that are concatenated 
  # to create the filename that is the key.
  # This is a little cumbersome because we still need to know what kind of 
  # parameter file each is
  groups:
    # examples of single file lists
    SS_grps.csv:
      - /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_grps.csv
  run:
    SS_run.prm:
      - /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_run.prm
  forcing:
    SS_forcing.prm:
      - /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_forcing.prm   
  physics: 
    SS_physics.prm:
      - /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_physics.prm   

  # very hypothetical example of breaking a big parameter file into several sections
  # I don't understand enough yet about the biology parameters files to know how
  # (or even if) it can be broken up and concatenated
  biology:
    SS_biology.prm:
      - /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_biology.prm
      - # other files
      - /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_migration.prm
      - # other files
      - /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/SS_contaminants.prm   

output filename base: outputSalishSea

vcs revisions:
  svn:
    # can probably make this automatic because we have it already in `paths: Atlantis code:`
    - /ocean/rlovindeer/Atlantis/atlantis-trunk/
  git:
    - /ocean/rlovindeer/Atlantis/salish-sea-atlantis-model/
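
To make the two design questions above concrete, here is a minimal Python sketch of how the explicit `forcing:` layout and the list-of-files `parameters:` alternative might be processed. The function names `link_forcing` and `concat_params` are hypothetical, the dicts stand in for the parsed YAML sections, and the temporary files are invented stand-ins for the real inputs; none of this is the actual AtlantisCmd implementation.

```python
import tempfile
from pathlib import Path


def link_forcing(forcing: dict, run_dir: Path) -> None:
    """Create one symlink per forcing key, pointing at its `link to:` value."""
    for name, item in forcing.items():
        (run_dir / name).symlink_to(Path(item["link to"]))


def concat_params(file_lists: dict, run_dir: Path) -> None:
    """Concatenate each list of source files into run_dir/<key>."""
    for name, sources in file_lists.items():
        with (run_dir / name).open("wt") as dest:
            for src in sources:
                # Assumption: a bare `- # other files` item loads as None
                if src is None:
                    continue
                dest.write(Path(src).read_text())


# Demo in a throwaway directory; file contents are placeholders
with tempfile.TemporaryDirectory() as tmp:
    inputs, run_dir = Path(tmp, "inputs"), Path(tmp, "run")
    inputs.mkdir()
    run_dir.mkdir()
    (inputs / "SS_hydro.nc").write_text("hydro")
    (inputs / "SS_biology.prm").write_text("# biology\n")
    (inputs / "SS_migration.prm").write_text("# migration\n")

    link_forcing(
        {"SS_hydro.nc": {"link to": str(inputs / "SS_hydro.nc")}}, run_dir
    )
    concat_params(
        {
            "SS_biology.prm": [
                str(inputs / "SS_biology.prm"),
                None,
                str(inputs / "SS_migration.prm"),
            ]
        },
        run_dir,
    )
    is_link = (run_dir / "SS_hydro.nc").is_symlink()
    merged = (run_dir / "SS_biology.prm").read_text()
```

With the explicit layout the tmp run dir stays flat, because each key becomes a file name directly in that directory.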

raishalovindeer commented 3 years ago

Thanks for this Doug. I'm really liking this approach and think it will add value because it makes the run transparent—easier for us to have a memory of what we did for each run. Your suggestions seem excellent but I'll look over this some more in detail tomorrow and see if I have any additional suggestions or strong opinions on the content of the run description YAML.

douglatornell commented 3 years ago

Great! I will try to add more notes tomorrow about structure and contents of tmp run dir that I am thinking about. Happy to do video call on Slack if/when you want to talk more synchronously about this.

douglatornell commented 3 years ago

I think I have a demo of the whole flow of a minimal atlantis run command set up now on tyee. It starts from this run description YAML file:

run id: 25yr

paths:
  Atlantis code: /ocean/dlatorne/Atlantis/atlantis-trunk/
  runs directory: /ocean/dlatorne/Atlantis/runs/

polygons: /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/SS_xy.bgm

initial conditions: /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/SS_init.nc

forcing:
  SS_hydro.nc:
    link to: /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/inputs/SS_hydro.nc
  SS_temp.nc:
    link to: /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/inputs/SS_temp.nc
  SS_salt.nc:
    link to: /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/inputs/SS_salt.nc

parameters:
  groups: /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/SS_grps.csv
  run: /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/SS_run.prm
  forcing: /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/SS_forcing.prm   
  physics: /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/SS_physics.prm   
  biology: /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/SS_biology.prm   

output filename base: outputSalishSea

vcs revisions:
  git:
    - /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/

Assuming that file is called 25yr.yaml, the command atlantis run 25yr.yaml /ocean/dlatorne/Atlantis/runs/25yr/ would create a temporary run directory like /ocean/dlatorne/Atlantis/runs/25yr_2021-06-30T111454.630340-0700/. That directory presently exists on tyee. Its contents are:

lrwxrwxrwx 1 dlatorne sallen       76 Jun 25 14:44 atlantisMerged -> /ocean/dlatorne/Atlantis/atlantis-trunk/atlantis/atlantismain/atlantisMerged*
-rw-rw-r-- 1 dlatorne sallen   384229 Jun 25 14:52 SS_xy.bgm
-rw-rw-r-- 1 dlatorne sallen 13295166 Jun 25 14:53 SS_init.nc
lrwxrwxrwx 1 dlatorne sallen       69 Jun 25 15:01 SS_hydro.nc -> /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/inputs/SS_hydro.nc
lrwxrwxrwx 1 dlatorne sallen       68 Jun 25 15:01 SS_salt.nc -> /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/inputs/SS_salt.nc
lrwxrwxrwx 1 dlatorne sallen       68 Jun 25 15:01 SS_temp.nc -> /ocean/dlatorne/Atlantis/salish-sea-atlantis-model/inputs/SS_temp.nc
-rw-rw-r-- 1 dlatorne sallen     5343 Jun 25 15:05 SS_grps.csv
-rw-rw-r-- 1 dlatorne sallen    13529 Jun 25 15:06 SS_physics.prm
-rw-rw-r-- 1 dlatorne sallen   766070 Jun 25 15:07 02SS_biology.prm
-rw-rw-r-- 1 dlatorne sallen     2099 Jun 25 15:09 SS_forcing.prm
-rw-rw-r-- 1 dlatorne sallen      190 Jun 25 15:23 salish-sea-atlantis-model_rev.txt
-rw-rw-r-- 1 dlatorne sallen     7370 Jun 30 16:11 SS_run.prm
-rwxrwxr-- 1 dlatorne sallen     1270 Jul  6 15:44 Atlantis.sh*
-rw-r--r-- 1 dlatorne sallen     1098 Jul  6 15:48 25yr.yml
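
The temporary run directory name shown above looks like the run id joined to a timestamp with microseconds and a UTC offset. A small Python sketch of building such a name follows; the function name `tmp_run_dir_name` is hypothetical and the format string is inferred from the example name, not taken from the implementation.

```python
from datetime import datetime, timedelta, timezone


def tmp_run_dir_name(run_id: str, now: datetime) -> str:
    """Join the run id and a timestamp; `now` must be timezone-aware."""
    return f"{run_id}_{now:%Y-%m-%dT%H%M%S.%f%z}"


# Fixed timestamp matching the example directory name (UTC-07:00)
offset = timezone(timedelta(hours=-7))
stamp = datetime(2021, 6, 30, 11, 14, 54, 630340, tzinfo=offset)
name = tmp_run_dir_name("25yr", stamp)
print(name)
```

Including the timestamp means that repeated runs of the same run description never collide in the runs directory.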

atlantis run generates the Atlantis.sh script and executes it as its final step to launch atlantisMerged with the appropriate command-line options. The results of the run initially accumulate in the temporary run directory, so you can monitor progress there, and also by looking at the stdout and stderr files in the results directory (/ocean/dlatorne/Atlantis/runs/25yr/). When the run finishes, all of the files (but not the symlinks) in the tmp run dir are moved to the results directory, and the tmp run dir is deleted.
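
The end-of-run step described above (move files, skip symlinks, delete the tmp run dir) could look roughly like this in Python. The function name `gather_results` is hypothetical, and the demo file names are just placeholders in a throwaway directory.

```python
import shutil
import tempfile
from pathlib import Path


def gather_results(tmp_run_dir: Path, results_dir: Path) -> None:
    """Move regular files to results_dir, drop symlinks, remove tmp_run_dir."""
    results_dir.mkdir(parents=True, exist_ok=True)
    # list() so that we are not iterating the directory while changing it
    for item in list(tmp_run_dir.iterdir()):
        if item.is_symlink():
            item.unlink()  # forcing links are not results
        else:
            shutil.move(str(item), str(results_dir / item.name))
    tmp_run_dir.rmdir()  # succeeds because the directory is now empty


# Demo with two files and one (dangling) symlink
with tempfile.TemporaryDirectory() as tmp:
    tmp_run, results = Path(tmp, "25yr_tmp"), Path(tmp, "25yr")
    tmp_run.mkdir()
    (tmp_run / "SS_run.prm").write_text("run params\n")
    (tmp_run / "log.txt").write_text("log\n")
    (tmp_run / "SS_hydro.nc").symlink_to(Path(tmp, "inputs", "SS_hydro.nc"))

    gather_results(tmp_run, results)
    moved = sorted(p.name for p in results.iterdir())
    tmp_gone = not tmp_run.exists()
```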

The contents of the results directory, /ocean/dlatorne/Atlantis/runs/25yr/ at the end of the process are:

-rw-rw-r-- 1 dlatorne sallen   384229 Jun 25 14:52 SS_xy.bgm
-rw-rw-r-- 1 dlatorne sallen 13295166 Jun 25 14:53 SS_init.nc
-rw-rw-r-- 1 dlatorne sallen     5343 Jun 25 15:05 SS_grps.csv
-rw-rw-r-- 1 dlatorne sallen    13529 Jun 25 15:06 SS_physics.prm
-rw-rw-r-- 1 dlatorne sallen   766070 Jun 25 15:07 02SS_biology.prm
-rw-rw-r-- 1 dlatorne sallen     2099 Jun 25 15:09 SS_forcing.prm
-rw-rw-r-- 1 dlatorne sallen      190 Jun 25 15:23 salish-sea-atlantis-model_rev.txt
-rw-rw-r-- 1 dlatorne sallen     7370 Jun 30 16:11 SS_run.prm
-rwxrwxr-- 1 dlatorne sallen     1270 Jul  6 15:44 Atlantis.sh*
-rw-rw-r-- 1 dlatorne sallen     1098 Jul  6 15:48 25yr.yml
-rw-rw-r-- 1 dlatorne sallen        0 Jul  6 15:49 delete_to_halt_run
-rw-rw-r-- 1 dlatorne sallen    17396 Jul  6 15:49 SS_run.xml
-rw-rw-r-- 1 dlatorne sallen   111844 Jul  6 15:49 SS_grps.xml
-rw-rw-r-- 1 dlatorne sallen  1093304 Jul  6 15:49 02SS_biology.xml
-rw-rw-r-- 1 dlatorne sallen 13715452 Jul  6 15:49 outputSalishSea.nc
-rw-rw-r-- 1 dlatorne sallen   213780 Jul  6 15:49 outputSalishSeaTOT.nc
-rw-rw-r-- 1 dlatorne sallen  1504900 Jul  6 15:49 outputSalishSeaPROD.nc
-rw-rw-r-- 1 dlatorne sallen  8212818 Jul  6 15:49 log.txt
-rw-rw-r-- 1 dlatorne sallen      850 Jul  6 15:49 outputSalishSeaYOY.txt
-rw-rw-r-- 1 dlatorne sallen      871 Jul  6 15:49 outputSalishSeaSSB.txt
-rw-rw-r-- 1 dlatorne sallen     3322 Jul  6 15:49 outputSalishSeaBiomIndx.txt
-rw-rw-r-- 1 dlatorne sallen    22181 Jul  6 15:49 outputSalishSeaSpecificMort.txt
-rw-rw-r-- 1 dlatorne sallen     2184 Jul  6 15:49 outputSalishSeaMort.txt
-rw-rw-r-- 1 dlatorne sallen    48880 Jul  6 15:49 outputSalishSeaMortPerPred.txt
-rw-rw-r-- 1 dlatorne sallen   233475 Jul  6 15:49 outputSalishSeaSpecificPredMort.txt
-rw-rw-r-- 1 dlatorne sallen   227104 Jul  6 15:49 outputSalishSeaDietCheck.txt
-rw-rw-r-- 1 dlatorne sallen   234084 Jul  6 15:49 outputSalishSeaPredPropCheck.txt
-rw-rw-r-- 1 dlatorne sallen     2065 Jul  6 15:49 outputSalishSeaMigration.txt
-rw-rw-r-- 1 dlatorne sallen      859 Jul  6 15:49 outputSalishSeaVertSize.txt
-rw-rw-r-- 1 dlatorne sallen   200462 Jul  6 15:49 outputSalishSeaBoxBiomass.txt
-rw-rw-r-- 1 dlatorne sallen    41063 Jul  6 15:49 outputSalishSeaAnnualAgeBiomIndx.txt
-rw-rw-r-- 1 dlatorne sallen     9626 Jul  6 15:49 outputSalishSeaAgeBiomIndx.txt
-rw-rw-r-- 1 dlatorne sallen     2819 Jul  6 15:49 outputSalishSeaBoxLight.txt
-rw-rw-r-- 1 dlatorne sallen   159933 Jul  6 15:49 inputs.ts
-rw-rw-r-- 1 dlatorne sallen   162048 Jul  6 15:49 export.ts
-rw-rw-r-- 1 dlatorne sallen    55273 Jul  6 15:49 stderr
-rw-rw-r-- 1 dlatorne sallen    16802 Jul  6 15:49 outputSalishSeaMigrationArray.txt
-rw-rw-r-- 1 dlatorne sallen    22401 Jul  6 15:49 stdout

I shortened my testing run to 10 days, so despite its name, this isn't really a 25 year run :smile:

douglatornell commented 3 years ago

Some more notes on design issues that arose from the exercise above:

raishalovindeer commented 3 years ago

Thanks for this detailed trial run and description Doug.

First thing I notice—from your list of files that get saved in the results directory at the end, it appears we're also saving some of the original .prm files, which is excellent. Especially these:

-rw-rw-r-- 1 dlatorne sallen 13529 Jun 25 15:06 SS_physics.prm
-rw-rw-r-- 1 dlatorne sallen 766070 Jun 25 15:07 02SS_biology.prm
-rw-rw-r-- 1 dlatorne sallen 2099 Jun 25 15:09 SS_forcing.prm
-rw-rw-r-- 1 dlatorne sallen 7370 Jun 30 16:11 SS_run.prm

Most (if not all) of the differences between runs during an investigation will be reflected inside those files and not in the run code itself, and I was wondering how we were going to capture the small changes in the .prm files for each run. Saving the .prm files with the results, just as they were used for the run, is a great feature.

raishalovindeer commented 3 years ago

What do stdout and stderr stand for? My brain reads standard output and standard error, but in the context of Atlantis, I don't know what those streams represent.

douglatornell commented 3 years ago

re: the .prm files: The idea is to capture all of the run configuration details with the run output. The hope is to enable relatively easy reproducibility, and relatively easy diff-ing of configuration between runs after the fact.
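
As an illustration of the after-the-fact diff-ing this enables, here is a short Python sketch that compares the same parameter file in two results directories using the standard library's `difflib`. The function name `diff_run_file` is hypothetical, and the parameter lines in the demo are invented placeholders rather than real SS_run.prm contents.

```python
import difflib
import tempfile
from pathlib import Path


def diff_run_file(results_a: Path, results_b: Path, filename: str) -> str:
    """Return a unified diff of `filename` between two results directories."""
    path_a, path_b = results_a / filename, results_b / filename
    lines_a = path_a.read_text().splitlines(keepends=True)
    lines_b = path_b.read_text().splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(
            lines_a, lines_b, fromfile=str(path_a), tofile=str(path_b)
        )
    )


# Demo: two made-up results directories that differ in one line
with tempfile.TemporaryDirectory() as tmp:
    run_a, run_b = Path(tmp, "10day"), Path(tmp, "25yr")
    run_a.mkdir()
    run_b.mkdir()
    (run_a / "SS_run.prm").write_text("dt 12 hour\ntstop 10 day\n")
    (run_b / "SS_run.prm").write_text("dt 12 hour\ntstop 9125 day\n")
    diff = diff_run_file(run_a, run_b, "SS_run.prm")
    print(diff)
```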

douglatornell commented 3 years ago

re: stdout and stderr:

They are exactly what you read. They are among the base features that Linux copied from Unix. Well-written code sends informational messages to stdout, and errors, warnings, etc. to stderr. Things get a little murky when it comes to debugging output: some devs and tools send it to stderr, others to stdout. So much for "standard" :smirk: The other murky thing is that the two streams are merged when a program's output goes to the terminal (as you see when you run Atlantis now). It is possible to capture them separately, though, when things are wrapped in a shell script like the Atlantis.sh that AtlantisCmd will generate. The idea with AtlantisCmd is that the run can be detached from terminal output, and everything that would appear on the terminal is captured in file(s) in the results directory.

The question is whether to separate them in the results directory, or dump everything into stdout so that it reads the same as what you would see if you were reading it as it streams by on the terminal now.
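
To show the two options side by side, here is a Python sketch using the standard `subprocess` module. The generated Atlantis.sh would presumably do this with shell redirection instead; the function name `run_captured` and its `merge` flag are hypothetical, and the demo command is a stand-in for atlantisMerged.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_captured(cmd: list, results_dir: Path, merge: bool = False) -> int:
    """Run cmd, writing stdout (and stderr, separately or merged) to files."""
    with (results_dir / "stdout").open("wb") as out:
        if merge:
            # Both streams go to the one file; their interleaving depends
            # on how the program buffers its output.
            return subprocess.run(
                cmd, stdout=out, stderr=subprocess.STDOUT
            ).returncode
        with (results_dir / "stderr").open("wb") as err:
            return subprocess.run(cmd, stdout=out, stderr=err).returncode


# Stand-in program: one line to each stream
demo_cmd = [
    sys.executable,
    "-c",
    "import sys; print('info'); print('warning', file=sys.stderr)",
]

with tempfile.TemporaryDirectory() as tmp:
    separate, merged = Path(tmp, "separate"), Path(tmp, "merged")
    separate.mkdir()
    merged.mkdir()
    run_captured(demo_cmd, separate)
    run_captured(demo_cmd, merged, merge=True)
    sep_out = (separate / "stdout").read_text().strip()
    sep_err = (separate / "stderr").read_text().strip()
    merged_out = (merged / "stdout").read_text()
```

Separate files make it easy to scan just the warnings and errors; the merged file reads more like the terminal does now.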

douglatornell commented 3 years ago

I did a little more research on the VCS recording with svn issue and found that, without having to authenticate to the server, svn info provides a bunch of useful stuff including:

svn diff also works without authentication, so we can capture uncommitted changes.
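
A rough Python sketch of recording that information next to the run results follows. It shells out to `svn info` and `svn diff` on the working copy path; the function name `record_svn_revision`, the injectable `run` parameter, and the `<name>_rev.txt` file naming (modelled on salish-sea-atlantis-model_rev.txt in the listings above) are assumptions for illustration. The demo swaps in a fake runner so that it works without svn installed.

```python
import subprocess
import tempfile
from pathlib import Path


def record_svn_revision(
    working_copy: Path, dest_dir: Path, run=subprocess.run
) -> Path:
    """Write `svn info` and `svn diff` output to <working copy name>_rev.txt."""
    parts = []
    for subcmd in ("info", "diff"):
        proc = run(
            ["svn", subcmd, str(working_copy)],
            capture_output=True,
            text=True,
            check=True,
        )
        parts.append(proc.stdout)
    rev_file = dest_dir / f"{working_copy.name}_rev.txt"
    rev_file.write_text("\n".join(parts))
    return rev_file


def fake_run(cmd, **kwargs):
    """Stand-in for subprocess.run; returns canned output per subcommand."""
    return subprocess.CompletedProcess(
        cmd, 0, stdout=f"output of svn {cmd[1]}\n"
    )


with tempfile.TemporaryDirectory() as tmp:
    rev_path = record_svn_revision(
        Path("/ocean/rlovindeer/Atlantis/atlantis-trunk/"),
        Path(tmp),
        run=fake_run,
    )
    rev_name = rev_path.name
    rev_text = rev_path.read_text()
```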

douglatornell commented 3 years ago

Closing this because we now have an initial implementation. It's close to what is written above, though the structure of the YAML run description file did evolve some; see docs.