NCAR / ccpp-scm

CCPP Single Column Model

Need to have ability to specify run and output directory #288

Closed gthompsnWRF closed 1 year ago

gthompsnWRF commented 2 years ago

Description

A major disadvantage of the SCM's structure is the hardwired location where everything runs (scm/bin). This makes it very inflexible to run a set of parallel runs (of the same case but with different physics suites or namelists) and prevents redirecting the output files away from the disk where the source code resides.

Solution

Either a command-line option (to run_scm.py) or a namelist variable to redirect the output would make it easy to run parallel instances of the Python script from a series of parallel job scripts, changing the suite definition files and/or namelists for simultaneous SCM runs. It would also let the source code be kept in HPC systems' home directory spaces, which are backed up frequently, rather than alongside expendable model output directories that can easily fill disk quotas and prevent further SCM runs without a bunch of manual file shuffling. An example is the Cheyenne computer at NCAR, where home space is quite limited but backed up with high frequency and is where I keep code, versus the /glade/scratch, /glade/work, or /glade/project spaces where I keep model runs that can easily be replaced in the event of a disk failure. Being unable to keep the ccpp-scm source code in my home directory while writing multiple simulation outputs elsewhere, alongside many other source codes, causes me strife, because I have to manage the disk space manually and semi-frequently due to quotas. I always prefer source code in /home areas and model output in expendable places for this reason. A minimal sketch of what such an option could look like follows.
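As an illustration only (the option names --run-dir and --output-dir do not exist in the current run_scm.py; they are assumptions for this sketch), the redirection could be exposed through ordinary argparse options that default to the present hardwired location:

```python
# Hypothetical sketch of run/output directory overrides for run_scm.py.
# --run-dir and --output-dir are illustrative names, not existing options.
import argparse
import os

def parse_args():
    parser = argparse.ArgumentParser(description="Run the CCPP SCM")
    parser.add_argument("-c", "--case", required=True,
                        help="name of the case to run")
    parser.add_argument("--run-dir", default=None,
                        help="directory in which to run (default: scm/bin)")
    parser.add_argument("--output-dir", default=None,
                        help="directory for model output (default: run dir)")
    return parser.parse_args()

def resolve_dirs(args):
    # Fall back to the current hardwired location when no override is given.
    run_dir = os.path.abspath(args.run_dir or os.path.join("scm", "bin"))
    output_dir = os.path.abspath(args.output_dir or run_dir)
    os.makedirs(run_dir, exist_ok=True)
    os.makedirs(output_dir, exist_ok=True)
    return run_dir, output_dir

if __name__ == "__main__":
    args = parse_args()
    run_dir, output_dir = resolve_dirs(args)
    print(f"running in {run_dir}, writing output to {output_dir}")
```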

Part of this issue is that the tracer text files live in etc/tracer_config yet are linked at run time to a hardwired file name in the run directory; specifying that tracer file as a relative or absolute path results in failure. The same is true for the suite definition files, hardwired into ccpp/suites, and the namelists in ccpp/physics_namelists, all of which are linked at run time. Permitting relative and absolute paths would seriously improve disk mobility and make the SCM far more flexible for scripting parallel jobs for a single case while changing suites and namelist options. Otherwise, users need to make copies of directories in various places to work around this obstacle (see the sketch below).
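One way this could work, shown only as a sketch (the helper name, the expected link name, and the example tracer file are assumptions, not part of the current scripts), is to resolve whatever path the user supplies and then link it into the run directory under the fixed name the executable expects:

```python
# Illustrative sketch: accept a relative or absolute path to a tracer,
# suite, or namelist file and link it into the run directory under the
# hardwired name the model expects. Names here are assumptions.
import os

def link_into_run_dir(user_path, run_dir, expected_name):
    src = os.path.abspath(user_path)      # works for relative or absolute input
    if not os.path.isfile(src):
        raise FileNotFoundError(f"{src} does not exist")
    dst = os.path.join(run_dir, expected_name)
    if os.path.islink(dst) or os.path.exists(dst):
        os.remove(dst)                     # replace any stale link from a prior run
    os.symlink(src, dst)
    return dst

# Hypothetical usage with an assumed tracer file name:
# link_into_run_dir("../etc/tracer_config/my_tracers.txt", run_dir, "tracers.txt")
```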

Example: I want to run the ARM-SGP case but set up a series of 5-10 experiments to run in parallel (since each run uses only 1 CPU), changing the SDF (suite definition file) and namelist between runs. I would prefer not to set up 5-10 different cases, which is how the multi-run mode works now, and even that does not solve the disk-quota problem of writing output into the scm/bin directory. A sketch of the desired workflow follows.
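A hypothetical driver for this workflow might look like the following. The case name, suite names, namelist file names, and the --namelist/--run-dir flags are all assumptions used for illustration; they are the requested behavior, not existing run_scm.py options.

```python
# Sketch: one case, several suite/namelist combinations, each run in its own
# directory on scratch, launched in parallel. All names/flags are illustrative.
import os
import subprocess

CASE = "arm_sgp_summer_1997_A"          # assumed case name for illustration
user = os.environ.get("USER", "user")

# (suite definition file, physics namelist) pairs; names are examples only.
EXPERIMENTS = [
    ("SCM_GFS_v16", "input_GFS_v16.nml"),
    ("SCM_RRFS_v1beta", "input_RRFS_v1beta.nml"),
]

procs = []
for suite, namelist in EXPERIMENTS:
    run_dir = f"/glade/scratch/{user}/scm_runs/{CASE}_{suite}"
    cmd = ["./run_scm.py", "-c", CASE, "-s", suite,
           "--namelist", namelist, "--run-dir", run_dir]
    procs.append(subprocess.Popen(cmd))   # each run occupies a single CPU

for p in procs:
    p.wait()
```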

In other words, the workflow and data-flow infrastructure lacks flexibility in its current implementation. With WRF, I compile in my /home directory, then copy the executables to run directories for sensitivity experiments, which avoids filling up the home partition/quota with model results. Maybe this is possible with the existing ccpp-scm, but it is not immediately obvious to me how to do so.

ligiabernardet commented 2 years ago

Greg, I think the suggestions you made here would help make a more flexible and usable CCPP SCM. As users, such as yourself, try to use the SCM in novel ways, new needs emerge. We can consider these enhancements for a future release. I appreciate you taking the time to write this down. @grantfirl Can you transfer this issue to the ccpp-scm repo? I don't seem to have the permission.

climbfuji commented 2 years ago

@gthompsnWRF I transferred this issue from ccpp-physics to ccpp-scm.