Usage of DrWatson for long-running simulations

sebastianpech commented 4 years ago

So as we are currently discussing various DrWatson workflows, I decided to explain mine for long-running simulations.

Some comments in advance

Turned out a bit longer. After polishing and discussion, this might be worth putting into the real world examples.
I'm definitely not utilizing all the features of DrWatson (actually only a small subset), and some of the stuff I do might be worth adapting or implementing as a new feature, let's see. I'm hoping for some discussion.
I'm not sure I covered every aspect; I will update the text accordingly.

The problem

Currently I'm running simulations on a remote machine (not a cluster), which I also use for development, so everything happens in an ssh session. This means that when I start a job, e.g. by julia run.jl SENB101, where SENB101 defines the parameter set to be used for the simulation, I need to keep the session open to keep the job running.

Therefore, to have a persistent running session (also in parallel) I use tmux on the remote system for spawning jobs. Effectively, this boils down to creating a new window in an existing tmux session running the above command. I can also start multiple simulations in parallel like so:

./run.sh SENB SENB{206..209}

run.sh creates a new tmux session SENB, loops over the names defined by SENB{206..209} and creates a new window for each of them, running the simulation process.
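run.sh itself is not shown in this post. As a rough sketch of the tmux commands such a script might issue, here is a Julia version that only builds the commands. The helper name tmux_commands is hypothetical, and the assumption is that the first job creates the detached session and every further job gets its own window in it.

```julia
# Hypothetical helper: build (but do not run) the tmux commands for one session.
function tmux_commands(session::AbstractString, names::Vector{<:AbstractString})
    isempty(names) && return Cmd[]
    # Command from the post; tmux receives it as a single shell-command argument.
    job(name) = "julia run.jl $name"
    first_name = first(names)
    # The first job creates the detached session; its window is named after the parameter set.
    cmds = Cmd[`tmux new-session -d -s $session -n $first_name $(job(first_name))`]
    # Every further job gets a new window in the same session.
    for name in names[2:end]
        push!(cmds, `tmux new-window -t $(session * ":") -n $name $(job(name))`)
    end
    return cmds
end

names = ["SENB$i" for i in 206:209]   # same names as SENB{206..209}
cmds = tmux_commands("SENB", names)
foreach(println, cmds)                # inspect the commands before starting anything
```

Replacing the last line with foreach(run, cmds) would actually start the jobs, provided tmux is installed on the machine.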

During the simulation multiple files are created. (This is just an overview of the ones important for the discussion). Simulations usually run for ~ 12 h.

log file
pvd file grouping the vtu files from folder
folder
- bson file for each time increment
- vtu file for each time increment

Parameter definition

I use Parameters.jl for defining the simulation parameters. One definition looks e.g. like this:

using Parameters

@with_kw struct SENB
    inp::String
    mat::String
    pmodel::String
    normalize_penalty::Bool=false
    or::String="none"
    β::Float64 = 1.0
    α::Float64 = 5.0
    Gc1::Float64 = 1.0
    ξ::Float64 = 2.0
    p::Int = 2
    ft1::Float64=60.0
    threshold_intensity::Int = 20
    @assert threshold_intensity>=1
end

The main advantages for me, over using Base.@kwdef are

the @assert call and
that I get @pack and @unpack. Given one of the latest PRs (https://github.com/JuliaDynamics/DrWatson.jl/pull/148) this is not so important anymore.

The main advantages for me, over using Dicts are

the above,
default values and most importantly
I can dispatch on the type SENB. This means I have functions run(p::SENB), run(p::DCB), ... which start the simulation with the given parameter p but perform different steps during initialization, e.g. a specific definition of boundary conditions.
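A minimal sketch of this dispatch pattern, using Base.@kwdef instead of Parameters.jl so that it runs without extra packages. The SENB fields are a subset of the definition above; the DCB fields, the function names boundary_conditions and run_simulation, and the returned strings are placeholders, not the actual simulation code.

```julia
# Reduced parameter types; SENB fields are taken from the definition above.
Base.@kwdef struct SENB
    inp::String
    mat::String
    pmodel::String
    threshold_intensity::Int = 20
end

# The fields of DCB are not given in the post; these are hypothetical.
Base.@kwdef struct DCB
    inp::String
    mat::String
end

# One method per parameter type: only the type-specific setup differs.
boundary_conditions(::SENB) = "SENB-specific boundary conditions"
boundary_conditions(::DCB) = "DCB-specific boundary conditions"

# Shared entry point; Julia selects the matching boundary_conditions method.
function run_simulation(p)
    bc = boundary_conditions(p)
    return "$(nameof(typeof(p))): $(p.inp) with $bc"
end

println(run_simulation(SENB(inp = "a", mat = "b", pmodel = "T")))
println(run_simulation(DCB(inp = "c", mat = "d")))
```

The shared part of the simulation stays in one function, and adding a new setup only requires a new struct plus one boundary_conditions method.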

All parameters that I use in the simulation are stored in a dict in the file scripts/config.jl like so:

parameters = Dict(
    "SENB13" => SENB(...),
    "SENB14" => SENB(...),
    "DCB001" => DCB(...),
)

This file is included in every script with include(scriptsdir("config.jl")).

Tying parameter configurations to the folder structure

DrWatson's savename allows creating a name from a parameter configuration, like

SENB_Gc1=1_Gc2=0.09_Gc3=0.36_ft1=60_ft2=4.2_ft3=4.5_inp=Moura2008-Fine_mat=Moura2008_normalize_penalty=false_p=2_pmodel=T_threshold_intensity=50_α=5_β=2_ξ=2

I use this name for naming the above folder, log-file and pvd-file. This way every file is related to the parameter configuration, either by its name or the folder it is stored in. In Finder this looks like this:

For analysing the results by code, I don't really mind the folder structure. I can create paths from the parameter configuration. However, finding the one file with Gc1=1 and pmodel=T by hand is tough. So for this situation it's much easier to have a folder structure grouped by the parameter set identifier.

Therefore, I always prepend the parameter set identifier to the output directory, so the command using savename looks e.g. like this:

datadir("sims", arg, savename("SENB", parameters[arg]))

This way I can easily find the one file I'm looking for, while still knowing which parameter configuration created it. Given the discussion in https://github.com/JuliaDynamics/DrWatson.jl/issues/151, the folder structure could be a lot cleaner by storing the metadata in a central database instead of the filename.

Features I can't use

Because I have no single result file, I can't use

any of the tagging functions, unless I store an extra file just containing the commit info
produce_or_load
collect_results

Features I can use

All path functions datadir, scriptsdir
The folder structure
@quickactivate :Projectname allowing loading of default configurations

Features that I would like to have

just dumping them here, not thought through

Easier handling of multiple files coming from one simulation
Attaching metadata to any file or folder
Possibly some more, will update

JonasIsensee commented 4 years ago

You mentioned a central database and attaching metadata to files/ folders.

I suppose one could make it an option (or do it manually) to write metadata in a simple format such as .json along with the files (same name, or just in the same folder if every simulation gets its own folder). Then one could probably make collect_results read just the metadata files, i.e. by restricting the suffix to .json. That would create a central database that could hold whatever you want it to, limited only by what you care to put into the metadata files.
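A rough sketch of this idea using only Julia's standard library. Two things are swapped in and are assumptions rather than DrWatson functionality: the metadata is written as TOML instead of .json (the TOML parser ships with Julia), and collect_metadata is a hand-written stand-in for collect_results. The helper names, the file name metadata.toml, and the example values are all hypothetical.

```julia
using TOML

# Hypothetical helper: write the parameters of one simulation next to its output files.
function write_metadata(dir::AbstractString, meta::AbstractDict)
    mkpath(dir)
    open(joinpath(dir, "metadata.toml"), "w") do io
        TOML.print(io, meta)
    end
    return nothing
end

# Hand-written stand-in for collect_results: gather every metadata.toml below `root`.
function collect_metadata(root::AbstractString)
    rows = Dict{String,Any}[]
    for (dir, _, files) in walkdir(root)
        if "metadata.toml" in files
            row = TOML.parsefile(joinpath(dir, "metadata.toml"))
            row["path"] = dir          # remember which folder the metadata describes
            push!(rows, row)
        end
    end
    return rows
end

# Two made-up simulations in a temporary directory.
root = mktempdir()
write_metadata(joinpath(root, "SENB13"), Dict("pmodel" => "T", "Gc1" => 1.0))
write_metadata(joinpath(root, "SENB14"), Dict("pmodel" => "S", "Gc1" => 2.0))

db = collect_metadata(root)
matches = filter(r -> r["pmodel"] == "T" && r["Gc1"] == 1.0, db)
println(basename(only(matches)["path"]))
```

With the metadata stored this way, the folder names could stay short, and looking up "the run with Gc1=1 and pmodel=T" becomes a filter over the collected rows.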

mbruna commented 3 years ago

Hi Sebastian, thanks for this post. I'm starting with DrWatson and I think your suggestion here of using @with_kw struct will suit my project better than using Dict. One question I have though is: how easy is it to adapt the standard workflow in DrWatson using Dicts to structs? In particular, I am running many simulations in parallel using pmap, and I found the feature of creating a Dict of arrays (arrays for those parameters that I am varying over simulations) and then using the DrWatson function dict_list() very convenient. What would be the best way to set up something similar but with a struct? Thanks!

Datseris commented 3 years ago

I think you should open up a new Issue, as a feature request, that asks for the equivalent of dict_list function for structs. I believe it will be something really easy to do!

EDIT: Just do what Jonas said, much smarter :P

JonasIsensee commented 3 years ago

Hi @mbruna ,

if you want to pass structs to your simulations as the parameters, I think it could be easiest to do this:

julia> using DrWatson

julia> Base.@kwdef struct MyParams
           α::Float64 = 1.0
           β::Int = 42
       end
MyParams

julia> params = dict_list(Dict(
           :α => [2.0, 3.0],
           :β => @onlyif(:α == 3, 4)
           )) .|> (p->MyParams(; pairs(p)...))
2-element Vector{MyParams}:
 MyParams(2.0, 42)
 MyParams(3.0, 4)

So essentially you use everything as normal, and only this bit at the end, .|> (p->MyParams(; pairs(p)...)), creates a struct by passing the dictionary entries to the constructor.
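The conversion step does not depend on DrWatson itself. A small sketch, assuming a hand-written vector of dictionaries as a stand-in for the output of dict_list:

```julia
Base.@kwdef struct MyParams
    α::Float64 = 1.0
    β::Int = 42
end

# Stand-in for the dict_list output: one Dict with Symbol keys per parameter combination.
dicts = [Dict(:α => 2.0), Dict(:α => 3.0, :β => 4)]

# Splat each Dict as keyword arguments; missing keys fall back to the struct defaults.
params = [MyParams(; d...) for d in dicts]
foreach(println, params)
```

Because the keys are splatted as keyword arguments, they must be Symbols matching the field names; a key that is not a field of the struct raises an error.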

sebastianpech commented 3 years ago

Hi @mbruna, so the post is a bit outdated and I switched to a better approach that uses dicts only for the application of dict_list. I created a separate package (https://github.com/sebastianpech/DrWatsonSim.jl) for those long-running simulations. At the moment it is in parts pretty much tailored to how I run my simulations, however, it can easily be adapted and generalized. So you can look into that.

With regards to passing structs, I'm basically doing it like @JonasIsensee suggested.

mbruna commented 3 years ago

Thanks @JonasIsensee, @sebastianpech, @Datseris for your prompt replies. That's exactly what I was after, thanks.
