JuliaDynamics / DrWatson.jl

The perfect sidekick to your scientific inquiries
https://juliadynamics.github.io/DrWatson.jl/stable/
Other
825 stars 91 forks source link

Add Abiliity to Pass Custom Load Function to `collect_results!` #420

Open NuclearPowerNerd opened 1 month ago

NuclearPowerNerd commented 1 month ago

Firstly, thank you writing this amazing package. I love it and have used it for over a year now in my academic work.

I have been trying to take full advantage of every capability DrWatson has to offer and I've pretty much done that except for collect_results!. I'd really like to be able to take advantage of this because I have many many simulations and a summary table of the most important parameters would really help when preparing results and reviewing what additional runs I need to make.

Here is my problem and I apologize if there is already an obvious solution and I missed it. I save all my simulation results in a struct around which I've built a lot of methods. My structs contain all the parameters that are important to my simulations as well as my results themselves. Something like this

struct MySim
    Param1::Float64
    Param2::Int64
    Result1::Matrix{Float64}
    # and so on, many more 
end

It is important that when I save this to disc, it is saved as a struct and not as a dict. That means that when I save it I do something like this

wsave("full_path_to_result.jld2", "solution", myresult) # where myresult is of type MySim

And when I load I have to do

result = wload("full_path_to_result.jld2")["solution"]

But this seems problematic for collect_results! which expects (as far as I can tell) that myresult has passed through struct2dict. I really only need this step if I am printing a summary table to file. Thus, I'm wondering can we add a kwarg to collect_results! that wraps the call to wload just within the source of collect_results!? I'm happy to write the pull request but I wanted to make sure this was sensible before I did. I'm thinking something like this.

function collect_results!(filename, folder;
    valid_filetypes = [".bson", "jld", ".jld2"],
    subfolders = false,
    rpath = nothing,
    verbose = true,
    update = false,
    newfile = false, # keyword only for defining collect_results without !
    rinclude = [r""],
    rexclude = [r"^\b$"],
    load_function = () -> ()
    kwargs...)

# Skipping some code here
data = load_function(wload(filename))

This way, I could implement a load function like this and I'd get the desired behavior.

load_function = (d) -> struct2dict(d["solution"])

Perhaps this is too specialized for my use case? I'm not sure. Please let me know if this could be achieved in a better fashion or if I am missing something here. The only thing I insist on is that I'd like to maintain saving my results as structs and not converting them to dictionaries. If I converted them to dicts I'd have to define new functions to rebuild the structs from the loaded dicts which I'd like to avoid.

Datseris commented 1 month ago

thanks for the good words!

The simple solution i see is to add a new keyword argument to collect_results that is the load function itself. It would default to wload but could be overloaded by anything. Such as your load_function above.

This would be a very simple change to do in a Pull Request if you are up for it!

NuclearPowerNerd commented 1 month ago

Created new PR, now in #424