Closed tomchor closed 1 year ago
What about something like this?
...
model = MyFavModel(..., tracers = :c, ...)
# Don't initialize the tracer, i.e., let c be zero
spinup_time = 10minutes
simulation = Simulation(model, Δt=10.0, stop_time=spinup_time)
run!(simulation)
the_rest_of_the_time = 10days
set!(model, c=initial_c)
simulation.stop_time = spinup_time + the_rest_of_the_time
run!(simulation)
...
You don't need to be outputting c
for the spinup period. Only add an output writer for c
after the spinup?
Btw, when I am doing small exploratory runs I only need to "pay compilation time" once per model. Constructing another model in the same REPL session doesn't have any extra compilation costs unless I change something in the source code (Oceananigans source code; node my scripts).
I think my gut feeling would be against implementing such a feature, mostly because of the test and maintain part you mentioned. Also because I can see sort-of-easy ways around it. Perhaps I'm missing something? E.g. I don't really quite understand what do you mean by "safe to pick-up" and "due to the way pick-ups work" in
It makes for more complex code, especially when figuring out when it's safe to pick-up a simulation or not, due to the way pick-ups work.
I agree with Navid, it is better to have a more complicated script for cases this specific than a complicated source code.
I agree with @navidcy's points. Of course, it doesn't even matter whether the tracer is initialized or not (referring to the comment in @navidcy's script). The main point is that you can "re-initialize" a state whenever you like. I'd also like to make the extra point that @navidcy's pattern is interpretable and readable. I'm not sure we would achieve the same if we hide such a feature inside the source code. Does @navidcy's suggestion work for you @tomchor ?
I also think it is preferred to use separate files for a situation like this (though it may not be necessary to save intermediate times in the spin-up at all --- so a second file might not be necessary). If the spin up is recorded, I think it's better to use a separate file for its data.
If the spin up is expensive, the following performance optimization could be used (we can leverage the velocities
kwarg to save a bit of allocation):
# Spin up with no tracers
model = NonhydrostaticModel(; tracers=nothing, kwargs...)
simulation = Simulation(model, ...) # etc
run!(simulation)
# Run for real, re-using the old `velocities` fields but overwriting the old `model`
model = NonhydrostaticModel(; tracers=:c, velocities=model.velocities, kwargs...)
simulation = Simulation(model, ...) # etc
There is some additional memory allocation for tendencies in this case, however, so it may not work for simulations that push GPU memory.
(I'm not aware of any way to "remove" the old Simulation from the GPU memory, but if there is, then this downside can be negated.)
This occurs automatically with garbage collection, provided that there's no reference to the old simulation in the name space. CUDA may have a way to manually call the garbage collector, after doing something like model=nothing; simulation=nothing
. If you figure that out, it'd be nice to know.
It makes for more complex code, especially when figuring out when it's safe to pick-up a simulation or not, due to the way pick-ups work.
I don't follow, but if there's something to improve about picking simulations up we should pursue that.
PS @milankl may be interested in this discussion, because it illustrates the importance of being able to modify tracer fields mid-run rather than requiring that they are initialized when the model is built.
Thanks, everyone. I agree with the major points here. To answer some specific comments:
Btw, when I am doing small exploratory runs I only need to "pay compilation time" once per model. Constructing another model in the same REPL session doesn't have any extra compilation costs unless I change something in the source code (Oceananigans source code; node my scripts).
This is mostly because the majority of my exploratory runs are run in the GPU, and since I have limited GPU time I try to not leave interactive GPU sessions open. If I unlimited access to a GPU (or in the cases where I can explore on the CPU), then I agree with your point.
I agree with Navid, it is better to have a more complicated script for cases this specific than a complicated source code.
Again, agree. I posted this more because, if this was something a lot of other people were doing, it might be worth to maintain the infrastructure. But since it sounds like that's not the case, then I agree it's best to have complex user scripts and keep the source code simple.
Does @navidcy's suggestion work for you @tomchor ?
Yes, thanks for the suggestion @navidcy. I think this is the next best thing. The one disadvantage for me is that is "wastes" computation advecting tracers in the spin-up, but it has the huge advantage of keeping the source code simple, with also a readable user script.
Just for future reference, I ended up modifying @navidcy's approach and using a Callback
to set the concentrations mid-simulation. It proved to be easier.
Huh, why is that easier? This judgement might be specific to your use case, so I wouldn't recommend taking that approach in general (for future readers of this issue). There is an important downside: the script is harder to read and interpret. So I'd regard that as less-than-best practice unless there's some specific reason to do it. (One reason could be: many similar simulations are being created using a function.)
I thought of another way to achieve this performance optimization without any source-code-specific feature. For a simulation without buoyancy, for example, this might work:
# Construct the full model with all desired tracers
model = NonHydrostaticModel(; grid, tracers, ...)
model_properties = []
for name in propertynames(model)
if name == :tracers
push!(model_properties, nothing)
else
push!(model_properties, getproperty(model, name))
end
end
# Build a "spin-up" model using the inner constructor for NonhydrostaticModel, with tracers=nothing
spin_up_model = NonhydrostaticModel(model_properties...)
I believe that spin_up_model
will run without evolving tracers.
If you require some tracers (ie active tracers like buoyancy) then things are slightly more complicated. You have to replace tracers=nothing
with the appropriate NamedTuple
. Also, I think you need to ensure that the active tracers are "first" in the list of tracers when the full model is built. Something like
# Construct the full model with all desired tracers
model = NonHydrostaticModel(; grid, tracers=(T, S, c1, c2, c3, c4), ...)
active_tracers = (T = model.tracers.T, S = model.tracers.S)
model_properties = []
for name in propertynames(model)
if name == :tracers
push!(model_properties, active_tracers)
else
push!(model_properties, getproperty(model, name))
end
end
# Build a "spin-up" model using the inner constructor for NonhydrostaticModel, with tracers=(; T, S)
spin_up_model = NonhydrostaticModel(model_properties...)
@tomchor I think it's best to convert this to a discussion. Is that ok with you?
@tomchor I think it's best to convert this to a discussion. Is that ok with you?
Yup. Go ahead
Sometimes it's useful to run a simulation with tracers and have the tracers start evolving only after a certain spin-up period. Since https://github.com/CliMA/Oceananigans.jl/pull/2938 this is possible by running a simulation without tracers, saving a checkpoint file from it, building a new model that is identical to the previous one but with extra tracers, manually
set!()
the prognostic fields from the checkpoint file, and then run that model. This works as intended, but as I've realized since then it's got a few downsides. In no particular order:Simulation
s can take a long time for small runs. When your code is production-ready and you're running the final big simulations that's okay, since those extra minutes of compilation are small compared to the many hours of run. But when you're doing small exploratory simulations (which may be the majority of times you run your code), then that extra compilation time required to build an extra model, simulation, writers, etc., can increase the code run-time significantly.Simulation
from the GPU memory, but if there is, then this downside can be negated.)I think if we implement a way to build just one model which has every tracer needed and (optionally ofc) specify start times for each tracer (before which the tracers would just not be evolved in time) it would solve all the of the problems above.
The downsides that I can think of are:
Maybe the biggest issue is that I'm not sure this is something enough people actually want to do with their simulations. This has been a relatively common thing for my research, and I know @whitleyv does this too, but maybe we're the exception?
cc @whitleyv