Closed matt-graham closed 11 months ago
Hi Matt, thanks for reaching out! I've seen your ParticleDA.jl efforts and would love to collaborate. I think I've talked with one of you also at last year's JuliaCon in London. Great to see that there's progress!!
1) v0.5 vs v0.6: I see that you have been using the current release v0.5, however could I ask you to switch to the main branch? I'll release it soon, just adding some documentation these days. Reason is that with v0.6 the interface to changes, as we want to deprecate the run_speedy!
/initialize_speedy!
interface. You can see examples of the new interface in How to run SpeedyWeather.jl
2) But good news is that the underlying logic you are worried about regarding prognostic/diagnostic variables and the model struct isn't changing: As you correctly figured out (note to self to document this better!) we have (with small exceptions see below) a) the prognostic variables define the current state of the model, the diagnostic variables are entirely determined by the prognostic variables and the model b) the model is constant at runtime c) Unless the initital conditions are chosen after the model creation the model can reconstruct the prognostic variables
So to your point of "make me unsure" that's indeed as you describe.
However, the exceptions are as follows
1) Clock
(which is only in v0.6 btw) sits in model and counts up the time, at the moment this is only used for output. There are conceptually some parameterizations like radiation that would use Clock
to determine things like time of day, radiation at that moment and therefore a temperature tendency so that indeed Clock
can influence the prognostic variables. This sounds like a good reason to me to move the Clock
into the prognostic variables, but if you are worried about this for now: there's currently no active parameterization that uses Clock
and so it's only used for output purposes. Going forward, there'll always be a way to disable that, e.g. the radiation code will be able to have daily_cycle=false
, seasonal_cycle=false
such that the Clock
state doesn't matter for the future evolution of the prognostic variables.
2) leapfrogging: Yes I agree with what you say. At the beginning of the model integration only lf=1
matters and lf=2
is overwritten due to the Euler forward step. During the model integration (i.e. for each time step after the first) both lf=1
and lf=2
determine the future state of the prognostic variables.
3) Yes, the idea is that information from the diagnostic variables propagates from one time step to the next only through the prognostic variables. The tendencies are completely overwritten every time. I do want to stick to that logic and there's currently only two places where we have to be careful (both of which can be disabled):
a) Horizontal diffusion: At the moment there's an option to have HyperDiffusion(;spectral_grid,adaptive=true)
which means that based on the current state of the diagnostic variables, the hyper diffusion "constants" are recalculated to allow for a more selective diffusion which therefore would impact the prognostic variables. But at the moment this does not leave any information left in diagnostic variables/model and is done every time step.
b) Semi-implicit time stepping. This requires a vertical temperature profile, which can be chosen as constant ImplicitPrimitiveEq(;spectral_grid,recalculate=typemax(Int))
(reminder to self to allow Inf
to be passed on too), or can occasionally be recalculated. When recalculated there would be information propagating from a previous diagnostic state to the model (where the .implicit
sits) and then back to the prognostic variables.
On random initial conditions/stochastic parameterizations: At the moment the only random aspect of the model are the StartWithRandomVorticity
initial conditions for the barotropic model. There are some plans to make other parameterizations stochastic, that there'll always be an option to disable that.
Thanks for the detailed and very helpful response @milankl :grin:
I think I've talked with one of you also at last year's JuliaCon in London. Great to see that there's progress!!
Yep I think @DanGiles may have chatted to you.
- v0.5 vs v0.6: I see that you have been using the current release v0.5, however could I ask you to switch to the main branch? I'll release it soon, just adding some documentation these days. Reason is that with v0.6 the interface to changes, as we want to deprecate the
run_speedy!
/initialize_speedy!
interface. You can see examples of the new interface in How to run SpeedyWeather.jl
Okay yep, will switch to using interface on main
branch; I'd noticed this in the process of changing but wasn't sure how imminent a new release would be.
- But good news is that the underlying logic you are worried about regarding prognostic/diagnostic variables and the model struct isn't changing: As you correctly figured out (note to self to document this better!) we have (with small exceptions see below) a) the prognostic variables define the current state of the model, the diagnostic variables are entirely determined by the prognostic variables and the model b) the model is constant at runtime c) Unless the initital conditions are chosen after the model creation the model can reconstruct the prognostic variables
So to your point of "make me unsure" that's indeed as you describe.
Thanks - that's all super helpful to confirm. And it's a credit to how readable your code is that I was able to figure some of this out without it being explicitly documented 🙂
However, the exceptions are as follows
Clock
(which is only in v0.6 btw) sits in model and counts up the time, at the moment this is only used for output. There are conceptually some parameterizations like radiation that would useClock
to determine things like time of day, radiation at that moment and therefore a temperature tendency so that indeedClock
can influence the prognostic variables. This sounds like a good reason to me to move theClock
into the prognostic variables, but if you are worried about this for now: there's currently no active parameterization that usesClock
and so it's only used for output purposes. Going forward, there'll always be a way to disable that, e.g. the radiation code will be able to havedaily_cycle=false
,seasonal_cycle=false
such that theClock
state doesn't matter for the future evolution of the prognostic variables.
That's helpful. Having time-dependent dynamics like this would not strictly be a problem for us as we allow for the state updates to depend on the current time index, but from an implementation perspective it would ease things for us if we can assume that we can reuse the same model setup struct on multiple states within an ensemble without needing to reset the clock state, so if the Clock
struct was moved to the prognostic_variables
that would definitely be helpful for us.
- leapfrogging: Yes I agree with what you say. At the beginning of the model integration only
lf=1
matters andlf=2
is overwritten due to the Euler forward step. During the model integration (i.e. for each time step after the first) bothlf=1
andlf=2
determine the future state of the prognostic variables.
👍
- Yes, the idea is that information from the diagnostic variables propagates from one time step to the next only through the prognostic variables. The tendencies are completely overwritten every time. I do want to stick to that logic and there's currently only two places where we have to be careful (both of which can be disabled): a) Horizontal diffusion: At the moment there's an option to have
HyperDiffusion(;spectral_grid,adaptive=true)
which means that based on the current state of the diagnostic variables, the hyper diffusion "constants" are recalculated to allow for a more selective diffusion which therefore would impact the prognostic variables. But at the moment this does not leave any information left in diagnostic variables/model and is done every time step. b) Semi-implicit time stepping. This requires a vertical temperature profile, which can be chosen as constantImplicitPrimitiveEq(;spectral_grid,recalculate=typemax(Int))
(reminder to self to allowInf
to be passed on too), or can occasionally be recalculated. When recalculated there would be information propagating from a previous diagnostic state to the model (where the.implicit
sits) and then back to the prognostic variables.
Thanks - good to know about these cases, will look at either explicitly disabling or adding some assertions to check these options aren't used so we can safely assume there is no flow of model state through diagnostic variables.
On random initial conditions/stochastic parameterizations: At the moment the only random aspect of the model are the
StartWithRandomVorticity
initial conditions for the barotropic model. There are some plans to make other parameterizations stochastic, that there'll always be an option to disable that.
Again thanks for the clarification!
From my perspective happy to close this issue as you've resolved all my queries, but from labels and comments above looks like you might want to keep this open as a prompt for adding documentation on some of this, so will leave open for now.
I think with #350 closed we can also close this here now! If there's anything else on this topic, feel free to reopen!
Thanks for the fantastic work on this package!
We are interested in using SpeedyWeather with a package ParticleDA.jl we are developing for performing particle filtering / data assimilation. We have an existing integration with the original FORTRAN SPEEDY implementation but as this is a bit brittle and difficult to deploy, and lacking in flexibility compared to SpeedyWeather, it would be great if we could instead switch to using SpeedyWeather instead.
I've been doing some initial explorations of how we would do this, but I'm not sure if my mental model for how the state of models in SpeedyWeather is encapsulated is correct from digging through the source code a bit more. ParticleDA assumes the model of interest is specified as a state space model , with the state of the model at a particular time step fully determined by a vector of some fixed dimensionality. I have been working on the assumption that the state of SpeedyWeather models is fully defined by the spectral coefficients in
prognostic_variables
for leapfrog indexlf=1
, so that a (deterministic) update to a state vectorstate
according to the model could be performed something likewhere
update_prognostic_variables_from_state_vector!
andupdate_state_vector_from_prognostic_variables!
deal with mapping from the (complex) coefficients in the lower triangular matrices in aPrognosticVariables
struct to a flat real vector and vice versa, andprognostic_variables
,diagnostic_variables
andmodel_setup
correspond to the initialized data structures returned byinitialize_speedy
. This assumes that thediagnostic_variables
andmodel_setup
structures are only used to read constants from, write output to not used in subsequent time steps and for holding buffers used for writing intermediate results to. In particular I am assumingdiagnostic_variables
is fully determined by the value ofprognostic_variables
passed torun_speedy!
, but is passed as an input to avoid the need for allocating on each call.The code elements which I've come across so far which make me unsure about the interpretation above:
Clock
struct in themodel_setup
struct, and this is updated as part of the timestepping so that we would need to manually reset the clock data to haverun_speedy!
perform the same computations for a given state. So themodel_setup
struct is at least partially stateful.PrognosticVariables
struct therefore has memory allocated for defining all spectral coefficients at two timesteps. I think based on reading the code insrc/dynamics/time_integration.jl
andsrc/dynamics/initial_conditions.jl
that only the values corresponding to indexlf=1
need to be initialised, with the values for the subsequentlf=2
index bootstrapped by performing an initial pair of forward Euler and (unfiltered) leapfrog steps, so that thelf=2
values in thePrognosticVariables
struct are effectively only an internal buffer that is always overwritten at the start of timestepping. Is this correct?DiagnosticVariables
struct for each of the quantities represented inPrognosticVariables
. From https://github.com/SpeedyWeather/SpeedyWeather.jl/blob/d00b8017eecf2da25ad1801873d7fab09d6ab221/src/dynamics/time_integration.jl#L173-L191 it looks like the tendencies are zeroed at the beginning of each timestep. Does this mean it's safe to assume that any tendencies values written to aDiagnosticVariables
struct in a previous time step will not be accessed / can be safely ignored?As a related question, is it safe to assume a call to
run_speedy!
is deterministic for fixed values of the prognostic and diagnostic variables and model setup? The only places I can currently identify where a random number generator is used is in theinitial_conditions!
method associated withStartWithRandomVorticity
. It looks like from #199 this may change at some point in the future?