Closed kyllingstad closed 4 months ago
In case someone is in the middle of reviewing this, please note that I just pushed a commit which simplifies the changes to slave_simulator significantly.
In brief, rather than trying to save the entire internal state, which includes cached variable values and modifiers, we leave all state saving to the slave implementations and FMU code. Simulators which have active variable modifiers will simply refuse to save their state. I was never comfortable with my original attempt, because I wasn't sure that modifiers were properly saved. They're just std::function objects, which can point to any callable object, and there is no guarantee that copying one actually makes a deep copy of its entire state.
If it turns out that saving state which includes modifiers is a necessary feature, we can revisit it and make a proper implementation later.
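To illustrate the concern about copying modifiers, here is a minimal sketch, assuming a modifier whose state lives behind a std::shared_ptr. The names make_drifting_modifier and copy_shares_state are hypothetical and not part of the library. Copying the std::function copies the pointer, not the value it points to, so the "saved" copy keeps changing along with the original.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical modifier: adds a running offset that lives on the heap.
std::function<double(double)> make_drifting_modifier()
{
    auto offset = std::make_shared<double>(0.0);
    return [offset](double x) {
        *offset += 1.0;  // mutates state shared by every copy of this lambda
        return x + *offset;
    };
}

// Returns true if calls on the original are visible through the "saved" copy.
bool copy_shares_state()
{
    auto modifier = make_drifting_modifier();
    auto saved = modifier;     // copies the shared_ptr, not the double
    modifier(0.0);             // offset becomes 1.0
    modifier(0.0);             // offset becomes 2.0
    return saved(0.0) == 3.0;  // an independent deep copy would return 1.0
}
```

A modifier that captures only plain values would copy cleanly, but the simulator cannot tell the two kinds apart from the std::function alone.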
I changed the target branch for this from master to the new dev/state-persistence branch now. I am splitting the work on #756 and #757 over several PRs so it can be reviewed in manageable chunks, but I worry that I won't have the full picture of the changes needed before everything is done. Therefore, I'd like to keep it out of master until it's more mature. The dev/state-persistence branch can be merged into master when everything is done and we are happy with it.
In brief, rather than trying to save the entire internal state, which includes cached variable values and modifiers, we leave all state saving to the slave implementations and FMU code. Simulators which have active variable modifiers will simply refuse to save their state.
The effect of the variable modifier on the simulation will be seen in the saved states, so in principle this omits information necessary for e.g. fully transparent rewind / playback functionality for a whole simulation where these "user actions" must also be tracked. I guess there's also the question of what exactly can happen if the FMU is restored to a previous state, but our cached values aren't.
The data intended to be saved in a void* fmi2_FMU_state_t is by definition completely unknown by the caller - it's whatever the implementing FMU needs to restore its state later? In other words there's no guarantee that the data there is suitable for certain uses, like serialization? Does that mean fmi2_capi_serialize_fmu_state should be used for serializable data instead?
I can't really find any information about how these functions are intended to be implemented by the FMU. Should we cooperate on some example FMUs? Is the Dahlquist FMU intended for testing the FMU state API?
Another question I have - say we want to save all state by default. Does this affect the current implementation - for example, do we need to start thinking about keeping a circular buffer of states for a configurable duration, for example? Is this to be done in execution::step, with saving & restoring state acting as a form of manipulator, or directly on each model instance within algorithm::do_step?
Many good questions! I'll try to answer, but first, let me clarify something: This PR is not about serialisation at all. I split #765 into two tasks, where this PR addresses only the first one, namely saving the state in the FMU instance's internal memory. (I am almost done with the second task too, namely to enable serialisation of saved states for individual subsimulators. A PR on this is forthcoming. After that, I'll turn to #757, which is about saving, serialising, and persisting the entire simulation state to disk.)
That said, there is a use case for just being able to save states in memory too: It can be used by "re-stepping" algorithms, e.g. algorithms that roll back the last step(s) to a previous state if the error is too large, in order to repeat them with a smaller step size.
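As a rough illustration of such a re-stepping algorithm, here is a sketch built on a toy model. It assumes an explicit Euler integrator for dx/dt = -k*x and a step-doubling error estimate. toy_simulator and adaptive_step are hypothetical names, and only the idea of saving a state and getting back an index is borrowed from this PR.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Toy stand-in for a subsimulator: explicit Euler on dx/dt = -k*x.
struct toy_simulator
{
    double x = 1.0;
    double k = 1.0;
    std::vector<double> saved;  // in-memory saved states, addressed by index

    int save_state()
    {
        saved.push_back(x);
        return static_cast<int>(saved.size()) - 1;
    }
    void restore_state(int index) { x = saved.at(index); }
    void do_step(double dt) { x += dt * (-k * x); }
};

// Tries a step of size dt. If one full step and two half steps differ by
// more than tol, rolls back to the saved state and retries with dt/2.
// Leaves the simulator at the two-half-step result and returns the step
// size that was accepted.
double adaptive_step(toy_simulator& sim, double dt, double tol)
{
    const int start = sim.save_state();
    for (;;) {
        sim.do_step(dt);
        const double coarse = sim.x;

        sim.restore_state(start);  // roll back and redo with half steps
        sim.do_step(dt / 2);
        sim.do_step(dt / 2);
        const double fine = sim.x;

        if (std::abs(fine - coarse) <= tol) return dt;

        sim.restore_state(start);  // error too large: roll back, shrink step
        dt /= 2;
    }
}
```

Nothing here needs serialisation. The saved state only has to live as long as the step that might be repeated.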
The effect of the variable modifier on the simulation will be seen in the saved states, so in principle this omits information necessary for e.g. fully transparent rewind / playback functionality for a whole simulation where these "user actions" must also be tracked.
I don't think the FMI state saving/serialisation functions were designed for playback. Their goal is to save the precise simulation state at a certain point in time, so you can
We don't need information about what has happened in the past for either of these use cases, only the complete state of the system at present.
In other words, it doesn't matter if modifiers have been applied and then disabled before we save the state, nor whether we intend to apply some modifiers after we have restored the state again.
I guess there's also the question of what exactly can happen if the FMU is restored to a previous state, but our cached values aren't.
Yeah, that was a challenging point of this work. I have addressed it by calling set_variables() to transfer all cached values to the FMU instance before saving the state, and by calling get_variables() to repopulate the cache after restoring the state. That way, I basically hand over the responsibility for saving the variable values to the FMU (which a properly FMI-conforming FMU is supposed to handle correctly).
But that would not work as easily if modifiers were involved, so for now, I just want to forbid modifiers at the save point. We can revisit it later with a more sophisticated solution if the need arises, but for now, I think I'd like to gain some experience with the current, limited solution.
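The order of operations described above can be sketched roughly as follows. This is a simplified model, not the actual slave_simulator code. toy_slave and caching_simulator are hypothetical types, and the no-argument set_variables() and get_variables() here only mimic the role of the real calls.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <vector>

// Toy stand-in for an FMU instance that saves its variables internally.
struct toy_slave
{
    std::map<int, double> vars;
    std::vector<std::map<int, double>> saved;

    int save_state()
    {
        saved.push_back(vars);
        return static_cast<int>(saved.size()) - 1;
    }
    void restore_state(int index) { vars = saved.at(index); }
};

// Hypothetical simulator wrapper with a value cache and optional modifiers.
struct caching_simulator
{
    toy_slave slave;
    std::map<int, double> cache;  // values set but not yet sent to the slave
    std::map<int, std::function<double(double)>> modifiers;

    // Transfers all cached values to the slave.
    void set_variables()
    {
        for (const auto& entry : cache) slave.vars[entry.first] = entry.second;
    }
    // Repopulates the cache from the slave.
    void get_variables() { cache = slave.vars; }

    int save_state()
    {
        if (!modifiers.empty()) {
            throw std::runtime_error("cannot save state with active modifiers");
        }
        set_variables();  // flush the cache so the slave holds everything
        return slave.save_state();
    }

    void restore_state(int index)
    {
        slave.restore_state(index);
        get_variables();  // refresh the cache from the restored slave
    }
};
```

With this split, the wrapper never stores variable values as part of a saved state. It only makes sure the slave has them before saving and reads them back after restoring.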
The data intended to be saved in a void* fmi2_FMU_state_t is by definition completely unknown by the caller - it's whatever the implementing FMU needs to restore its state later? In other words there's no guarantee that the data there is suitable for certain uses, like serialization?
From the perspective of the co-simulation master, the fmi2_FMU_state_t pointer is completely opaque. It is just a handle that we use to refer to a state that has been saved internally in the FMU instance.
Does that mean fmi2_capi_serialize_fmu_state should be used for serializable data instead?
"In addition", not "instead". First you save the state to get an fmi2_FMU_state_t handle, then you pass that handle to fmi2_capi_serialize_fmu_state() to get a version of the state which is suitable for storage and later deserialisation. Working on it! :)
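The two-step order (save to get an opaque handle, then serialise that handle) can be mimicked with a small mock. This sketch does not call FMI Library. mock_fmu and every function below are hypothetical stand-ins, and the "state" is a single double purely for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

struct mock_fmu
{
    double x = 0.0;
};

using state_handle = void*;  // opaque to the caller, like fmi2_FMU_state_t

// Step 1: save the state inside the "FMU" and hand back an opaque handle.
state_handle get_state(const mock_fmu& fmu) { return new double(fmu.x); }
void set_state(mock_fmu& fmu, state_handle s) { fmu.x = *static_cast<double*>(s); }
void free_state(state_handle s) { delete static_cast<double*>(s); }

// Step 2: turn a handle into bytes that can be stored elsewhere.
std::vector<char> serialize_state(state_handle s)
{
    std::vector<char> bytes(sizeof(double));
    std::memcpy(bytes.data(), s, sizeof(double));
    return bytes;
}

// The reverse of step 2: bytes back into a fresh handle.
state_handle deserialize_state(const std::vector<char>& bytes)
{
    auto* value = new double;
    std::memcpy(value, bytes.data(), sizeof(double));
    return value;
}

// Saves, serialises, clobbers the FMU, then restores from the bytes alone.
double round_trip(double x)
{
    mock_fmu fmu;
    fmu.x = x;

    state_handle handle = get_state(fmu);
    const std::vector<char> blob = serialize_state(handle);
    free_state(handle);  // the blob no longer depends on the handle

    fmu.x = -1.0;
    state_handle restored = deserialize_state(blob);
    set_state(fmu, restored);
    free_state(restored);
    return fmu.x;
}
```

Only the mock knows that the handle points to a double. The caller treats both the handle and the byte buffer as opaque.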
I can't really find any information about how these functions are intended to be implemented by the FMU. Should we cooperate on some example FMUs?
The FMI Library functions we use here are just wrappers over FMI functions. For example, fmi2_import_get_fmu_state() corresponds to the FMI 2.0 function fmi2GetFMUstate(), whose semantics are described in the FMI 2.0 spec.
Is the Dahlquist FMU intended for testing the FMU state API?
Exactly. And I'll be using it to test the serialisation API in my next PR.
Another question I have - say we want to save all state by default. Does this affect the current implementation - for example, do we need to start thinking about keeping a circular buffer of states for a configurable duration, for example? Is this to be done in execution::step, with saving & restoring state acting as a form of manipulator, or directly on each model instance within algorithm::do_step?
I'm not sure what the use case would be for saving all state by default, unless you mean for playback, and then I'll reiterate my statement that that's not what this feature is for. Saving the entire state in each time step would be enormously costly.
I haven't gotten to the point where I'm dealing with the full system and simulation yet, but here are my current ideas:
- algorithm implementations can use the simulator save/restore state API for restepping, e.g. for error estimation and step size control.
- algorithm implementations will have to support serialisation, which will consist of saving and serialising their own internal state, plus forwarding the results of the individual subsimulator save/serialise operations.
- execution will gain some functions which can be called to export and import serialised versions of the simulation state.
[^1]: This is the use case we have in OptiStress. There, we want to run a large number of simulations from the same starting point, e.g. in an optimisation loop. But for the sake of performance, we'd like to avoid repeating the "warm-up period" before the system reaches the steady state that we'll then perturb.
I don't think the FMI state saving/serialisation functions were designed for playback.
In fact, I don't think they can even conceivably be used for playback, because the internal state of each FMU instance is just exported as a binary blob, and in general you don't know anything about the format of its contents.
This is the first step towards closing #756. I've added functions corresponding to FMI 2.0's fmi2{Get,Set,Free}FMUstate() throughout the various layers of subsimulator interfaces and implementations:
- cosim::slave and its implementation in cosim::fmi::v2::slave_instance
- cosim::simulator and its implementation in cosim::slave_simulator
The API is very similar to the one defined by FMI, except that it represents saved states by numeric indices rather than opaque pointers.
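One way to picture "numeric indices rather than opaque pointers" is a small table that maps indices to the underlying handles. This is only a sketch of the idea, not the code in this PR. state_table and its member names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Maps small integer indices to opaque state handles, reusing freed slots.
class state_table
{
public:
    using state_index = int;

    // Stores a handle and returns the index that now refers to it.
    state_index store(void* handle)
    {
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            if (!slots_[i].in_use) {
                slots_[i] = slot{handle, true};
                return static_cast<state_index>(i);
            }
        }
        slots_.push_back(slot{handle, true});
        return static_cast<state_index>(slots_.size() - 1);
    }

    // Returns the handle for an index, or throws if the index is not in use.
    void* lookup(state_index index) const
    {
        if (index < 0 || static_cast<std::size_t>(index) >= slots_.size()
            || !slots_[index].in_use) {
            throw std::out_of_range("invalid state index");
        }
        return slots_[index].handle;
    }

    // Frees the slot and returns the handle so the caller can release it.
    void* release(state_index index)
    {
        void* handle = lookup(index);
        slots_[index] = slot{nullptr, false};
        return handle;
    }

private:
    struct slot
    {
        void* handle;
        bool in_use;
    };
    std::vector<slot> slots_;
};
```

An index can be range-checked and passed across interface layers without exposing the pointer type, which is harder to do safely with a raw void*.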
This led me to also remove the slave_state and state_guard stuff that was in slave_simulator.{hpp,cpp}. The overloading of the "state" terminology became confusing, and it seemed like it was a lot of code for very little gain. (It was supposed to be a check of correct API usage, but I can't remember it ever actually catching a bug.)
Note: This is all about saving states in memory, not about converting them to a process-independent format (serialisation, step 2 of #756) or saving to persistent storage (#757).
This PR also fixes #762.
[Edit: Issue #762 has now been independently fixed (in the exact same way) by PR #766.]