speth opened 7 months ago
@speth ... this certainly looks like something we should try to make more user friendly.

Regarding use cases: the first issue (ODE integration) could likely be handled by automatically calling `syncState` (or something equivalent) at the end? Whenever accessing `ReactorBase.thermo`, we already have a `restoreState` in place, so that should be covered?

For the second case, we could also do something like `reactor.set_TPY(y[...])` instead of `reactor.thermo.set_TPY(y[...])` and handle things internally. The exact terminology/API is obviously debatable.

At the same time, tackling what amounts to a copy constructor for `Solution` objects is not necessarily a bad option (it would likely be a benefit in other areas also). It's something that is currently missing from the C++ core (although something similar exists for the Python `pickle` case).
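The stale-state issue behind the first use case can be demonstrated with a small self-contained sketch (plain Python, no Cantera; all class names here are hypothetical stand-ins, not the actual API). A dummy "integrator" steps its internal time past the requested output time, the way CVODES can, leaving the shared state object at the internal time rather than the requested one:

```python
# Toy illustration of the stale-state trap. ToyState and ToyReactorNet are
# hypothetical stand-ins; the "solution" T(t) = 300 + 100*t is made up.
class ToyState:
    def __init__(self, T):
        self.T = T

class ToyReactorNet:
    def __init__(self, state):
        self.state = state      # shared object, mutated during stepping
        self.t_int = 0.0        # internal integrator time

    def advance(self, t_user):
        # Internal steps overshoot t_user; the shared state object is
        # left at t_int, not at the requested output time.
        while self.t_int < t_user:
            self.t_int += 0.3
            self.state.T = 300.0 + 100.0 * self.t_int

    def interpolated_T(self, t_user):
        # What a syncState-style getter would return: the solution
        # evaluated back at the requested time.
        return 300.0 + 100.0 * t_user

gas = ToyState(300.0)
net = ToyReactorNet(gas)
net.advance(1.0)

# Direct access to the shared object reflects the overshot internal time,
# not the requested output time t_user = 1.0.
assert net.t_int > 1.0
assert gas.T != net.interpolated_T(1.0)
```

An automatic `syncState`-style call at the end of stepping would amount to writing the interpolated state back into the shared object before returning to the user.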
> Regarding use cases: the first issue (ODE integration) could likely be handled by automatically calling `syncState` (or something equivalent) at the end?

At the end of what? At the end of `step()` or `advance()`? And if multiple reactors share the same `Solution`, which reactor should that `Solution`'s state correspond to?

> Whenever accessing `ReactorBase.thermo`, we already have a `restoreState` in place, so that should be covered?

This is the main problem: While the `thermo` property in Python does this synchronization, there's nothing that prevents the user from inadvertently accessing the `Solution` object directly without using the `thermo` property.
Don't read too much into my pseudocode structure. I wrote it with some of the functions effectively "inlined" in a way that I hoped would improve clarity, but perhaps didn't. I fully expect the manipulation of the `ThermoPhase` object to be encapsulated within a method of `ReactorBase`, which will probably still be `updateState`.
> This is the main problem: While the `thermo` property in Python does this synchronization, there's nothing that prevents the user from inadvertently accessing the `Solution` object directly without using the `thermo` property.
This is indeed the crux of the problem. From where I stand, it is poor practice for a user to expect that a shared object holds a specific state (or to use that implicit pathway to update reactor contents). It is unfortunate that the current design paradigm allows for this; it gets 'abused', and people then report 'unexpected behavior'. Overall, I don't think that this is a path we should encourage.
My concern is that disallowing shared objects in some cases will break existing code (for example, it is fairly common to use the same `Solution` object to initialize several reactors).

As we're already breaking things, we may as well make a clean break and shift to a different paradigm where `Solution` objects are copied and cannot be shared (i.e. `ReactorBase` calls a copy constructor for a new `Solution` object, rather than attaching the previously created shared object). I would prefer this path, as it would not break code where one object is used to initialize multiple reactors (a legitimate use case), while putting an end to the bad coding practice of expecting a magic link between `Reactor` and `Solution` objects that are not `ReactorBase.thermo`. This approach would then also apply to 1D objects as well as `SolutionArray` objects.
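As a rough illustration of this copy-on-attach idea (plain Python stand-ins, not the actual Cantera API), cloning at attachment time preserves the one-object-initializes-many-reactors use case while removing the aliasing:

```python
import copy

# Hypothetical stand-ins for Solution and Reactor, illustrating the
# difference between sharing and cloning on attach.
class ToySolution:
    def __init__(self, T, Y):
        self.T = T          # temperature
        self.Y = dict(Y)    # mass fractions

class ToyReactor:
    def __init__(self, solution, clone=True):
        # Cloning on attach gives each reactor its own independent state;
        # clone=False reproduces today's shared-object behavior.
        self.thermo = copy.deepcopy(solution) if clone else solution

gas = ToySolution(300.0, {"CH4": 1.0})

# One object initializes two reactors (the legitimate use case) ...
r1 = ToyReactor(gas)
r2 = ToyReactor(gas)

# ... but after cloning, changing one reactor's state cannot silently
# change the state seen by the other reactor, or by the original object.
r1.thermo.T = 1500.0
assert gas.T == 300.0
assert r2.thermo.T == 300.0
```

With `clone=False`, all three names would alias one object and the assertions would fail, which is exactly the implicit linkage being argued against.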
@ischoegl commented in https://github.com/Cantera/cantera/pull/1788#discussion_r1752775764:

> Could you expand? My thinking was a strict hierarchy where the `ReactorNet` is responsible for `Reactor` objects, which are themselves responsible for `Solution` objects. I am not sure that I see why we should give `ReactorNet` the ability to muddle with reactor contents?
The reason I suggested having `ReactorNet` manage cloning `Solution` objects is based on the concept that I initially suggested in this Enhancement: that we clone `Solution` objects only when there are multiple reactors sharing the same `Solution`. There are a couple of reasons for this:

- I expect that the majority of usage of the reactor network model does not make use of `Solution` objects shared between reactors, with many networks only consisting of a single reactor. In this case, cloning the `Solution` object introduces unnecessary overhead. While it's possible that overhead is small compared to actually integrating the system, I'd rather not have the `Solution` <-> `AnyMap` code become too performance critical.
- If we always clone the `Solution` objects, then we are requiring the `Solution` <-> `AnyMap` round trip to be fully implemented for every model that makes up a `Solution`, which I think is a substantial expansion of the set of functions that models must implement. While this is mostly (but not entirely) satisfied by the classes that are built into Cantera, I think it would be a burden to require this for all user-implemented models.

> The reason I suggested having `ReactorNet` manage cloning `Solution` objects is based on the concept that I initially suggested in this Enhancement: that we clone `Solution` objects only when there are multiple reactors sharing the same `Solution`. [...]
I am not sure that I agree: from my perspective, a lot of the 'traps' this issue is documenting arise from allowing 'sharing' at all. As I have stated in an earlier comment above, I think that the expectation that one can create a `Solution`, link it into a `Reactor`, and then build code where modifications of either the former directly, or the latter via the `thermo` property, 'just work' is fundamentally flawed. The need to use `syncState` etc. imho makes the overall implementation brittle by design.
> - I expect that the majority of usage of the reactor network model does not make use of `Solution` objects shared between reactors, with many networks only consisting of a single reactor. In this case, cloning the `Solution` object introduces unnecessary overhead. While it's possible that overhead is small compared to actually integrating the system, I'd rather not have the `Solution` <-> `AnyMap` code become too performance critical.
Most of the instances where a `Solution` object is recycled are for `Inlet`, `Outlet`, `Reservoir`, etc., where no `Kinetics` is required. I believe that by creating a partial `Solution` (without `Kinetics`), the overhead can be somewhat reduced. Overall, I strongly believe that a small added overhead is a price well worth paying for 'cleaner' and thus more robust code.
> - If we always clone the `Solution` objects, then we are requiring the `Solution` <-> `AnyMap` round trip to be fully implemented for every model that makes up a `Solution`, which I think is a substantial expansion of the set of functions that models must implement. While this is mostly (but not entirely) satisfied by the classes that are built into Cantera, I think it would be a burden to require this for all user-implemented models.
I think that we should consider a switch of paradigms here. I believe that we are 'oversharing' some of the objects we use. In the case of `Solution`, a `unique_ptr` may be preferable. Regarding copying objects, I think that, once we get rid of all the raw pointers, we should be able to start allowing for regular copy constructors (rather than the `AnyMap` serialization route). Regardless, I believe that having round-trip conversion for `Solution` implemented for all models isn't necessarily a bad goal to have.
Overall, there appear to be some fundamental issues that need further discussion. I'd suggest shelving Cantera/cantera#1788 for the time being. While this may prolong deprecation cycles, I don't want it to unnecessarily delay the release of 3.1.
As a friendly amendment to the evaluation logic posted on top, I am proposing the following pseudocode, which assumes the existence of universal `Connector` objects, per Cantera/cantera#1788:
```python
def eval(t, y, ydot):
    # ReactorNet::updateState
    for reactor in all_reactors:
        # Reactor::updateState
        reactor.thermo.set_TPY(y[...])
        # Calculation of reactor-local variables is optional, for efficiency only

    # New handling of reactor interactions
    boundary_transfers = np.zeros_like(ydot)
    for connector in all_connectors:
        # Update interactions across reactor boundaries
        connector.update(boundary_transfers[...])

    # ReactorNet::eval
    for reactor in all_reactors:
        # Evaluation can directly access thermo from this reactor only (!)
        ydot[...] = f(reactor.thermo, reactor.kinetics, reactor.local_vars)
    ydot += boundary_transfers
```
This approach would ensure that cyclic pointer references are avoided by design. The underlying `ReactorNet` has `shared_ptr` to all `Reactor` and `Connector` objects. `Connector` objects hold `shared_ptr` to their neighboring reactors as well as their associated thermo objects to enable efficient calculation of fluxes, but `Reactor` objects receive net transfer (fluxes times area) from `ReactorNet` (if they even need them at all). `Reactor` objects can still hold `weak_ptr` to connectors for non-critical queries (i.e. whatever is not used during time integration).
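This ownership pattern can be sketched in plain Python, with `weakref` standing in for `weak_ptr` (class names are illustrative stand-ins, not the Cantera API): the network owns reactors and connectors, connectors hold strong references to their neighboring reactors, and reactors keep only weak references back to the connectors.

```python
import weakref

class Reactor:
    def __init__(self, name):
        self.name = name
        self._connectors = []   # weak references only, for non-critical queries

    def add_connector(self, connector):
        self._connectors.append(weakref.ref(connector))

class Connector:
    def __init__(self, left, right):
        # Strong references to neighboring reactors (and, in the real
        # design, their thermo objects) for efficient flux calculation.
        self.left = left
        self.right = right
        left.add_connector(self)
        right.add_connector(self)

class ReactorNet:
    def __init__(self, reactors, connectors):
        self.reactors = reactors       # strong references
        self.connectors = connectors   # strong references

r1, r2 = Reactor("r1"), Reactor("r2")
wall = Connector(r1, r2)
net = ReactorNet([r1, r2], [wall])

# No reference cycle: once the network (and local names) drop the
# connector, the reactors' weak references simply go dead.
assert r1._connectors[0]() is wall
del net, wall
assert r1._connectors[0]() is None
```

The point of the sketch is the direction of the arrows: strong ownership flows net -> connectors -> reactors, so destroying the network tears everything down without cycle collection.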
Beyond that, I believe this would also be a workable approach to resolve #202. Considering #61, there is the interesting question of how to handle reactor surfaces ... while I had reclassified `ReactorSurface` as a node in Cantera/cantera#1788, there is an alternative where it could be recast as a `Connector` (which would share some attributes with an inert `Wall`) associated with a node. It becomes even more interesting for `ReactorEdge` as envisioned in #203 / Cantera/cantera#1697. PS: I believe that #202 would involve redistributing current `ReactorSurface` methods between the 'node' and a future 'connector' variant.
> Regarding copying objects, I think that, once we get rid of all the raw pointers, we should be able to start allowing for regular copy constructors (rather than the `AnyMap` serialization route). Regardless, I believe that having round-trip conversion for `Solution` implemented for all models isn't necessarily a bad goal to have.
Providing regular copy constructors would introduce a ton of additional boilerplate code, not to mention testing requirements. I'd really like to see how effective the `AnyMap` serialization route is first. I disagree that getting rid of all use of raw pointers should be a goal for Cantera, or in C++ in general. The important thing is to avoid manual management of object lifetimes using `new` and `delete` directly, which we already do. Beyond that, both `shared_ptr`/`unique_ptr` and stack allocation of objects provide good ways of managing object lifetimes.

I agree that we should provide working serialization/round-trip capabilities for all models that are provided as part of Cantera. What I'm concerned about is creating a technical requirement for that capability that applies not only to the built-in models but to all user-provided models as well (including those implemented using `ExtensibleReaction`). If a user has to implement full serialization/deserialization just to write the thermodynamic-model equivalent of "hello world", I think we're creating a pretty rough barrier to entry, and that's the opposite of my goal of making it easier for users to introduce their own models.
> Most of the instances where a `Solution` object is recycled are for `Inlet`, `Outlet`, `Reservoir`, etc., where no `Kinetics` is required. I believe that by creating a partial `Solution` (without `Kinetics`), the overhead can be somewhat reduced. Overall, I strongly believe that a small added overhead is a price well worth paying for 'cleaner' and thus more robust code.
It's true that you could avoid cloning `Kinetics` objects for `Reservoir` objects, but you'd still have to do so for the single `Reactor` that is present in many reactor network simulations. Again, I think once we have an `AnyMap`-based `clone` function, we can do a little bit of performance testing, and then we'll have an idea of what the overhead looks like for typical reactor network simulations.
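For illustration, a serialization-based `clone` amounts to the following round trip (a toy sketch in plain Python; `ToyPhase` and its methods are hypothetical stand-ins for `Solution` and the `AnyMap` conversion, not the actual API):

```python
class ToyPhase:
    def __init__(self, species, T):
        self.species = list(species)
        self.T = T

    def to_map(self):
        # Analogous to serializing a Solution into an AnyMap: the object's
        # complete state is written out as a plain mapping.
        return {"species": list(self.species), "T": self.T}

    @classmethod
    def from_map(cls, data):
        # Analogous to reconstructing a Solution from an AnyMap.
        return cls(data["species"], data["T"])

    def clone(self):
        # The round trip: every model making up the phase must support
        # both directions for this to work.
        return type(self).from_map(self.to_map())

orig = ToyPhase(["H2", "O2"], 300.0)
dup = orig.clone()
dup.T = 900.0
assert orig.T == 300.0   # the clone is fully independent
```

This also makes the cost model visible: each clone pays for a full serialize/deserialize pass, which is the overhead being weighed in the discussion above.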
> This approach would ensure that cyclic pointer references are avoided by design. The underlying `ReactorNet` has `shared_ptr` to all `Reactor` and `Connector` objects. `Connector` objects hold `shared_ptr` to their neighboring reactors as well as their associated thermo objects to enable efficient calculation of fluxes, but `Reactor` objects receive net transfer (fluxes times area) from `ReactorNet` (if they even need them at all). `Reactor` objects can still hold `weak_ptr` to connectors for non-critical queries (i.e. whatever is not used during time integration).
I like the idea of moving the calculation of `Connector` "transfers" out of the specific reactors. Among other things, this approach has the advantage of only evaluating each connector once, unlike the current approach where functions like `updateMassFlowRate` are called by both the upstream and downstream reactor each time the governing equations are evaluated. The one question that comes to mind for me is how to extend this approach to allow for calculation of the cross-reactor Jacobian terms, which was started in Cantera/cantera#1634 but not quite finished.
> Beyond that, I believe this would also be a workable approach to resolve #202. Considering #61, there is the interesting question of how to handle reactor surfaces ... while I had reclassified `ReactorSurface` as a node in Cantera/cantera#1788, there is an alternative where it could be recast as a `Connector` (which would share some attributes with an inert `Wall`) associated with a node. It becomes even more interesting for `ReactorEdge` as envisioned in #203 / Cantera/cantera#1697. PS: I believe that #202 would involve redistributing current `ReactorSurface` methods between the 'node' and a future 'connector' variant.
Yes, I can see how this approach to the connectors makes a `ReactorSurface` into a type of `Connector` sitting between two `Reactor`s (or a `Reactor` and a `Reservoir` representing an inert bulk material that the user may not actually have to define). This would require extending the system to consider the idea that a `Connector` can have governing equations that have to be solved as part of the system, but I think that's a generalization that we've needed for a while, to support, for example, things like walls with inertia. I'm not quite sure where that would leave `ReactorEdge`, though, as it is effectively a connector between three surfaces and three bulk phases where transfers among all of those phases are resolved simultaneously -- that doesn't seem to fit cleanly with the idea of a `Connector` only linking two `ReactorNode`s.
@speth ... thanks for your constructive comments. I believe that overall it should be relatively straightforward to work out a consensus on a path forward. I am totally :+1: on sticking with the `clone` approach. Regarding raw pointers, you are correct that we have already eliminated most owning raw pointers; I still like the idea of moving toward smart pointers where it makes sense. Stack-allocated memory is really the only place where there's no good way around it, although some of your tests from a couple of years ago indicated that there wasn't much of a speed penalty for the cases where it would really matter (see https://github.com/Cantera/cantera/issues/1211#issuecomment-1065991448).
## Abstract

For a `ThermoPhase` in use by a `Reactor`, the state of the `ThermoPhase` does not necessarily correspond to the state of the `Reactor` at the current time of the `ReactorNet`. This can lead to users extracting incorrect state information during integration (see this UG post, for example). In addition, the current requirement that a single `ThermoPhase` can be shared across multiple `Reactor`s increases the complexity of evaluating the `Reactor`'s governing equations.

## Motivation
After an integration step by `ReactorNet`, the state of the `ThermoPhase` object will generally be set to the latest time reached by the CVODES integrator, $t_{int}$. However, this time may be beyond the output time specified by the user (using the `advance(t_user)` method). If the user accesses the `ThermoPhase` object directly, they will get the state at $t_{int}$, in contrast to getting the state at $t_{user}$ by accessing it through the `reactor.thermo` property, in which case CVODES will provide a result interpolated back to the correct time. For simple usage, like plotting the state as a function of time, getting the state at a slightly different time won't noticeably affect results. However, for cases like trying to evaluate sensitivities using a finite difference approach, two reactor networks will not have reached the same value of $t_{int}$, introducing a large amount of noise. This is the phenomenon dubbed "Bjarne's Trap" by @rwest, and it has been a source of confusion for a long time.

Because the current structure allows multiple reactors to share a `ThermoPhase` object, there is a significant amount of complexity in the logic for evaluating the governing equations for all reactors in the network. The current evaluation logic amounts to the following (in Python pseudocode):
This complexity has resulted in difficulties for users who want to implement modified governing equations using the `ExtensibleReactor` framework. For example, see this problem posted on the UG, and this one too.

## Possible Solutions
I think the solution to both of these problems is to stop having multiple reactors share `ThermoPhase`/`Kinetics` objects, and to have `ReactorNet.step` and `ReactorNet.advance` automatically synchronize the state of the reactors after each call (by default).

Thanks to the implementation of `AnyMap`-based serialization, we can still allow users to use a single `Solution` object to set up multiple reactors. When initializing the `ReactorNet`, we can then check whether any `Solution` object is being used multiple times, and then clone it to create a distinct `Solution` for each `Reactor`.

Users will still access the reactor states through the `thermo` properties of the reactor objects. We will still automatically synchronize these after each step -- not rely on the magic call to `syncState` when the getter is used.

The automatic synchronization can be disabled by passing a `sync=False` option to `advance()` or `step()`.

The "simple" alternative is to force this on the user -- we just check and tell the user that they can't use the same `Solution` object multiple times -- but I think the automatic cloning is much more user friendly and shouldn't be that difficult to implement.

After this change, the evaluation logic would look more like:
I'd like to make sure that any changes to how the work of evaluating the reactor governing equations is divided up would provide relatively clean solutions to a few scenarios that have come up on the Users' Group. Namely:

- `ExtensibleReactor` override of `update_state`