Automatic tracking of `ReactorNet` state

Cantera / enhancements

Repository for proposed and ongoing enhancements to Cantera

Automatic tracking of `ReactorNet` state #206

Open corykinney opened 3 months ago

corykinney commented 3 months ago

Abstract

The proposed enhancement is additional functionality for ReactorNet objects to automatically construct and update SolutionArrays for each Reactor object in the network and to provide a convenient interface for accessing the state of the network at snapshots in time and the state of each reactor for the duration of the simulation.

Motivation

Currently, users wishing to track the state of Reactor objects must construct a SolutionArray object and manually append the reactor's thermodynamic state as desired. This is not too burdensome with a small number of reactor objects and when calling step manually; however, this makes it difficult to track network state at each time step when using other routines such as advance_to_steady_state that control the timestepping.

Possible Solutions

Now that SolutionArray is implemented in C++, it is possible to have a ReactorNet object that automatically constructs SolutionArray objects for each Reactor object. Here are some considerations about how this could be implemented and what behavior might be desired:

This functionality could be directly added to the ReactorNet class - an optional flag could be used to enable this functionality with it disabled by default. Are there any reasons why this would need to be separate?
Time data can be saved automatically as the extra variable t in each SolutionArray object - are there any use cases that would necessitate support for additional extra variables?
The underlying SolutionArray objects could be made accessible through the ReactorNet by index number - would it be worth the additional complexity to support user-assigned names for reactors to make access more intuitive?
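
The considerations above can be sketched in plain Python, without Cantera. Everything in this sketch is hypothetical: `ToyReactor`, `ToyNet`, and `RecordingNet` are stand-ins invented for illustration and are not Cantera classes. It only shows the three ideas listed: recording that is disabled by default, time stored as the first column, and access by index or by name.

```python
class ToyReactor:
    """Hypothetical stand-in for a reactor: a name and a temperature."""
    def __init__(self, name, T):
        self.name = name
        self.T = T


class ToyNet:
    """Hypothetical stand-in for a network that owns the time stepping."""
    def __init__(self, reactors, dt=0.5):
        self.reactors = list(reactors)
        self.time = 0.0
        self.dt = dt

    def step(self):
        # Toy dynamics: each reactor relaxes halfway toward 300 K per step.
        for r in self.reactors:
            r.T += 0.5 * (300.0 - r.T)
        self.time += self.dt
        return self.time


class RecordingNet(ToyNet):
    """Stores one row per reactor after every step when record=True."""
    def __init__(self, reactors, dt=0.5, record=False):
        super().__init__(reactors, dt)
        self.record = record  # opt-in flag, disabled by default
        self._history = {r.name: [] for r in self.reactors}

    def step(self):
        t = super().step()
        if self.record:
            for r in self.reactors:
                # Time is the first column of every stored row.
                self._history[r.name].append({"t": t, "T": r.T})
        return t

    def advance(self, t_end):
        # Routines that control the time stepping go through step(),
        # so every internal step is recorded as well.
        while self.time < t_end:
            self.step()

    def history(self, key):
        """Return a reactor's rows, looked up by index (int) or name (str)."""
        if isinstance(key, int):
            key = self.reactors[key].name
        return self._history[key]


net = RecordingNet([ToyReactor("hot", 500.0), ToyReactor("cold", 100.0)], record=True)
net.advance(1.0)
print(net.history("hot"))  # rows for the reactor named "hot"
print(net.history(1))      # rows for the second reactor, i.e. "cold"
```

Because recording hooks into `step()`, the caller never has to append states by hand, which is the situation that makes routines like advance_to_steady_state awkward today.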

References

Relevant Users' Group topic

corykinney commented 3 months ago

@speth Here is a draft of the feature discussed. If it's decided that this is a worthwhile contribution, I'd be happy to work on this once the best approach is agreed upon.

speth commented 3 months ago

> Time data can be saved automatically as the extra variable t in each SolutionArray object - are there any use cases that would necessitate support for additional extra variables?

The most important additional variables that come to mind would be the mass and/or volume of each reactor's contents. You could also consider whether there's a reasonable way to store quantities related to mass flow controllers / valves / walls etc., although those aren't always uniquely associated with a particular reactor.

ischoegl commented 3 months ago

Hi @corykinney ... great to see this popping up! A functionality like this was one of the motivations for porting SolutionArray to C++ in the first place, but I never got around to adding this although there's not that much work left (at least, compared to the port itself). Your draft overall looks good:

> This functionality could be directly added to the ReactorNet class - an optional flag could be used to enable this functionality with it disabled by default. Are there any reasons why this would need to be separate?

There aren't any reasons I can think of. You probably need a dedicated method to pass settings, but that's about it.

> Time data can be saved automatically as the extra variable t in each SolutionArray object - are there any use cases that would necessitate support for additional extra variables?

Time data should be the first column (similar to position being the first column in the oneD version). Beyond that, there are use cases for additional variables - have a look here https://github.com/Cantera/cantera/blob/b2c0af526fdbb7f99de1c53f55769681599145cd/samples/python/reactors/ic_engine.py#L159-L163 as well as many of the other examples (fwiw, each of those examples should use the newly implemented methods). As some of the extra variables may require side calculations, you may have to look at callbacks from Python - these can be handled in a similar fashion to what's done for user-defined wall velocities etc., where some of @speth's comments can be explored. Beyond that, I would suggest leaving it up to the user to select what needs to be stored, outside of settings that aren't ambiguous. SolutionArrays are mostly collections of data, although you can add metadata (time steps, tolerances, etc.) for automated documentation purposes - see what is done for oneD... the nice thing is that you can tap into existing infrastructure for saving to HDF, YAML, etc.

> The underlying SolutionArray objects could be made accessible through the ReactorNet by index number - would it be worth the additional complexity to support user-assigned names for reactors to make access more intuitive?

I believe user-defined names should remain the default. I implemented something similar for oneD, which you should be able to use as a template.

https://github.com/Cantera/cantera/blob/b2c0af526fdbb7f99de1c53f55769681599145cd/include/cantera/oneD/Sim1D.h#L116-L144

corykinney commented 3 months ago

> The most important additional variables that come to mind would be the mass and/or volume of each reactor's contents. You could also consider whether there's a reasonable way to store quantities related to mass flow controllers / valves / walls etc., although those aren't always uniquely associated with a particular reactor.

I can't think of a deterministic approach for where and what to store for MFC/valve/wall quantities, but if Python callbacks are implemented in a streamlined way, as @ischoegl suggested, perhaps we can leave it up to the user to decide whether to record any of that data and which reactor's state to attach it to.

The user could define a single callback function that returns a dictionary of column names and the value for that timestep. For the IC engine example it could look like:

...

def cylinder_state():
    # Piston work rate, following the ic_engine.py sample
    dWv_dt = - (cyl.thermo.P - ambient_air.thermo.P) * A_piston * piston_speed(sim.time)
    return {
        "mdot_in": inlet_valve.mass_flow_rate,
        "mdot_out": outlet_valve.mass_flow_rate,
        "dWv_dt": dWv_dt,
    }

which could be set as the callback for the corresponding reactor's state. It seems like a streamlined way to include desired MFC/valve/wall quantities as well as user-calculated properties.
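
A minimal sketch of how such a callback could be consumed, written in plain Python so it runs without Cantera. `StateRecorder` and its methods are hypothetical names and not part of any existing API; the dictionary `valve` merely stands in for a valve object. The sketch only shows the returned dictionary being merged into the row stored for each time step.

```python
class StateRecorder:
    """Hypothetical recorder: one row per time step for a single reactor."""
    def __init__(self, callback=None):
        self.callback = callback  # optional user function returning a dict
        self.rows = []

    def record(self, t, **state):
        row = {"t": t, **state}  # time first, then the built-in state
        if self.callback is not None:
            extra = self.callback()
            clash = set(extra) & set(row)
            if clash:
                # Refuse to silently overwrite built-in columns.
                raise ValueError(f"callback columns clash with built-ins: {sorted(clash)}")
            row.update(extra)
        self.rows.append(row)

    def column(self, name):
        """Return the stored values of one column, in time order."""
        return [row[name] for row in self.rows]


valve = {"mdot": 0.0}  # stand-in for a valve with a mass flow rate

def cylinder_state():
    # Evaluated once per recorded step, so it sees the current valve state.
    return {"mdot_in": valve["mdot"]}

rec = StateRecorder(cylinder_state)
for step in range(3):
    valve["mdot"] = 0.1 * step
    rec.record(t=step * 1e-3, T=300.0 + step)

print(rec.column("mdot_in"))
```

Rejecting name clashes is one possible design choice; the alternative of letting user columns override built-in ones would make stored data harder to trust.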

ischoegl commented 3 months ago

@corykinney ... I believe this is mostly workable, with a tweak.

> The user could define a single callback function that returns a dictionary of column names and the value for that timestep.

At least for some of the simpler solutions, you will face the limitation that callbacks return scalars (based on the C++/Python Func1 implementations), i.e. an API that is similar to https://github.com/Cantera/cantera/blob/7a69a6e3c6c22e0e5399296c93136bb34d6c1452/include/cantera/zeroD/Wall.h#L125-L130 and https://github.com/Cantera/cantera/blob/7a69a6e3c6c22e0e5399296c93136bb34d6c1452/interfaces/cython/cantera/reactor.pyx#L1159-L1178 In other words, the simplest way I can think of would involve dictionaries of Func1 objects that themselves return scalars evaluated at each time step. (As an aside, the implementation of the Python Func1 interface is clever enough that most people won't realize that it's even there.)
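
To illustrate the "dictionary of scalar functions" idea without relying on Cantera, the sketch below uses plain Python callables of time in place of Func1 objects. The names `extra_columns` and `evaluate_extras` are hypothetical, and the two example functions are made up for illustration.

```python
import math


def evaluate_extras(extra_columns, times):
    """Evaluate each scalar function at each time; return one list per column."""
    table = {"t": list(times)}  # time is the first column
    for name, func in extra_columns.items():
        # float() enforces the restriction that each callback returns a scalar.
        table[name] = [float(func(t)) for t in table["t"]]
    return table


# One scalar function of time per extra column.
extra_columns = {
    "piston_speed": lambda t: math.sin(2.0 * math.pi * t),
    "valve_open": lambda t: 1.0 if t < 0.5 else 0.0,
}

table = evaluate_extras(extra_columns, [0.0, 0.25, 0.75])
print(table["valve_open"])
```

Compared with a single callback returning a dictionary, each column here has its own function, so the set of column names is known before the first time step is taken.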

speth commented 3 months ago

> I can't think of a deterministic approach for where and what to store for MFC/valve/wall quantities, but if Python callbacks are implemented in a streamlined way, as @ischoegl suggested, perhaps we can leave it up to the user to decide whether to record any of that data and which reactor's state to attach it to.

I think there's a tradeoff between simplicity and flexibility here. If you need full flexibility, there's always the existing approach of collecting the relevant data as part of the integration loop and adding it to the SolutionArray each step. The other end of the spectrum would be just setting a few boolean flags for what quantities to store.

As far as where to store the properties, I think you can make a pretty simple choice and associate the wall and flow device variables with the first/left reactor specified when creating the connector. All you need to store for a flow device is the mass flow rate. For walls, I think it's just the velocity and heat transfer rate.
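
The "first/left reactor" rule can be sketched as follows, in plain Python. `ToyFlowDevice`, `ToyWall`, `connector_columns`, and the `name.quantity` column naming are all assumptions made for illustration; none of them are Cantera APIs, and the reactor names and values are invented.

```python
from dataclasses import dataclass


@dataclass
class ToyFlowDevice:
    name: str
    upstream: str        # first reactor given when the device was created
    downstream: str
    mass_flow_rate: float


@dataclass
class ToyWall:
    name: str
    left: str            # first reactor given when the wall was created
    right: str
    velocity: float
    heat_rate: float


def connector_columns(flow_devices, walls):
    """Group connector quantities under the first/left reactor's name."""
    columns = {}
    for dev in flow_devices:
        # A flow device only contributes its mass flow rate.
        columns.setdefault(dev.upstream, {})[f"{dev.name}.mdot"] = dev.mass_flow_rate
    for wall in walls:
        # A wall contributes its velocity and heat transfer rate.
        owner = columns.setdefault(wall.left, {})
        owner[f"{wall.name}.velocity"] = wall.velocity
        owner[f"{wall.name}.heat_rate"] = wall.heat_rate
    return columns


cols = connector_columns(
    [ToyFlowDevice("inlet_valve", "inlet", "cyl", 0.02)],
    [ToyWall("piston", "ambient", "cyl", -3.5, 120.0)],
)
print(cols)
```

In this toy setup nothing ends up under "cyl", even though both connectors touch it, which is a consequence of the rule that users would need to be aware of when looking for a quantity.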

If you do want to use a callback to populate user-specified columns, the Delegator class, used to implement ExtensibleReactor and ExtensibleRate, provides a more general approach to defining callbacks with different function signatures than the Func1 class provides.

ischoegl commented 3 months ago

> I think there's a tradeoff between simplicity and flexibility here. If you need full flexibility, there's always the existing approach of collecting the relevant data as part of the integration loop and adding it to the SolutionArray each step. The other end of the spectrum would be just setting a few boolean flags for what quantities to store.

@speth ... agreed on the tradeoff - I believe it makes sense to explore the middle ground. The status quo is clunky, whereas Delegators are quite complex, which is why I suggested Func1 - after all, each column is populated by scalars (I don't see many applications where strings or integers are required) and they are quite unobtrusive (i.e. callables can be converted 'under the hood', which ensures that writing of extra columns remains accessible to inexperienced Python users). Boolean flags may work for some properties but, as the IC engine example demonstrates, are insufficient for a generic API.

As an aside, doing a 'restart' from a collection of SolutionArrays as was done for oneD is quite a bit less complex for zeroD, as it only involves a single time step. Perhaps the answer to what needs to be stored should be based on the essential data needed to restore an existing ReactorNet to a given state, e.g. TPY plus volume? Wall and flow device values are an edge case where I'd be in favor of retaining values for informational purposes. Whatever can be calculated beyond that should remain optional and thus user-defined?
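
As a rough illustration of the "essential data" idea, the sketch below round-trips a single time step of TPY plus volume through plain data. `ToyReactorState`, `save_network`, and `restore_network` are hypothetical names, not Cantera APIs, and the numbers are invented.

```python
from dataclasses import dataclass, asdict


@dataclass
class ToyReactorState:
    """Hypothetical minimal restart data for one reactor."""
    T: float        # temperature [K]
    P: float        # pressure [Pa]
    Y: dict         # mass fractions keyed by species name
    volume: float   # reactor volume [m^3]


def save_network(time, reactors):
    """Flatten one time step into plain dicts that a YAML or HDF writer could take."""
    return {"t": time, "reactors": {name: asdict(s) for name, s in reactors.items()}}


def restore_network(data):
    """Rebuild the time and the per-reactor states from saved data."""
    reactors = {name: ToyReactorState(**row) for name, row in data["reactors"].items()}
    return data["t"], reactors


original = {
    "cyl": ToyReactorState(T=900.0, P=2.0e6, Y={"N2": 0.77, "O2": 0.23}, volume=5.0e-4)
}
t, restored = restore_network(save_network(0.01, original))
print(t, restored == original)
```

Wall and flow device values are deliberately absent here: they are not needed to restore the state and would be stored only for information.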

(fwiw, structural information is beyond the scope of SolutionArrays, as they are not meant for that purpose; there are better approaches if this is really needed, e.g. Cantera/cantera#694 or #180/Cantera/Cantera#1624?)

ischoegl commented 2 months ago

Fwiw, Cantera/cantera#1765 implements auto-generated unique names for unnamed zeroD objects at the C++ level that are reproducible. I believe this change should help with defining a suitable SolutionArray storage hierarchy.

PS: one roadblock to this enhancement is that ReactorSurface currently doesn't own a Solution (Interface) object that would be required for serialization.