Initial Mode should be defined per Base Component, with the Initial Expression being global to a configuration

verivital / hyst

HyST: A Source Transformation and Translation Tool for Hybrid Automaton Models

http://verivital.com/hyst/

Other

15 stars 18 forks source link

Initial Mode should be defined per Base Component, with the Initial Expression being global to a configuration #10

Open stanleybak opened 8 years ago

stanleybak commented 8 years ago

The way initial modes are stored now in the internal representation is there's a Map <String, Expression> in the Configuration object, where keys are the mode names. This is a problem because it makes it hard to check if a particular mode of a sub-automaton is in the initial set, since the name will be a concatenation of the names of all the initial modes in the automaton. A second (smaller) problem is that when initial modes are NOT defined for automata, the semantics of SpaceEx are that every mode is initial, which can lead to exponential blow-up in the number of initial states if we enumerate them like this.

In SpaceEx input, you give initial values for the states as an expression, which contains loc(instance_name)=mode_name subexpressions. I propose we change the internal representation to reflect this. Therefore, the initial modes would be split, part of it being in the Configuration object (the Expression), and part of it being in each BaseComponent (the modes which are initial). Additionally, BaseComponents which have no loc assignments in the initial expression mean that EVERY mode in the automaton is a valid initial mode. Some printers do not support this (Flow*), so I propose we add a precondition conversion from these models to ones with a single initial mode an an urgent transition to every mode in the automaton.

This change to the IR is likely to involve updating every printer and many passes, but I'm confident we have enough tests to make sure we do it correctly. I would need your support @ttj to make sure the Stateflow printers are updated to reflect the new intermediate representation. What do you think?

ttj commented 8 years ago

Some comments, I generally agree it's a little annoying the way it currently is set up.

1) I think I may prefer locations have a flag to be initial or not I think, which is e.g. set to true if they're initial. I could see some minor issues doing this for instantiations of components vs. the templates, but probably not too big of an issue.

2) I see value in keeping things around that contain both locations and expressions, at least from a parsing standpoint (e.g., probably conversion methods from the < String,Expression > to properly set all the modes would be good). I would probably prefer something of the form < Location,Expression >, I think generally using objects for most of these things is idea (likewise for places where we index various sets/maps with variable names instead of objects).

3) I'm generally opposed to doing anything that requires some notion of urgent transitions, as I think this will make things confusing for other users (and e.g. I don't know how to use these myself currently or how/if they're represented in the IR). Does Flow* not allow us to set more than a single initial location, is that the issue? If so, I understand the suggestion, but I'd prefer to not call this "urgent transition" :-)

stanleybak commented 8 years ago

The disadvantage of having it be a flag like in (1) is that say I modify the automaton and create some new modes. Should these new modes be initial? If there was no initial state defined for this automaton in the original initial spaceex expression, then yes, every mode should be initial. If there was an initial state defined than no. With a boolean flag in each AutomatonMode, I'd need to loop through all the modes and see if more than one has its initial flag set. What if there is only one mode? Then the two cases are indistinguishable.

I think your (2) was cut off somehow, not sure what you're suggesting.

As for (3), the way Preconditions work now is that unless your pass or printer explicitly marks certain types of hybrid automata as supported, they get either converted or rejected. In this way you can assume all models you try to print won't have urgent transitions; if a model tries to be converted which has them, a PreconditionsException gets thrown. If you want to add support for them, then you lookup how it's done in the IR, write the code, and finally in your printer / pass you can set the preconditions object to skip the urgent check. The same thing with other things which may not be supported, like explicit nondeterministic dynamics, or network automata, although for these cases if you don't have explicit support a conversion automatically happens (introducing new variables in the former case, doing flattening in the latter).

Yes Flow* only supports a single initial mode. Actually it doesn't support urgent transitions either, but you fake it with an \epsilon transition.

ttj commented 8 years ago

Can we do both? Have the expressions, but then have a function flag isInitial for locations, which returns whether this particular mode is set as initial in the initial expression? I think this will remove a lot of duplicated code, as the most common use case is to check if a particular mode is initial or not.

Gotcha regarding Flow*, so you'd have a new (fake) location that is initial with transitions without any guards/resets to each mode, yes?

stanleybak commented 8 years ago

for Flow*, yes.

"a function flag isInitial for locations" Not sure what you mean. I think we'd have the expression (in the configuration object), and a function BaseComonentInstance.isInitial() to return true or false if its in the initial mode for that base compoent (rather than a boolean variable you directly read). Would this work?

ttj commented 8 years ago

So, are we talking about modes or automata/instances here? I think a mixture.

I think each mode in an instance should have something like:

bool isInitial(InitialExpression e) { if this.mode is set in e: return true; }

So, if we have an an instance of an automaton, it would return true if that mode is in the initial expression. The templates of the automata don't need the information, but I think it should be available for each instance. Perhaps we need to have two classes for modes: one for the template automata and one for modes in instances, since it shouldn't be defined in the template automaton's modes?

Example:

Expression: loc(instance_name)=mode_name

isInitial would return true for instance_name if called from the mode mode_name

stanleybak commented 8 years ago

A BaseComponent doesn't (immediately) know its full instance name, due to the possibility of nested network components. For example, say I have three components network_top, network_middle, and base_bottom, organized hierarchically. network_top is the top level, so it doesn't have an instance name. network_middle's instance name is network_middle_i, and base_bottom's instance name is base_bottom_i.

In this case, in the initial loc expression you might have loc(network_middle_i.base_bottom_i)=mode_one. When you call your proposed isInitial function it wouldn't see the automaton's instance name (which is base_bottom_i) as matching network_middle_i.base_bottom_i. If you just take the last element and check for instance name matches it might not be sound, for example if there are multiple network_middle components instantiated with different initial modes in base_bottom. You could work your way up the hierarchy, at each network component concatenating its instance name until you get to the root.

I would propose having a private variable in the BaseComponent String initialMode; which is set to the name of the initial mode, or null if all the modes are initial. Then the method you'd in an AutomatonMode is just

bool isInitial()

or in the BaseComponent:

bool isInitial(String modeName) void setInitial(String modeName)

Also, you seem to imply that you want to keep around the original initial state Expression, where I'm proposing splitting it up into the variable expression part and some internal representation of the initial modes. Is there any reason you want to keep the full intial expression?

ttj commented 8 years ago

Gotcha, that sounds good. I didn't know SpaceEx could support such hierarchical models, although yes, I meant to do this transitively a la the transitive closure of checking if a mode is initial or not. We should highlight this in our next paper, as in practice supporting hierarchical hybrid automata is a contribution (I think way back in the early 2000s Charon did this).

One concern from the SpaceEx side is that it does support having multiple initial modes, which from your comment ("the initial mode") you maybe aren't considering, which can be set as:

loc(automaton) = a | loc(automaton) = b

And more generally with a disjunct of conjuncts to specify different initial states, e.g.:

(loc(automaton) = a & x1 >= 0 & x1 <= 5) | (loc(automaton) = b ... )

I think keeping around the expressions in general may be useful, or at least somehow keeping around the set of initial conditions. Actually, I think this is the way to go: we don't necessarily want Hyst to use exactly the same SpaceEx semantics internally (e.g., I find that if no initial modes are set, then use all a little weird, and would prefer we enumerate all the initial modes).

I just tested to make sure this works and did so successfully with the buck converter:

(loc(buckboost_template_1)==charging & loc(controller_1)==charging_controller & il==0 & vc==0 & t==0 & Vs==24 & tmax==0.0075 & mode_out == 2 & VcH == 12.1 & VcL == 11.9) | (loc(buckboost_template_1)==discharging & il==0 & vc==12 & t==0 & Vs==24 & tmax==0.0075 & mode_out == 1 & VcH == 12.1 & VcL == 11.9)

output-15898817-0

ttj commented 8 years ago

And actually, SpaceEx just supports arbitrary (convex) expressions it seems, i.e., the following is a valid initial condition:

(loc(buckboost_template_1)==charging & loc(controller_1)==charging_controller & il==0 & vc==0 & t==0 & Vs==24 & tmax==0.0075 & mode_out == 2 & VcH == 12.1 & VcL == 11.9) | (loc(buckboost_template_1)==discharging & il==0 & (vc==12 | vc == 12.05 | vc == 11.95) & t==0 & Vs==24 & tmax==0.0075 & mode_out == 1 & VcH == 12.1 & VcL == 11.9)

output-385125937-2

As such, what if we had complex combinations of initial conditions specified in the expression over the location? Should we consider having a canonical representation of all this (e.g., force into DNF first?)? This is the concern with keeping the expressions around. For example, consider the following:

(loc(buckboost_template_1)==charging & loc(controller_1)==charging_controller & il==0 & vc==0 & t==0 & Vs==24 & tmax==0.0075 & mode_out == 2 & VcH == 12.1 & VcL == 11.9) | (((loc(buckboost_template_1)==discharging | loc(buckboost_template_1)==charging) & il==0 & (vc==12 | vc == 12.05 | vc == 11.95 | vc == 5) & t==0 & Vs==24 & tmax==0.0075 & (mode_out == 2 | mode_out == 1) & VcH == 12.1 & VcL == 11.9))

output-1938192333-0

stanleybak commented 8 years ago

I think I had code that could work with 'or' expressions as part of the initial condition. One problem that comes up is that we'd have to change the way constants are handled. Now there's a set of constants for each component. By defining initial states with 'or' expressions, you can define different values for constants, so storing them per component wouldn't work. You almost would need to create variables for each constant, with zero dynamics, which would work but would get rid of the chance to do automaton simplification and probably hurt scalability.

In my opinion, the cleanest formalism is having a single initial mode and single forbidden mode, and then using resets and guards to create the initial states and transition to the forbidden mode. But that might be too drastic, so I think currently we just reject initial expressions that have an 'or' operation.

ttj commented 8 years ago

Hmmm, so, while there is perhaps not a ton of value adding this feature to support disjuncts (although it would help to automate analysis of systems that don't have immediate support for this in other tools), it's perhaps not good that we don't support it while SpaceEx does.

I think SpaceEx represents expressions as symbolic regions, so it's probably a pair: (Location, Reals), where Reals is a convex constraint defining a subset of the reals. In practice, this can be done with a DNF conversion, although there are some practical issues that can arise (e.g., scalability).

I prefer the DNF expression route myself. I can see some value supporting single modes, but this gets very restrictive quickly compared to using expressions/symbolic regions. If not DNF, then some canonical representation of expressions, where we can check if a given mode is initial by looking into the expression (and possibly pulling out the relevant state variables initial states and constant values).

I understand the constants issue you've brought up, but I think the initial states can reasonably be represented be a set of:

Set where SymbolicState is a pair with a location and a convex constraint

If done like this, then there would be a set of constants for each instance, as well as a set of initial states (these pairs) for each instance. This is perhaps something you'd prefer to support via hypy (since for some tools this may require printing many times or handling via the initial condition hack you've mentioned), but maybe we can do something similar for constants like this (e.g., one model generates a set of models, or, if the semantics support it and it's reasonably performing, have an initial mode where we set the constant values on outgoing transitions to each mode in the set of initial modes?).

If done in DNF, it's fairly clean: (loc = a /\ x = 1 /\ y = 2 /\ ...) \/ (loc = b /\ x = 2 /\ ...) etc.

Obviously the example I put above is more complex and has nested disjuncts, but it seems SpaceEx works fine on it.

Maybe we should ask Goran how things are done in SpaceEx to get some inspiration? I presume it's symbolic states basically like I mentioned.

We don't have to restrict to convex if we really want (e.g., could allow disjuncts in expressions), but I think representing initial states as sets makes the most sense and is closest to the mathematical semantics typically used (sets of initial states).

stanleybak commented 8 years ago

Hmm, I think the idea we've settled on is something like each Component, whether it's a BaseComponent or NetworkComponent, will maintain a part of the initial states, a set of symbolic constraints. Since network components don't have any modes, their initial state would be limited to assignments to constants. BaseComponents could include both assignments to constants, as well as defining an arbitrary set of initial modes. So the data structures are something like:

// inside the Component class
Collection <InitialState> init; // disjunction of initial states

where InitialState has:

// inside the definition of the InitialState class
Expression constraint; // (convex) constraints on variables
Map <String, Interval> constants; // maps names -> assignments for constants
Set <String> initialModes; // set of mode names which are initial (base component only)

The only difficulty with this type of representation is that it's hard to get a global statement of the initial states without iterating over the hierarchy in Configuration.root. For example, we can convert from a SpaceEx initial assignemnt expression to the IR, but then if we want to convert back it's non-trivial. Nonetheless, we could write a helper method in Configuration which would return an Expression which does this. Then, we wouldn't need to keep around the original initial assignment expression. If you really want to keep around the original expression, I would propose adding a source variable to the Configuration object which would store the original SpaceExDocument, or whatever original object was used to create the IR (if we add imports from other formats, for example). Until there's an application that needs this, though, I would hold off (do you have an application in mind?). Is this plan acceptable?

stanleybak commented 8 years ago

Thinking more about this, I like the idea of having a standard form:

In my opinion, the cleanest formalism is having a single initial mode and single forbidden mode, and then using resets and guards to create the initial states and transition to the forbidden mode.

The initial mode would be urgent, and the outgoing transitions would have resets to define the initial states in each mode. The forbidden states could have multiple incoming transitions for each of the error conditions. The error mode's invariant would be true, and its expression would be 'true', so that whenever it's entered an error is detected.

For a hierarchical model, these would be defined for every base component. The importer could do the conversion, and it would be up the printers to convert if back, if desired (I don't think it's necessary since this transformation preserves reachability).

The only problem might be that not all tools support urgent (initial) modes, so conversions back might be required. We could generalize this conversion back using a utility function so that the printer modifications would be minimal. The easiest conversion back for the initial states would be to define every state with a transition from the urgent initial states as initial, and use the transition's guard as the expression. This could be optimized further by first checking for satisfiability between the transition's guard and destination's invariant.

With this, we could get rid of initial and forbidden from the intermediate representation, which would be cleaner. It also makes writing certain transformation passes easier, where if you want to apply an action to all incoming transitions of a mode, currently you need to have extra checking to see if you're in an initial state.

ttj commented 8 years ago

Hmmm...

Yes, I'm not sure. I somewhat dislike the urgent things as it makes the models messy. I wouldn't mind this as a pass that can create this or whatever. (Similar to having a pass that would add a timer automaton and flatten for doing time-bounded reachability more easily and cleaner than having timer variables in models...)

Just to remind me, are all the issues due to networks, and in particular, having constants versus initial conditions defined in different ways for different subcomponents?

stanleybak commented 8 years ago

that was the initial issue yes. Although there are other issues (having to consider mods that are initial in special ways for transformation passes) that would be solved by using the standard form. We could do the conversion back before the printers see it, so it would strictly be part of the intermediate representation (so printers wouldn't need to handle urgent modes).

ttj commented 8 years ago

As long as the external parts don't have to see this, I think it's fine, especially if it really helps out internally to make things simpler. I would also comment it's best to have not just the printers not see it, but have some "canonical" representation for API access, etc., or a function call to get it to a format where this doesn't matter (e.g., for benchmark generators, the Stateflow converter, etc.).

stanleybak commented 8 years ago

What I was function is there would be a utility function Map <String, Expression> getPerModeInitialStates(Configuration c) or something similar that would convert from the internal representation to the current representation for initial or forbidden states (map of mode name to expression). This could be used by any printer or benchmark generator as needed.