zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0
11k stars 6.69k forks source link

RFC: State Machine Framework: Breaking behavior change to align closer to the UML specification for State Machines #71675

Closed glenn-andrews closed 6 months ago

glenn-andrews commented 7 months ago

Introduction

The State Machine Framework (SMF) is a unique implementation of a Hierarchical State Machine, where an industry standard exists. I propose we make it align more closely with the UML standard for State Machines to prevent bugs caused by users who are familiar with UML semantics not understanding the caveats required for SMF.

Problem description

SMF does not 'correctly' handle transitions called from a parent state, since the 'context' for the transition is always the child state that ran before the parents were called. See: https://github.com/zephyrproject-rtos/zephyr/issues/66341

State machines designed using UML notation may not work if directly implemented in SMF with the expectation that a parent state initiating a transition will call all the exit actions up to that state, and back down to the target state. The calling of exit and entry actions will be 'short circuited' if the child state and target state have a least common ancestor who is a descendant of the parent state initiating the transition.

On a mere personal note, I dislike the way SMF was designed to terminate by calling smf_set_state() with a new_state of NULL, which calls all exit actions up to the topmost parent (which presumably calls smf_set_terminate()) This feels both an abuse of smf_set_state() and a potential bug since if the topmost state's exit action does not call smf_set_terminate() the state machine will remain in the child state. The normal UML way is a specified final state you transition to explicitly.

Proposed change

  1. Unlike my proposal to add UML-style transitions to SMF (https://github.com/zephyrproject-rtos/zephyr/pull/70914), this proposal completely replaces the current transition mechanism with UML-style transitions.
  2. Change SMF_CREATE_STATE() to always accept the same number of parameters (i.e. adopt https://github.com/zephyrproject-rtos/zephyr/issues/71252) so both flat and hierarchical state machines can be used in the same project.
  3. Make transitioning to a NULL state illegal.

Detailed RFC

The basic change of 'classic' to 'UML-style' state transitions is determining the Least Common Ancestor (LCA) of the state ceiling smf_set_state() and the target of smf_set_state(). 'Classic' SMF will always calculate the LCA of the current state variable in the context, which will not be the same state, if a parent state's run action calls smf_set_state()

https://github.com/zephyrproject-rtos/zephyr/pull/70914 details an optional UML-mode for SMF. This change would replace smf_set_state() with the UML version, rather than allowing users to select UML or 'classic'.

SMF_INITIAL_TRANSITION would be removed, as UML state machines always support initial transitions, and if SMF_CREATE_STATE() always has the same number of parameters, the initial parameter will always be present and may as well be available.

Passing NULL as the new_state parameter of smf_set_state() shall log an error (or possibly just cause an access violation. TBD)

Proposed change (Detailed)

Please see https://github.com/zephyrproject-rtos/zephyr/pull/71729 for the proposed implementation of UML-style transitions by adding an executing state pointer, and using that to calculate the LCA instead of current.

Dependencies

USB-C and Hawkbit are the only Zephyr component that relies on SMF (so far) and would have to be tested to ensure changing the framework does not cause failures there.

Concerns and Unresolved Questions

I don't know how this will affect application developers. I doubt anyone is using all the advantages of HSMs because of the limited scope of the existing framework. Basic transitions from child states to other children still work identically, it's only when you start doing transitions from parent states do the deficiencies become clear.

Alternatives

The obvious alternative is making UML-style optional as in https://github.com/zephyrproject-rtos/zephyr/pull/70914

I don't think 'classic-style' offers enough advantages to developers to leave it around, and for people looking for traditional UML transition mechanics, it will come as a surprise.

glenn-andrews commented 7 months ago

@sambhurst @keith-zephyr

keith-zephyr commented 7 months ago

I'm in favor of aligning with the UML specification.

Besides the change to prohibiting calling smf_set_state() with NULL, what are other behaviors that will change? It's important that we clearly document a migration plan for any downstream users of the SMF.

glenn-andrews commented 7 months ago

Besides the change to prohibiting calling smf_set_state() with NULL, what are other behaviors that will change? It's important that we clearly document a migration plan for any downstream users of the SMF.

The other major change is that you can't call smf_set_state() in an entry action. This was a decision on my part to make the 'context' for a state transition clearly defined - the run action of either the current state or one of its direct parents. Transitioning in an exit action was already banned.

I can see a use case for being able to transition in an entry action: If you're trying to decide which of a number of sub-states is your transition destination, you could transition to the common parent state then have a choice in the entry action to transition to the child state.

I should be able to re-enable transitions in entry actions by setting the 'executing' state to the entry actions as they are called.

glenn-andrews commented 7 months ago

Updated in https://github.com/zephyrproject-rtos/zephyr/pull/71729

smf_set_state() can now be called from entry actions.

The only changes now are the operation of entry/exit actions to match UML transition semantics and getting rid of passing in NULL to smf_set_state() to execute all exit actions.

carlescufi commented 6 months ago

Arch WG:

Conclusion:

henrikbrixandersen commented 6 months ago

@glenn-andrews Can this be closed now?

glenn-andrews commented 6 months ago

@glenn-andrews Can this be closed now?

Yes, it can.

As an aside, SMF is agnostic to how the user runs the state machine or passes events to be processed. A separate framework must be created to handle this. Good state machine design often needs add-on features like posting reminder events to yourself, or deferring handling of an event until you are out of a given state, all of which is provided by the framework.

Would there be any appetite in a standardized framework separate from SMF that people can use or can ignore and write their own as was done for USB-PD?

glenn-andrews commented 6 months ago

Completed with merging of https://github.com/zephyrproject-rtos/zephyr/issues/71252