FlexBE / flexbe_app

The classic user interface (editor + runtime control) for the FlexBE behavior engine. See the flexbe_webui for latest
BSD 3-Clause "New" or "Revised" License

ConcurrencyContainer not working on branch "ros2-devel"? #88

Open tiko5000 opened 1 year ago

tiko5000 commented 1 year ago

I am trying to implement a simple state machine with a single concurrency container, but it fails to execute:

 Start
  |
  \/
 Container
  |
  ----------------------------
  |                          |
  \/                         \/
  LOG_A                      LOG_B
  |                          |
  \/                         \/
  finished                   failed

This is the statemachine implementation:

concurrency_test_sm.py:


#!/usr/bin/env python
# -*- coding: utf-8 -*-
###########################################################
#               WARNING: Generated code!                  #
#              **************************                 #
# Manual changes may get lost if file is generated again. #
# Only code inside the [MANUAL] tags will be kept.        #
###########################################################

from flexbe_core import Behavior, Autonomy, OperatableStateMachine, ConcurrencyContainer, PriorityContainer, Logger
from flexbe_states.log_state import LogState
# Additional imports can be added inside the following tags
# [MANUAL_IMPORT]

# [/MANUAL_IMPORT]

'''
Created on Wed Jun 28 2023
@author: concurrency_test
'''
class concurrency_testSM(Behavior):
    '''
    concurrency_test
    '''

    def __init__(self, node):
        super(concurrency_testSM, self).__init__()
        self.name = 'concurrency_test'

        # parameters of this behavior

        # references to used behaviors
        OperatableStateMachine.initialize_ros(node)
        ConcurrencyContainer.initialize_ros(node)
        PriorityContainer.initialize_ros(node)
        Logger.initialize(node)
        LogState.initialize_ros(node)

        # Additional initialization code can be added inside the following tags
        # [MANUAL_INIT]

        # [/MANUAL_INIT]

        # Behavior comments:

    def create(self):
        # x:30 y:365, x:130 y:365
        _state_machine = OperatableStateMachine(outcomes=['finished', 'failed'])

        # Additional creation code can be added inside the following tags
        # [MANUAL_CREATE]

        # [/MANUAL_CREATE]

        # x:30 y:365, x:130 y:365, x:230 y:365, x:330 y:365, x:430 y:365
        _sm_container_0 = ConcurrencyContainer(outcomes=['finished', 'failed'], conditions=[
                                        ('finished', [('A', 'done')]),
                                        ('failed', [('B', 'done')])
                                        ])

        with _sm_container_0:
            # x:165 y:142
            OperatableStateMachine.add('A',
                                        LogState(text="A", severity=Logger.REPORT_HINT),
                                        transitions={'done': 'finished'},
                                        autonomy={'done': Autonomy.Low})

            # x:308 y:137
            OperatableStateMachine.add('B',
                                        LogState(text="B", severity=Logger.REPORT_HINT),
                                        transitions={'done': 'failed'},
                                        autonomy={'done': Autonomy.Low})

        with _state_machine:
            # x:241 y:89
            OperatableStateMachine.add('Container',
                                        _sm_container_0,
                                        transitions={'finished': 'finished', 'failed': 'failed'},
                                        autonomy={'finished': Autonomy.Inherit, 'failed': Autonomy.Inherit})

        return _state_machine

    # Private functions can be added inside the following tags
    # [MANUAL_FUNC]

    # [/MANUAL_FUNC]

When executed with "Block transitions which require at least 'Low' autonomy", the console output is:

[12:20:41 PM] Onboard engine just started.
[12:20:46 PM] --> Preparing new behavior...
[12:20:46 PM] BE Starting [concurrency_test : 1208022996]
[12:20:46 PM] A
[12:20:46 PM] B
[12:20:46 PM] ConcurrencyContainer Container returning outcome failed (request inner sync)
[12:20:46 PM] Behavior execution for concurrency_test: 1208022996 failed! [-]
    exceptions must derive from BaseException
[12:20:46 PM] Traceback (most recent call last): [+]
[12:20:46 PM] No behavior active.
[12:20:46 PM] Onboard engine just started.
[12:20:46 PM] --- Behavior Engine finished - ready for more! ---
[12:20:52 PM] Onboard engine just started.

A and B are printed, which is fine. I would expect to see A and B in the "Behavior Execution" in the "Runtime Control". There I would expect to be able to select "done" outcome from either A or B. But the Behavior finished by itself, without waiting for Operator Input.

Am I missing something?

dcconner commented 1 year ago

I just pushed a major update to flexbe ros2-devel before I saw this. I'll try to verify this test tomorrow, but in the meantime, I'd love for you to give the new version a try. Check the change logs, as there are significant changes. The new version seems more stable, maintains sync better, and uses less CPU resources.

tiko5000 commented 1 year ago

Thanks for the notice. I gave it a try but still encounter some unexpected behavior. Here are my test cases:

  1. Simple Concurrency Container with 2 Log-States - Outcome is A(done) || B(done)

[Screenshots from 2023-06-29 09-04-16 and 2023-06-29 09-09-37]

Log:

[7:09:33 AM] Onboard engine just started.
[7:09:43 AM] --> Mirror - received updated structure
[7:09:43 AM] --> Preparing new behavior...
[7:09:43 AM] Received a new mirror structure for checksum 1383782741
[7:09:43 AM] BE Starting [Concurrency_Test : 1383782741]
[7:09:43 AM] A
[7:09:43 AM] B
[7:09:43 AM] ConcurrencyContainer Container returning outcome finished (request inner sync)
[7:09:43 AM] Behavior execution for Concurrency_Test: 1383782741 failed! [-]
    exceptions must derive from BaseException
[7:09:43 AM] No behavior active.
[7:09:43 AM] Onboard engine just started.
[7:09:43 AM] Traceback (most recent call last): [+]
[7:09:43 AM] --- Behavior Engine finished - ready for more! ---
[7:09:43 AM] Mirror built for checksum 1383782741.
[7:09:43 AM] Executing mirror...
[7:09:45 AM] Onboard engine just started.
[7:09:45 AM] Onboard engine just started, stopping currently running mirror.
[7:09:45 AM] Mirror finished with result preempted
[7:09:45 AM] --- Behavior Mirror ready! ---
[7:09:56 AM] Onboard engine just started.
  2. Simple Concurrency Container with 2 Log-States - Outcome is A(done) && B(done)
    • Is this legit at all? If A is done, B will be preempted anyway, right? So the Concurrency Container will never have a proper outcome "finished"?

[Screenshots from 2023-06-29 09-04-16, 2023-06-29 09-06-28, and 2023-06-29 09-12-57]

Log:

[7:06:52 AM] --- Behavior Mirror ready! ---
[7:06:52 AM] Onboard engine just started.
[7:07:03 AM] Onboard engine just started.
[7:07:14 AM] Onboard engine just started.
[7:07:25 AM] Onboard engine just started.
[7:07:35 AM] --> Preparing new behavior...
[7:07:35 AM] --> Mirror - received updated structure
[7:07:35 AM] Received a new mirror structure for checksum 1708774436
[7:07:35 AM] BE Starting [Concurrency_Test : 1708774436]
[7:07:35 AM] A
[7:07:35 AM] B
[7:07:35 AM] Mirror built for checksum 1708774436.
[7:07:35 AM] Executing mirror...
[7:07:36 AM] OCS is possibly out of sync - onboard state is /Container/B [-]
        Check UI and consider manual re-sync!
        (mismatch may be temporarily understandable for rapidly changing outcomes) 1
[...]
[7:07:43 AM] OCS is possibly out of sync - onboard state is /Container/B [-]
        Check UI and consider manual re-sync!
        (mismatch may be temporarily understandable for rapidly changing outcomes) 1
[7:07:43 AM] ConcurrencyContainer Container returning outcome finished (request inner sync)
[7:07:43 AM] Behavior execution for Concurrency_Test: 1708774436 failed! [-]
    exceptions must derive from BaseException
[7:07:43 AM] No behavior active.
[7:07:43 AM] Onboard engine just started.
[7:07:43 AM] Traceback (most recent call last): [+]
[7:07:43 AM] --- Behavior Engine finished - ready for more! ---
[7:07:43 AM] Onboard behavior failed!
[7:07:43 AM] Mirror finished with result preempted
[7:07:43 AM] --- Behavior Mirror ready! ---
[7:07:43 AM] No onboard behavior is active.

dcconner commented 1 year ago

A couple of notes, then I'll put together an example for the tutorial. I think this is normal and expected behavior.

You have the autonomy set to "low", which is typical for log states. That means they finish and move on to the next state. Both A and B finish after one execution and return, which causes the concurrency container to return immediately. Because the container's outcome is tied to the state machine's "finished" outcome, the behavior is done. Because both are log states, they both return after one execution call, so it doesn't matter whether you use || or &&.
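The point about || versus && can be illustrated with a small model. This is only a sketch, assuming that each condition entry requires all of its (state, outcome) pairs to be met and that several entries for the same outcome act as alternatives. `evaluate_conditions` is a hypothetical helper written for this illustration, not the FlexBE implementation.

```python
def evaluate_conditions(conditions, returned):
    """Return the first container outcome whose (state, outcome) pairs are all met.

    Hypothetical helper: a simplified model, not FlexBE code.
    """
    for container_outcome, required in conditions:
        if all(returned.get(state) == outcome for state, outcome in required):
            return container_outcome
    return None


# "A(done) || B(done)": two alternative entries for the same outcome.
or_conditions = [('finished', [('A', 'done')]), ('finished', [('B', 'done')])]
# "A(done) && B(done)": one entry that needs both pairs.
and_conditions = [('finished', [('A', 'done'), ('B', 'done')])]

# Log-like states return 'done' on their first execution, so both are present.
both_returned = {'A': 'done', 'B': 'done'}
or_result = evaluate_conditions(or_conditions, both_returned)
and_result = evaluate_conditions(and_conditions, both_returned)

# If only A has returned (for example, because B is being held back),
# the two condition styles differ.
only_a = {'A': 'done'}
or_partial = evaluate_conditions(or_conditions, only_a)
and_partial = evaluate_conditions(and_conditions, only_a)

print(or_result, and_result, or_partial, and_partial)
```

In this model the two styles only differ when one state is still running, which is why states that return on their first execution make the choice irrelevant.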

A limitation of the current version of FlexBE UI is that it only shows one of the active states in concurrency.

Try changing the required autonomy level of the outcome so that it will pause.

The failure and the "BaseException" issue are unexpected, and I'll be looking into that today.

dcconner commented 1 year ago

There seems to be an issue where exiting the concurrency container and then immediately exiting the behavior causes an exception.
If I add a log state after the concurrency container, it no longer gives the exception.

There also seems to be an issue with blocking transitions inside the concurrency container, so I'll need to look into that more.

Thanks for reporting.

There is also the known issue of FlexBE UI only showing one state inside the concurrency container.

tiko5000 commented 1 year ago

Ok great, thanks for the fast handling of the issue.

I already set the required autonomy level of the states inside the concurrency container to high and started the behavior with Block transitions which require at least 'High' autonomy.

But the outcome of A can be forced in the Runtime Control only if the outcome of the concurrency container is A(done) && B(done). If the outcome of the concurrency container is A(done) || B(done), the concurrency container still finishes immediately, even if there is a state added after the concurrency container.

dcconner commented 1 year ago

I have spent a bit of time looking at the internals of how flexbe handled concurrency containers, and issues with sync I saw on ROS 2.

I have tested a significant modification to FlexBE and posted it as ros2-pre-release branches on both flexbe_app and flexbe_behavior_engine.

These two must be used consistently as they do require an API change.

See the relevant change logs.

I have also developed and introduced a new flexbe_turtlesim_demo release (https://github.com/FlexBE/flexbe_turtlesim_demo) with several detailed examples related to concurrency containers. Specifically, see Examples 3 and 4.

A brief discussion of the changes follows. I would appreciate any testing and feedback on these changes. There is still some cleanup to do on them, but barring objections I plan to introduce these changes into an Iron release this fall.

The old approach only set the "current state" to the initial state in a concurrency container. This state would still show as active even if it had finished and another state was active.

The new approach introduces a "state id" hash code for every state using a masked 23-bit hash code. This hash code is known to both the onboard and mirror sides. The lower 8 bits are set to the outcome (this allows 255 outcomes on a state, which is likely way more than anyone needs, but until we clearly need more than 23 bits to encode the state id, I chose to use 8 bits for outcome mapping).

Instead of reporting only the outcome changes, the new system reports an array of "current active states" for sync, and each outcome encodes both the outcome and state id using a 32-bit value.
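As a rough illustration of the encoding described above, here is a minimal sketch that packs a masked state id and an outcome index into one integer. The exact bit layout (state id in bits 8 to 30, top bit left clear), the use of `zlib.crc32` as the hash, and the helper names `state_id_from_path`, `encode_outcome`, and `decode_outcome` are all assumptions made for this example; the actual FlexBE code may differ.

```python
import zlib

# Assumed layout: 23 bits of state id above the lower 8 outcome bits.
STATE_ID_MASK = 0x7FFFFF00
OUTCOME_MASK = 0x000000FF


def state_id_from_path(path: str) -> int:
    """Derive a masked 23-bit id from a state path (crc32 is a stand-in hash)."""
    return zlib.crc32(path.encode('utf-8')) & STATE_ID_MASK


def encode_outcome(state_id: int, outcome_index: int) -> int:
    """Combine a state id and an outcome index into a single value."""
    if not 0 <= outcome_index <= OUTCOME_MASK:
        raise ValueError('outcome index does not fit in 8 bits')
    return (state_id & STATE_ID_MASK) | outcome_index


def decode_outcome(value: int) -> tuple:
    """Split a combined value back into (state_id, outcome_index)."""
    return value & STATE_ID_MASK, value & OUTCOME_MASK


state_id = state_id_from_path('/Container/A')
packed = encode_outcome(state_id, 1)
print(hex(state_id), hex(packed), decode_outcome(packed))
```

Because both sides can compute the same id from the state path, the mirror can map a reported value back to a specific state and outcome without exchanging names.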

This requires a slight increase in bandwidth, but I judge the reliability increases worthwhile.

The new approach reports returns from individual states and containers to help keep the mirror consistent and identify sync issues and recovery.

If an internal state returns, but another remains active the FlexBE UI will change. It currently shows the "deepest" active state. Currently only that state can be preempted, but with new changes we expect to support operator preemption at any level. As part of these changes, the OperatableStateMachine is now a pseudo manually transitionable state. This is a temporary hack during development. Long term, we will introduce a new ManuallyTransitionableStateMachine to mimic the state hierarchy.

Please test the new branches with your system and the Turtlesim tutorials mentioned above, and give me any feedback on the performance.

@pschillinger

pschillinger commented 1 year ago

Thanks for the pointer @dcconner! First as a disclaimer, I'm not yet familiar with every single technical detail of the ros2-devel and ros2-pre-release branches, so I might need to revise or refine during the next days what I say now.

What I can say regarding the way transitions worked in concurrency containers so far is that the transition behavior as described above is indeed as expected, even though admittedly not most intuitive. I would mainly attribute this to an initial design limitation on my side, or in other words, FlexBE did not include concurrency initially and the concurrent execution of states was added on top under the constraint (mainly dictated by the API between the engine and the GUI) that there is always a single active state to be operated.

What this means is briefly summarized in the tutorial on Parallel State Execution:

Nevertheless, there is always one main state in a concurrency container, indicated by the same notation as the initial state of a state machine, which works as described in the next section. In general, any of the states can be set to the main state. [...] During execution, the main state of a concurrency container is monitored in the GUI as known from state machines. If this state is a state machine itself, outcomes of inner states can be forced or blocked by the selected autonomy level as usual. All other states not being the main state are running in the background. Their state of execution is not monitored, even if they are state machines. Consequently, they have no knowledge about the autonomy level and cannot be controlled manually. This might change in the future, but for now, this is how it works.

What this implies in consequence is, as observed in the initial example, that the outcomes of background states are not blocked by the autonomy level and might return immediately if the respective state dictates so. At least this is the expected part. Where it gets messier now is that, due to the fact that background states are not aware of the GUI, background states won't send transition notifications to the GUI, thus the GUI has to assume it might have gotten out of sync whenever a concurrency container returns an outcome (i.e., this happens when the CC outcome was triggered by a background state). This is also related to the observed warning of being potentially out of sync, a monitoring done by the behavior mirror to be precise but resulting from this fact.
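The difference between the main state and background states can be sketched with a toy model. It assumes that a transition is blocked when its required autonomy is at or above the operator's blocking threshold, and that only the main state is subject to this check. The `Level` enum and `outcomes_after_first_tick` are hypothetical names for this illustration, not part of the FlexBE API.

```python
from enum import IntEnum


class Level(IntEnum):
    """Illustrative autonomy levels; the ordering is an assumption of this sketch."""
    OFF = 0
    LOW = 1
    HIGH = 2
    FULL = 3


def outcomes_after_first_tick(required_levels, main_state, block_threshold):
    """Return the states whose 'done' outcome goes through on the first tick.

    Only the main state is checked against the blocking threshold;
    background states return regardless (assumed, per the tutorial quote).
    """
    returned = {}
    for name, required in required_levels.items():
        blocked = name == main_state and required >= block_threshold
        if not blocked:
            returned[name] = 'done'
    return returned


required = {'A': Level.LOW, 'B': Level.LOW}
low_block = outcomes_after_first_tick(required, main_state='A', block_threshold=Level.LOW)
no_block = outcomes_after_first_tick(required, main_state='A', block_threshold=Level.FULL)
print(low_block)
print(no_block)
```

In this toy model, blocking at "Low" holds back only A while B still returns, so a container condition that depends on B alone would fire immediately. That resembles the "returning outcome failed" line in the first log, though the actual cause there may differ.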

Long story short, an improved mechanism to handle outcomes as proposed by @dcconner sounds promising to me and might be designed with a more intuitive handling of shared autonomy in the context of concurrency. I will need to do some testing myself to support with more details, though. I hope I can allocate some time for this next weekend.

dcconner commented 1 year ago

There is now a rolling-pre-release branch for flexbe_behavior_engine that has rebased from latest humble and rolling releases, and added some additional features. I'm going to leave the ros2-pre-release as is for now, but rolling-pre-release branch is the preferred branch for testing now. Still use ros2-pre-release for the flexbe_app for now.

dcconner commented 7 months ago

The iron, rolling, and ros2-devel branches have the concurrency container and state id changes. Please use those branches. For consistency, you need version 4.0+ of the UI and 3.0+ of the flexbe_behavior_engine.