Closed jginesclavero closed 3 years ago
Hi again @norro!
Yesterday I had a meeting with @chcorbato, and we talked about the case where a lifecycle node transitions to `ErrorProcessing`. Following the documentation and the lifecycle node diagrams, if a node has an error it transitions to `ErrorProcessing`. Then, based on the result of this processing, it can go to the `Finalized` state or the `Unconfigured` state. Do you think that system_modes must manage the unconfigured state of the lifecycle nodes? This management would cover both this situation and the start-up situation, where the nodes are in the unconfigured state.
Thank you!
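To make the question concrete, here is a minimal Python sketch of the two outcomes of `ErrorProcessing` as described in the ROS 2 managed-node design. It is not system_modes or rclcpp code; the enum and the helper names `after_error_processing` and `needs_reconfiguration` are hypothetical and only illustrate what a manager would have to track.

```python
from enum import Enum


class State(Enum):
    """Subset of lifecycle states relevant to this question."""
    UNCONFIGURED = "Unconfigured"
    INACTIVE = "Inactive"
    ACTIVE = "Active"
    FINALIZED = "Finalized"
    ERROR_PROCESSING = "ErrorProcessing"


def after_error_processing(cleanup_succeeded: bool) -> State:
    """Primary state reached once ErrorProcessing finishes.

    Successful error handling leads back to Unconfigured,
    failed error handling leads to Finalized.
    """
    return State.UNCONFIGURED if cleanup_succeeded else State.FINALIZED


def needs_reconfiguration(state: State) -> bool:
    """Hypothetical check: True if the node must be configured again
    before it can be used. This holds after a recovered error and at
    start-up alike, since both leave the node in Unconfigured."""
    return state is State.UNCONFIGURED
```

Both start-up and a recovered error end in the same state, which is why one rule for `Unconfigured` could cover both situations.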
This is in the context of our exemplary case of the `laser_driver` error. We want to elaborate on the layered approach we discussed in the last MROS meeting. This is how I interpret our desired design (please comment if something is not correct or clear):
1. `laser_driver` code for handling errors tries to recover from the error in the `ErrorProcessing` transition state.
(from here it is a related but different issue #48)
2. [...] `Active`), the `ModeManager` tries to recover from the error using the `feature/rules`. For this, @jginesclavero is adding a rule in the SystemModes file of our system.
3. [...] `MODE(s)` of the `laser_driver` are not reached either, the `ModeManager` reports to the `Metacontroller` that the corresponding (sub)system(s) MODE(s) are not reachable.
(see issue for the continuation of the handling of errors at the higher layers)
I agree with 1. and 2. The mode manager, however, will not actively report that a certain mode is not available. With https://github.com/micro-ROS/system_modes/issues/43, though, it will be possible for the metacontrol to find out which modes are available.
This is also a question of timing: any state or mode transition takes some time (milliseconds to seconds, maybe), even in the normal, non-failure case. So it is not entirely clear when someone (the mode manager? metacontrol?) should decide that a transition or rule didn't work out and other actions have to be taken. I think this kind of decision, i.e. how long to wait for a node to recover or a rule to take effect, is best placed in the metacontrol, since it is probably task-specific.
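As a rough illustration of where such a decision could live, here is a minimal Python sketch of a metacontrol-side wait with a task-specific deadline. Nothing here is system_modes API: `wait_for_mode`, the `get_mode` callback, and the injectable `clock`/`sleep` parameters are assumptions made only for this example.

```python
import time
from typing import Callable


def wait_for_mode(
    get_mode: Callable[[], str],
    target: str,
    timeout_s: float,
    poll_s: float = 0.05,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll get_mode() until it returns target or timeout_s elapses.

    Returns True if the target mode was observed in time, False otherwise.
    The caller (e.g. the metacontrol) chooses timeout_s per task.
    """
    deadline = clock() + timeout_s
    while True:
        if get_mode() == target:
            return True
        if clock() >= deadline:
            # The transition or rule did not take effect in time;
            # the caller decides which other action to take.
            return False
        sleep(poll_s)
```

Keeping the deadline as a parameter leaves the mode manager free of timing policy: it only answers "which mode is the node in now?", and the waiting side decides how patient to be.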
> This is also a question of timing for the following reason: Any state/mode transition will take some time (milliseconds to seconds, maybe), even in the normal, non-failure case. So it is not entirely clear, when someone (the mode manager? metacontrol?) should decide, that a transition or rule didn't work out and other actions have to be taken. I think this kind of decision, how long to wait for a node to recover or a rule to take effect, is best placed in the metacontrol, since this is probably task-specific.
Very good point indeed, so far we are not accounting for timing issues. How do we include timing constraints for node management? These could be considered metacontrol requirements for the robotic application:
This implies the incorporation of some timestamping and temporal [interval] reasoning. We can incorporate some concepts from e.g. UML2 or UML MARTE.
> However, the mode manager will not actively report that a certain mode is not available. With #43, however, it will be possible for the meta control to get the information, which modes are available.
I agree. So the current design proposal is that the Mode Manager just informs about available and reachable modes, and Metacontrol is responsible for inferring from that whether reconfiguration actions succeeded. See below for how to model that reasoning.
> This is also a question of timing for the following reason: Any state/mode transition will take some time (milliseconds to seconds, maybe), even in the normal, non-failure case. So it is not entirely clear, when someone (the mode manager? metacontrol?) should decide, that a transition or rule didn't work out and other actions have to be taken. I think this kind of decision, how long to wait for a node to recover or a rule to take effect, is best placed in the metacontrol, since this is probably task-specific.
Very good point indeed, so far we are not accounting for timing issues.
How do we include timing constraints for node management? These could be considered metacontrol requirements for the robotic application:
- How should these requirements be defined? (Language, relation to the MROS metamodel @darkobozhinoski and to the ontology @rsanz @estherag)
This implies the incorporation of some timestamping and temporal [interval] reasoning. We can incorporate some concepts from e.g. UML2 or UML MARTE.
@rsanz can you point to the specific concepts? I think we need to specify some modelling requirements (see below) to evaluate which concepts we need.
- Where should they be defined? I think we should have a discussion about this at the next meeting @darkobozhinoski, ideally with the input of all ROS developers/architects in MROS @gavanderhoorn @marioney @wasowski @fmrico @jginesclavero @lbajo @ralph-lange
We could provide this information in the MROS model of the system (Darko's metamodel), as we are doing with the QAs, but I think it is more related to the specific software components than to the application logic.
We could define default values in the MROS metacontroller to assume when no info is provided, e.g. assume that a node mode change takes up to 2 s and a subsystem mode change up to 5 s. @jginesclavero @lbajo @marioney @fmrico what numbers are reasonable for navigation2 nodes?
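A possible shape for such defaults, as a minimal Python sketch: the 2 s and 5 s values are only the example numbers proposed above, and `DEFAULT_TIMEOUT_S`, `transition_timeout`, and the `overrides` mapping are hypothetical names, not part of the MROS metacontroller or system_modes.

```python
from typing import Mapping, Optional

# Example defaults from the proposal above; real values are still open.
DEFAULT_TIMEOUT_S = {"node": 2.0, "subsystem": 5.0}


def transition_timeout(
    part: str,
    kind: str,
    overrides: Optional[Mapping[str, float]] = None,
) -> float:
    """Return how long to wait for a mode change of `part`.

    `kind` is "node" or "subsystem". A per-part value in `overrides`
    (e.g. taken from the system model) wins over the default.
    """
    if overrides is not None and part in overrides:
        return overrides[part]
    return DEFAULT_TIMEOUT_S[kind]
```

For example, `transition_timeout("laser_driver", "node")` falls back to the 2 s default, while `transition_timeout("laser_driver", "node", {"laser_driver": 0.5})` uses the component-specific value. This keeps component-level numbers separate from the application logic while still allowing the model to override them.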