usdot-fhwa-stol / carma-platform

CARMA Platform is built on Robot Operating System (ROS) and utilizes open source software (OSS) that enables Cooperative Driving Automation (CDA) features to allow Automated Driving Systems to interact and cooperate with infrastructure and other vehicles through communication. Doxygen Source Code Documentation:
https://usdot-fhwa-stol.github.io/documentation/carma-platform/

Plugins still publishing to plugin_discovery despite being inactive #2384

Open MishkaMN opened 6 months ago

MishkaMN commented 6 months ago

Summary

During VRU validation testing, it was discovered that some strategic plugins that failed to be configured or activated still publish their capability to the plugin_discovery topic. This causes the arbitrator to keep calling those plugins and time out repeatedly. From the log:

1713294401.7887328 [guidance_controller-42] 1713294401.788452671 | ERROR | get_managed_node_state:106 | Server time out while getting current state for node with name: /guidance/plugins/approaching_emergency_vehicle_plugin
1713294401.7888596 [guidance_controller-42] 1713294401.788509967 | ERROR | add_plugin:183 | Failed to configure newly discovered non-required plugin: /guidance/plugins/approaching_emergency_vehicle_plugin Marking as deactivated and unavailable!

Yet, the capability is available:

1713294469.7516770 [carma_component_container_mt-25] 1713294469.751214442 | INFO | get_topics_for_capability:48 | Received Topics: /guidance/plugins/approaching_emergency_vehicle_plugin/plan_maneuvers, /guidance/plugins/route_following_plugin/plan_maneuvers, /guidance/plugins/stop_and_dwell_strategic_plugin/plan_maneuvers, /guidance/plugins/sci_strategic_plugin/plan_maneuvers, /guidance/plugins/lci_strategic_plugin/plan_maneuvers, 
1713294469.7517283 [carma_component_container_mt-25] 

This results in repeated time-outs (although retrying 10 times with a 0.5-second timeout is also a design flaw for strategic plugins):

guidance/plugins/approaching_emergency_vehicle_plugin/plan_maneuvers, retrying, attempt no: 0
1713294495.8355138 [carma_component_container_mt-25] 1713294495.835187321 | WARN | multiplex_service_call_for_capability:86 | Following client timed out: /guidance/plugins/approaching_emergency_vehicle_plugin/plan_maneuvers, retrying, attempt no: 1
1713294496.3357017 [carma_component_container_mt-25] 1713294496.335434161 | WARN | multiplex_service_call_for_capability:86 | Following client timed out: /guidance/plugins/approaching_emergency_vehicle_plugin/plan_maneuvers, retrying, attempt no: 2
1713294496.8359416 [carma_component_container_mt-25] 1713294496.835693847 | WARN | multiplex_service_call_for_capability:86 | Following client timed out: /guidance/plugins/approaching_emergency_vehicle_plugin/plan_maneuvers, retrying, attempt no: 3
1713294497.3361895 [carma_component_container_mt-25] 1713294497.335941837 | WARN | multiplex_service_call_for_capability:86 | Following client timed out: /guidance/plugins/approaching_emergency_vehicle_plugin/plan_maneuvers, retrying, attempt no: 4
1713294497.8365588 [carma_component_container_mt-25] 1713294497.836221242 | WARN | multiplex_service_call_for_capability:86 | Following client timed out: /guidance/plugins/approaching_emergency_vehicle_plugin/plan_maneuvers, retrying, attempt no: 5
1713294498.3368809 [carma_component_container_mt-25] 1713294498.336513135 | WARN | multiplex_service_call_for_capability:86 | Following client timed out: /guidance/plugins/approaching_emergency_vehicle_plugin/plan_maneuvers, retrying, attempt no: 6
1713294498.8371410 [carma_component_container_mt-25] 1713294498.836800445 | WARN | multiplex_service_call_for_capability:86 | Following client timed out: /guidance/plugins/approaching_emergency_vehicle_plugin/plan_maneuvers, retrying, attempt no: 7
1713294499.3374329 [carma_component_container_mt-25] 1713294499.337086187 | WARN | multiplex_service_call_for_capability:86 | Following client timed out: /guidance/plugins/approaching_emergency_vehicle_plugin/plan_maneuvers, retrying, attempt no: 8
1713294499.8377602 [carma_component_container_mt-25] 1713294499.837396271 | WARN | multiplex_service_call_for_capability:86 | Following client timed out: /guidance/plugins/approaching_emergency_vehicle_plugin/plan_maneuvers, retrying, attempt no: 9
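
To put a number on the retry cost visible in the log above, here is a minimal sketch of the worst-case blocking time per unresponsive plugin. The constants are read off the log (attempts 0 through 9, roughly 0.5 s apart); the names `kMaxAttempts`, `kPerAttemptTimeout`, and `worst_case_block` are hypothetical and not taken from the carma-platform source. It assumes every attempt waits for the full timeout.

```cpp
#include <cassert>
#include <chrono>

// Values inferred from the log above, not from the carma-platform source.
constexpr int kMaxAttempts = 10;
constexpr std::chrono::milliseconds kPerAttemptTimeout{500};

// Worst-case time a single unresponsive plugin can block one capability call,
// assuming each attempt waits for the full timeout before retrying.
constexpr std::chrono::milliseconds worst_case_block(
    int attempts, std::chrono::milliseconds timeout) {
  return attempts * timeout;
}
```

Under these assumptions, `worst_case_block(kMaxAttempts, kPerAttemptTimeout)` is 5 seconds per planning request for each plugin that is advertised but not responding, which matches the span of the timestamps in the log.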

Since the capability is published from the base classes, this issue probably exists for tactical and control plugins as well; this still needs to be verified.

Version

4.5.0 (Current)

Expected Behavior

Inactive plugins should not publish to discovery

Actual Behavior

See above.

Steps to Reproduce the Actual Behavior

Intentionally make some plugins inactive, then engage carma-platform and observe that the arbitrator still calls those plugins.

Related Work

Jira: https://usdot-carma.atlassian.net/browse/CAR-6040

MishkaMN commented 2 months ago

This issue occurs because the plugin manager in the subsystem controller relies on the plugin_discovery topic to track whether a plugin is active. However, when a plugin is deactivated, its publisher is deactivated too, so the plugin_discovery topic never receives the message saying that the plugin was deactivated. To handle this situation:

1) The ros2_lifecycle base class should wait for its derived nodes to finish their derived functions (handle_on_deactivate, on_plugin_deactivate, etc.) before deactivating publishers. This lets the derived classes publish their deactivated status before the base class deactivates the publishers.
2) On each state transition, a plugin should publish its status (active or not, available or not) to the discovery topic.
3) Since a node can segfault and simply exit, its lifecycle transition may never appear on the plugin_discovery topic. In that case, plugin_manager should perhaps also monitor for outdated plugins and prune them. We should still keep 2) so that status updates arrive immediately, without waiting for the pruning logic in 3).
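
The three points above can be sketched in plain C++ without any ROS 2 dependency. Everything here is a hypothetical stand-in, not the carma-platform or rclcpp API: `PluginStatus` models the discovery message, `ManagedPublisher` models a lifecycle publisher that drops messages once deactivated, `PluginBase` models the base class, and `PluginRegistry` models the bookkeeping a plugin manager might do. Only the hook name `on_plugin_deactivate` is borrowed from the comment above.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

using Clock = std::chrono::steady_clock;

// Hypothetical stand-in for the message sent on plugin_discovery.
struct PluginStatus {
  std::string name;
  bool available;
  bool activated;
};
using StatusSink = std::function<void(const PluginStatus&)>;

// Hypothetical lifecycle publisher: silently drops messages while inactive.
class ManagedPublisher {
public:
  explicit ManagedPublisher(StatusSink sink) : sink_(std::move(sink)) {}
  void activate() { active_ = true; }
  void deactivate() { active_ = false; }
  void publish(const PluginStatus& s) { if (active_) sink_(s); }
private:
  StatusSink sink_;
  bool active_ = false;
};

// Hypothetical plugin base class illustrating points 1 and 2.
class PluginBase {
public:
  PluginBase(std::string name, StatusSink sink)
      : name_(std::move(name)), pub_(std::move(sink)) {}
  virtual ~PluginBase() = default;

  void activate() {
    pub_.activate();
    activated_ = true;
    publish_status();          // point 2: report every transition
  }
  void deactivate() {
    on_plugin_deactivate();    // point 1: derived hook finishes first
    activated_ = false;
    publish_status();          // point 2: sent while the publisher is still active
    pub_.deactivate();         // point 1: publisher goes down last
  }
protected:
  virtual void on_plugin_deactivate() {}
private:
  void publish_status() { pub_.publish(PluginStatus{name_, true, activated_}); }
  std::string name_;
  ManagedPublisher pub_;
  bool activated_ = false;
};

// Hypothetical manager-side registry illustrating point 3.
class PluginRegistry {
public:
  void on_status(const PluginStatus& s, Clock::time_point now) {
    entries_[s.name] = Entry{s, now};
  }
  // Remove plugins whose last report is older than max_age; returns the count.
  std::size_t prune(Clock::time_point now, Clock::duration max_age) {
    std::size_t removed = 0;
    for (auto it = entries_.begin(); it != entries_.end();) {
      if (now - it->second.last_seen > max_age) {
        it = entries_.erase(it);
        ++removed;
      } else {
        ++it;
      }
    }
    return removed;
  }
  // Only plugins that are both available and activated should be called.
  bool is_callable(const std::string& name) const {
    auto it = entries_.find(name);
    return it != entries_.end() && it->second.status.available &&
           it->second.status.activated;
  }
private:
  struct Entry {
    PluginStatus status;
    Clock::time_point last_seen;
  };
  std::unordered_map<std::string, Entry> entries_;
};
```

In this sketch, a clean `deactivate()` makes the plugin non-callable immediately, because the final status is published before the publisher is shut down. A plugin that crashes after activation never sends that final status, so it stays callable until `prune()` removes it, which is why both mechanisms are kept.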