This task tracks the further integration of the Perception and Navigation stacks.
Currently, the Navigation stack suppresses the Perception BT when a manual goal is received, then reactivates the BT when the navigation task is completed.
We want to automate this process by having the Perception BT provide the goal to the Navigation stack, thus switching continuously between the different behaviors.
To do this, the Perception BT exposes the pose of the human on a YARP port; a ROS2 node then reads this information and finds a suitable place on the map for the goal. In other words: the goal will be placed as close as possible to the human, in an obstacle-free area, taking the robot dimensions into account and, if possible, in front of the human.
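Below is a minimal sketch of this goal-placement step, assuming the map arrives as an occupancy grid (0 = free, anything else occupied/unknown, row-major, with a metric origin at cell [0, 0]); the function names, the standoff distance, and the robot radius are assumptions for illustration, not the actual implementation.

```python
import numpy as np

def footprint_free(grid, resolution, origin, center, robot_radius):
    """Return True if every cell within robot_radius of `center` is free."""
    r_cells = int(np.ceil(robot_radius / resolution))
    ci = int((center[0] - origin[0]) / resolution)
    cj = int((center[1] - origin[1]) / resolution)
    for di in range(-r_cells, r_cells + 1):
        for dj in range(-r_cells, r_cells + 1):
            if di * di + dj * dj > r_cells * r_cells:
                continue
            i, j = ci + di, cj + dj
            if not (0 <= i < grid.shape[1] and 0 <= j < grid.shape[0]):
                return False          # outside the known map
            if grid[j, i] != 0:       # occupied or unknown cell
                return False
    return True

def place_goal(human_xy, human_yaw, grid, resolution, origin,
               robot_radius=0.4, standoff=1.0):
    """Pick a free spot close to the human, preferably in front of them."""
    # Try candidate directions ordered by how far they deviate from
    # "directly in front of the human".
    for dtheta in sorted(np.linspace(-np.pi, np.pi, 37), key=abs):
        theta = human_yaw + dtheta
        cand = human_xy + standoff * np.array([np.cos(theta), np.sin(theta)])
        if footprint_free(grid, resolution, origin, cand, robot_radius):
            # Orient the goal so the robot ends up facing the human.
            goal_yaw = np.arctan2(human_xy[1] - cand[1], human_xy[0] - cand[0])
            return cand, goal_yaw
    return None  # no admissible goal at this standoff distance
```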
There are three possible scenarios for implementing this, ordered by degree of complexity:
1 : The easiest to realize: action recognition of "Come Here" -> Perception BT disabled -> navigation to goal -> BT re-enabled on navigation termination -> start again
This scenario implies that there is only one person in the scene, visible to the robot, and that once the robot starts navigating, perception and goal generation are disabled. This means the goal remains fixed in the map, and the robot cannot react if the person moves or asks it to stop.
2 : Reactive goal generation (more complicated): action recognition of "Come Here" -> only the manipulation part of the BT, i.e. the robot gestures, is disabled -> navigation with action recognition and gaze controller active -> on navigation termination the Perception BT is completely restored and ready to restart.
In this case we still have only one person in the scene, to avoid sudden switches of focus/attention. We also introduce another action, "Stop", and the possibility of updating the goal while navigating. However, since the gaze controller will be enabled, the navigation will not be able to look around for obstacles.
3 : Reactive goal generation with agent memory (hard to do): the same as (2), but we recognize the specific human who made the action and navigate to them. In this case there are several people in the scene, but the others are ignored and we focus only on one person.
We decided to start with scenario (1) and begin evaluating the goal generation and the overall integration.
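As a reference for scenario (1), here is a minimal sketch of the cycle, assuming the Perception BT and the navigation stack expose enable/disable and goal interfaces; every function name below is a hypothetical placeholder, not the real API.

```python
import time

def run_come_here_cycle(perception_bt, navigation):
    while True:
        # 1. Wait until the Perception BT recognizes the "Come Here" action
        #    and provides the human pose.
        human_pose = perception_bt.wait_for_come_here()
        # 2. Suppress perception and goal generation: the goal stays fixed.
        perception_bt.disable()
        # 3. Send the generated goal and block until navigation terminates.
        navigation.go_to(human_pose)
        while not navigation.is_done():
            time.sleep(0.5)
        # 4. Restore the Perception BT so the cycle can start again.
        perception_bt.enable()
```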
Actions to do:
@steb6
[x] Evaluate the pose estimation in the camera frame
[x] Publish this data on a YARP port from a BT node
@SimoneMic
[x] Create a ROS2 node to read the human pose from a YARP port and figure out how to place the goal on the map
In the end, all of this should be integrated into a single Behavior node in the Perception BT that publishes the goal on the /goal topic. This will automatically kickstart the Navigation.
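A minimal sketch of this bridge, assuming the human pose arrives on a YARP port as a Bottle of (x, y, yaw) in the map frame; the port name, the Bottle layout, and the naive in-front placement (without the occupancy-grid check sketched above) are assumptions for illustration.

```python
import math
import rclpy
import yarp
from geometry_msgs.msg import PoseStamped


def main():
    yarp.Network.init()
    rclpy.init()
    node = rclpy.create_node('human_goal_generator')
    goal_pub = node.create_publisher(PoseStamped, '/goal', 10)

    human_port = yarp.BufferedPortBottle()
    human_port.open('/goal_generator/human_pose:i')    # hypothetical port name

    def poll_human_pose():
        bottle = human_port.read(False)                 # non-blocking read
        if bottle is None:
            return
        x, y, yaw = (bottle.get(i).asFloat64() for i in range(3))
        # Naive placement: a fixed standoff in front of the human, facing them.
        # The actual node would also validate this against the occupancy grid.
        standoff = 1.0
        gx = x + standoff * math.cos(yaw)
        gy = y + standoff * math.sin(yaw)
        goal_yaw = math.atan2(y - gy, x - gx)
        goal = PoseStamped()
        goal.header.frame_id = 'map'
        goal.header.stamp = node.get_clock().now().to_msg()
        goal.pose.position.x = float(gx)
        goal.pose.position.y = float(gy)
        goal.pose.orientation.z = math.sin(goal_yaw / 2.0)
        goal.pose.orientation.w = math.cos(goal_yaw / 2.0)
        goal_pub.publish(goal)                          # kickstarts Navigation

    node.create_timer(0.2, poll_human_pose)
    rclpy.spin(node)


if __name__ == '__main__':
    main()
```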