[REP-0155] Use 'HRI-stamped' messages instead of subtopics

severin-lemaignan commented 2 years ago

Overview

Replace the 'one topic namespace by face/body/voice/person' approach by 'one subtopic + messages with HRI headers' design.

For instance, instead of: /humans/faces/face_abc/cropped, a message is published under /humans/faces/cropped, of type hri_msgs/ImageHRI that contains the face ID.

Context

In the thread of #338, @cst0 made the following comment, that this PR addresses:

Providing a unique subtopic per-human is something I'm not a huge fan of. This one is a bit bigger.

I think I see why there's a unique subtopic per person: as the REP observes there needs to be a unique identifier per person. Each identifier is a likely a unique node, which again makes sense to me (different data sources, different goals). So those two things combined force you to split the identification results to unique topics which are grouped by person; a "Human.msg" that encapsulates all that data isn't ideal because that would require some centralized aggregator node. That's not terrible, but it's not ideal and would restrict more distributed systems, and so subtopics it is.

There's a few problems the multi-subtopic approach presents, in my opinion. First, when is it acceptable to introduce or remove a new topic? Identification processes will have some uncertainty, so it's not uncommon for a recognizer to "flicker". Surely we don't want to bring up and take down a topic with every image message we mis-identify someone with, but then we don't want to leave them up indefinitely either. With each human, the RFC discusses as many as 9 persons subtopics, 4 audio subtopics, 6 bodies subtopics, and 8 faces subtopics, with room for more if more identifiers are presented. So are we potentially having implementations where we're dynamically constructing 25+ new topics per person? I don't love that in settings where I know there are only a handful of people being recognized, and I predict it being prohibitively difficult to interact with in (for example) the mall setting where a single time step may have 30+ people. Each of those is another publisher and subscriber to spin up, and will also present developer-side difficulties in introspecting on node/graph behavior. Developer-side difficulty presents a fairly major hurdle to adoption in my view.

The good news is, I think both these concerns could be resolved by introducing a "human header" message and grouping topics into streams of per-person messages. As a result, I think the current approach could be adapted to have a topic per identifier, where the topic is responsible for publishing a stream of observations that are "tagged" by person.

For example, the current approach of

/humans/bodies/alice/skeleton2d # containing points of alice's skeleton
/humans/bodies/bob/skeleton2d   # containing points of bob's skeleton
/humans/faces/alice/roi         # containing alice's face's ROI
/humans/faces/alice/expression  # containing alice's expression
/humans/faces/bob/roi           # containing bob's face's ROI
/humans/faces/bob/expression    # containing bob's expression
...

could become:

/humans/bodies/skeletons2d   # a msg of points of alice's skeleton, then a msg of bob's
/humans/faces/rois           # a msg of alice's face's ROI, then a msg of bob's
/humans/faces/expressions    # a msg of alice's expression, then a msg of bob's

Per @wjwwood's earlier advice on the Header message, frame_id's aren't necessary for these messages (and the seq probably isn't either). But, relating a unit of information to a person and a time is valuable, and so a message encapsulating this can be put on each message. Other parts of the RFC relate people to frame_ids and so that information becomes redundant.

Potential limitations to be discussed

This approach relies on new/additional hri_msgs that can be found here: https://github.com/ros4hri/hri_msgs/tree/hri_headers/msg

The main difficulties I can think of are:

need to re-create 'HRI-stamped' standard messages, like BoolHRI, StringHRI, JointStateHRI because the new messages are not standard, need additional work for some tool to work. For instance, the TF frames of humans were published by running a robot_state_publisher for each /humans/bodies/<id>/joint_states. We now have /humans/bodies/joint_states publishing JointStateHRI, which can not be directly consumed anymore by robot_state_publisher. Need to either create + insert a republish_hri node to republish 'pure' JointState msg (not so nice, and bring us back to the previous 'one topic per ID' situation), or we need to implement a hri_state_publisher that will share 99% of the code with robot_state_publisher (might still do that, as we would have only one process for all the humans, instead of one process per detected skeleton)
another example is cropped images from face, body: they can not be displayed directly anymore in eg rviz, rqt_image_view...

@clalancette @cst0: any comments?

severin-lemaignan commented 2 years ago

The PR for hri_msgs with the new headers is here: https://github.com/ros4hri/hri_msgs/pull/3

@wjwwood any comments about this new approach? especially on the format of the new hri_msgs/IdHeader.msg

severin-lemaignan commented 2 years ago

@clalancette @wjwwood -- could you have a look? I was initially hoping to merge this PR before the main REP-155 draft would be merged to ros-infrastructure as it is a fairly important design change.

It would be really good to decide now if we keep the 'one subtopic per detected face/body/voice/person' approach (current REP-155), or if we switch to a model with a single topic per 'human feature' (eg humans/faces/landmarks) and stamp each message with an unique ID (hri-headers based approach).

It would be helpful to decide now, so that I can present the 'right' version during ROSCon in October.

Thanks!

victorpaleologue commented 1 year ago

As far as I understood, the reason for this change is that: a) the perception nodes are responsible for the lifecycle of the transient data the client has no control over it b) it is annoying to build and drop subscribers for all of these subtopics on the fly

But if we do that change: c) we leave the burden to manage the lifecycle of the transient data to the client d) it will not be usable with RViz

My opinion on these points: a) is a strong theoretical argument IMO. The client (the node running the person manager) may have different policies about the validity of the data, and different timeouts depending on their use cases. b) can be mitigated with helper libraries, even though these are extra complications. c) can be mitigated with helper libraries, code snippets or conventions. d) can be mitigated with extra nodes to produce well-formatted topics for RViz. A custom person manager could still produce per-person topics in addition to the other topics, and forward the desired pieces of data.

So the proposal seems relevant and achievable. However did you consider a compromise: handling the transient data with HRI-stamped messages, and non-transient ones (such as persons) with per-object topics?

severin-lemaignan commented 1 year ago

@victorpaleologue thanks for the feedback! much appreciated!

To add to your points:

a) one key design choice in ROS4HRI is that clients should not rely on a face/body/voice id to be meaningful once the corresponding face/body/voice is not detected anymore (eg, these ID should be discarded). As such, it is better if the client is not allowed to make (potentially wrong) assumption about the lifecycle of this kind of data

b) it is already mitigated by libhri/pyhri. Client do not have to deal themselves with that if they do not want to.

c) agree with you

d) it is a bigger issue IMO: if you need to create a node that republish everything as separate topic, the whole system becomes a mess and all the data is duplicated, potentially causing a lot of confusion.

Regarding your suggested compromise: actually, persons are not generally persistent because of the concept of anonymous persons that are created and deleted on the fly when an actual person is not yet associated to a feature. And the additional complexity of having to maintain and explain two systems seems a bit scary to me! :-)

victorpaleologue commented 1 year ago

Ok, a) might be key to the decision so let's dig into it.

Regardless how the data laid out, the ID expresses some continuity of existence. As you said, only the node providing the data knows when that existence is over. Does the person manager need to know when the existence is over? Does it need to differentiate between the lack of data and the assurance that the thing tracked disappeared? If yes, how would it be communicated, if not by the disappearance of a subtopic?

severin-lemaignan commented 1 year ago

As per the REP, the topics /humans/*/tracked (eg /humans/faces/tracked) are published by the perception nodes and should reflect whether the corresponding features are currently 'tracked'. The semantic of 'tracked' is availability of data (eg, active perception). Existence is implicit, and I'm not so sure what would be the semantic of 'existence is over': a face does not 'cease to exist' in absolute terms. However, it 'cease to exist' for the face detector when the face is not detected anymore...

In any case, the 'normal' way to know if an ID has disappeared is to monitor the /humans/*/tracked topics, not to monitor whether a subtopic disappears. This is also how things are implemented in libhri/pyhri.

victorpaleologue commented 1 year ago

Yes, I meant for the face detector when the face is not detected anymore. If the /humans/*/tracked provide this information, then the impact a) is not an issue. The preception nodes are responsible for updating /humans/*/tracked, and the client has no control over it anyways. I was wrong thinking that would change anything.

Then d) is the next-biggest impact, and it plays in favor of rejecting the proposal. Unless someone thinks the impact b) is too important.

clalancette commented 1 year ago

@severin-lemaignan Just taking a look at this, is there working code that implements this new design? It would be helpful for us to take a look at how this works in practice. Thanks.

ros-infrastructure / rep