eclipse-cyclonedds / cyclonedds

Eclipse Cyclone DDS project
https://projects.eclipse.org/projects/iot.cyclonedds

CycloneDDS and eProsima Micro XRCE-DDS Communication in ROS2 #2062

Open yigitboracagiran opened 1 month ago

yigitboracagiran commented 1 month ago

I'm currently working on a ROS 2 project. I'm using an STM32F446ZET6U on my controller board and ROS Humble on my computer. MicroROS (on the STM) uses eProsima Micro XRCE-DDS, and Nav2 (ROS2-HUMBLE-NAV2) uses CycloneDDS. By default, when I send a goal with Nav2 and the goal reaches the succeeded state, Nav2 should send a Twist message that stops the robot. This is where my problem arises: sometimes the stop message doesn't reach the STM, which causes the robot to keep turning in place.

eboasson commented 1 month ago

This is indeed curious. There are of course legitimate cases where this could happen ("best-effort" comes to mind), but it would be good to have some evidence of what is going on.

With this in mind, I would first check the QoS settings. If those seem fine, it would be good to have evidence that the stop message is indeed sent. From your description, it seems you could flag any case where the stream of twist messages stops without ending on a stop message. That is something you could check with an extra process that creates a subscriber to monitor the twist messages and prints something whenever the messages stop coming in and the last one it saw was not a stop message.
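A monitoring process along those lines could look roughly like the sketch below. It is only an illustration: the node name `cmd_vel_stop_monitor`, the `StopWatchdog` helper, the 1-second quiet period, and the choice to treat an all-zero Twist as "the stop message" are assumptions, not something taken from Nav2 or Cyclone DDS. The detection logic is kept free of ROS imports so it can be exercised on its own; `main()` wires it to `/cmd_vel` with `rclpy`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StopWatchdog:
    """Tracks whether a stream of Twist messages ended on a stop message."""
    quiet_period: float = 1.0          # seconds of silence treated as "stream ended" (assumed)
    last_time: Optional[float] = None  # arrival time of the most recent Twist
    last_was_stop: bool = True         # was the most recent Twist all zeros?
    reported: bool = True              # has the current stream already been judged?

    def on_twist(self, now, linear, angular):
        """Record one Twist; linear and angular are (x, y, z) tuples."""
        self.last_time = now
        self.last_was_stop = all(v == 0.0 for v in (*linear, *angular))
        self.reported = False

    def check(self, now):
        """Return a warning string once per stream that ended without a stop."""
        if self.reported or self.last_time is None:
            return None
        if now - self.last_time < self.quiet_period:
            return None  # stream may still be active
        self.reported = True
        if self.last_was_stop:
            return None
        return (f"twist stream went quiet at t={self.last_time:.3f} "
                "without a final stop message")


def main():
    # ROS 2 wiring; needs a sourced ROS 2 environment (rclpy, geometry_msgs).
    import rclpy
    from rclpy.node import Node
    from geometry_msgs.msg import Twist

    rclpy.init()
    node = Node("cmd_vel_stop_monitor")  # hypothetical node name
    dog = StopWatchdog()

    def now():
        return node.get_clock().now().nanoseconds / 1e9

    def on_msg(msg):
        dog.on_twist(now(),
                     (msg.linear.x, msg.linear.y, msg.linear.z),
                     (msg.angular.x, msg.angular.y, msg.angular.z))

    def on_timer():
        warning = dog.check(now())
        if warning:
            node.get_logger().warn(warning)

    # Depth 10 so the monitor itself is unlikely to drop the last sample.
    node.create_subscription(Twist, "/cmd_vel", on_msg, 10)
    node.create_timer(0.1, on_timer)
    rclpy.spin(node)
```

Calling `main()` from a shell with ROS 2 sourced, on the machine that runs Nav2, would log a warning each time the stream on `/cmd_vel` goes quiet while the last Twist seen was non-zero. If that warning appears, the stop message was never visible to a regular ROS subscriber either.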

I expect it is also possible to create a microROS application that does that and run it on the same machine that is running the XRCE agent.

None of that would solve it, of course, but I think it would help pinpoint where the message disappears. That should at least bring us a step closer to fixing it.

yigitboracagiran commented 1 month ago

Do you have any idea how to see or change the Reliability setting of Nav2 or the STM?

eboasson commented 1 month ago

I'd try the ros2 command-line tool, like what is suggested here: https://robotics.stackexchange.com/questions/100226/ros2-list-publishers-and-subscribers

There is also cyclonedds ls --qos (see https://github.com/eclipse-cyclonedds/cyclonedds-python ), or its new GUI tool https://github.com/eclipse-cyclonedds/cyclonedds-insight (still in its infancy; there's a downloadable binary somewhere among the artifacts), and there is also the possibility of writing trace files. But the ros2 CLI tool seems like the most practical starting point.

yigitboracagiran commented 1 month ago

ros2 topic info -v /cmd_vel
Type: geometry_msgs/msg/Twist

Publisher count: 1

Node name: remap_cmd
Node namespace: /
Topic type: geometry_msgs/msg/Twist
Endpoint type: PUBLISHER
GID: 01.10.ad.df.7a.b6.0d.65.ab.0b.02.10.00.00.15.03.00.00.00.00.00.00.00.00
QoS profile:
  Reliability: RELIABLE
  History (Depth): KEEP_LAST (10)
  Durability: VOLATILE
  Lifespan: Infinite
  Deadline: Infinite
  Liveliness: AUTOMATIC
  Liveliness lease duration: Infinite

Subscription count: 1

Node name: recu_node
Node namespace: /
Topic type: geometry_msgs/msg/Twist
Endpoint type: SUBSCRIPTION
GID: 01.0f.46.1c.5e.7d.8b.27.01.00.00.00.00.00.01.04.00.00.00.00.00.00.00.00
QoS profile:
  Reliability: RELIABLE
  History (Depth): KEEP_LAST (1)
  Durability: VOLATILE
  Lifespan: Infinite
  Deadline: Infinite
  Liveliness: AUTOMATIC
  Liveliness lease duration: Infinite
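For comparing endpoints at a glance, the relevant fields can be pulled out of that output with a few lines of Python. This is a rough sketch that assumes the field labels appear exactly as in the output above (`Endpoint type:`, `Reliability:`, `History (Depth):`); `summarize_endpoints` and `SAMPLE` are names made up for the illustration, and `SAMPLE` is an abbreviated copy of the output.

```python
import re

# Abbreviated copy of the `ros2 topic info -v /cmd_vel` output from this thread.
SAMPLE = """
Endpoint type: PUBLISHER
QoS profile: Reliability: RELIABLE History (Depth): KEEP_LAST (10) Durability: VOLATILE
Endpoint type: SUBSCRIPTION
QoS profile: Reliability: RELIABLE History (Depth): KEEP_LAST (1) Durability: VOLATILE
"""


def summarize_endpoints(text):
    """Return (endpoint type, reliability, history kind, depth) per endpoint."""
    pattern = re.compile(
        r"Endpoint type:\s*(\w+).*?"                # PUBLISHER or SUBSCRIPTION
        r"Reliability:\s*(\w+).*?"                  # e.g. RELIABLE
        r"History \(Depth\):\s*(\w+)\s*\((\d+)\)",  # e.g. KEEP_LAST (10)
        re.DOTALL,
    )
    return [(kind, rel, hist, int(depth))
            for kind, rel, hist, depth in pattern.findall(text)]


for endpoint in summarize_endpoints(SAMPLE):
    print(endpoint)
```

Here the publisher and the subscription agree on reliability, and the only difference is the history depth: 10 on the publishing side, 1 on the subscribing side.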

eboasson commented 1 month ago

Assuming that this is the XRCE agent's subscription, created to reflect the microROS subscription (it is only now that I ask myself that question), this says it is reliable, but it also says only the last value is retained. That matters if samples are written faster than they are taken.

Naturally, if the sequence of Twist messages ends on the stop command, then that one should get through. If some other Twist message gets sent on the same topic immediately after the stop command, then the stop command could legitimately get lost.
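The effect of the history depth can be illustrated with a toy model of a KEEP_LAST reader cache. This is a deliberately simplified sketch, not how Cyclone DDS or the XRCE agent is implemented: it ignores writer-side history, acknowledgements, and instances, and the `deliver` function and its parameters are invented for the illustration.

```python
from collections import deque


def deliver(writes, depth, take_after):
    """Simulate a KEEP_LAST reader cache.

    writes:     samples in publication order
    depth:      history depth of the reader
    take_after: indices in `writes` after which the reader takes
                everything currently cached
    """
    cache = deque(maxlen=depth)  # oldest sample is pushed out when full
    seen = []
    for i, sample in enumerate(writes):
        cache.append(sample)
        if i in take_after:
            seen.extend(cache)
            cache.clear()
    return seen


# A stray command arrives right after the stop, before the reader runs.
writes = ["move", "move", "STOP", "move"]
slow_reader = {3}  # reader only takes after the fourth write

print(deliver(writes, depth=1, take_after=slow_reader))   # ['move']
print(deliver(writes, depth=10, take_after=slow_reader))  # ['move', 'move', 'STOP', 'move']

# If the sequence ends on the stop command, depth 1 is enough.
print(deliver(["move", "move", "STOP"], depth=1, take_after={2}))  # ['STOP']
```

With depth 1 the stop sample is overwritten before the reader ever sees it, even though the subscription is reliable. With a larger depth the stop is delivered, although in this scenario it is still followed by the stray command.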

It is virtually impossible that Cyclone doesn't publish it, but it would be best to completely eliminate the possibility that it doesn't. The best way is to look inside the XRCE agent process, but that is a tall order; the next best thing is to check what is being published. I am pretty sure that a regular ROS application would be enough to tell you that.

If you have Wireshark and can reproduce it while capturing network packets, then that is even better: you can then see the packets, and you can also see whether the XRCE agent acknowledges receipt or not. If it acknowledges it and yet the robot doesn't get it, then the problem is definitely on the microROS side; if it is never sent, it is probably on the Cyclone side; otherwise it is in that in-between state that everyone hates because it is no longer clear who is responsible ...

Or ... you can try increasing the history depth on the microROS side and see if the problem disappears 🙂

yigitboracagiran commented 1 month ago

Thank you so much, I'll try it and get back to you.