Open rhaschke opened 1 year ago
@DavidPL1, you mentioned a segfault when introducing this, right? I didn't observe anything so far. I only had a naive implementation and probably had a stupid mistake that I had no time to debug.
Your implementation looks good, I only have one remark: should we not remove this AsyncSpinner here? https://github.com/ubi-agni/mujoco_ros_pkgs/blob/819b5082b8b143e76d8e78121bd2009c846f6479/mujoco_ros/src/main.cpp#L43-L44
However, I was sure I could remove the 100ms wait time on reset. But I can't. Actually, debugging showed that no ROS messages are processed by the
processAllRosMessages()
calls there.
I expected as much. Since the reset is handled in the event loop where sim_mtx
is locked, no new message should be created and the spinner threads should have enough time to process all outgoing messages until the actual reset is issued.
I think the reason is that an old
joint_states
message needs to "arrive" at the remoterobot_state_publisher
, which then sends back (old)TF
messages. And this simply takes time, we cannot influence.
I agree. The time it takes is also heavily influenced by the hardware the sim runs on and might still happen with the current timeout of 0.1 seconds. We could make this timeout configurable with a rosparam.
We could also listen for TF messages "from the future" and trigger a reset automatically, but I don't know if this is a practical solution.
Should we not remove this AsyncSpinner here?
I tried, but this didn't work (deadlock). Probably, because ROS time is not correctly published anymore.
We could also listen for TF messages "from the future" and trigger a reset automatically
That's what I implemented in a TF pull request. But it was kind of rejected - being too brittle.
Using this implementation doesn't really change anything for me. But calling rospy.rostime.wallsleep(0.1) after performing simulation steps seems to solve this issue.
@lbergmann1, please check whether this improves/resolves your ROS message issue. @DavidPL1, you mentioned a segfault when introducing this, right? I didn't observe anything so far.
However, I was sure I could remove the 100ms wait time on reset. But I can't. Actually, debugging showed that no ROS messages are processed by the
processAllRosMessages()
calls there. I think the reason is that an oldjoint_states
message needs to "arrive" at the remoterobot_state_publisher
, which then sends back (old)TF
messages. And this simply takes time, we cannot influence.