Closed tshumay closed 11 months ago
This is a great find. Thanks for the comprehensive report. We'll look into this on our end and implement a fix.
Please see if the changes in https://github.com/Interbotix/interbotix_ros_toolboxes/commit/6280950f8f6518e2c76e2f6134a3700423f70c7b fix the timing issue you describe.
For this change, we've also removed the timeout so the spinning will block until the futures are complete. This was done so users could be sure that the service to change the timing registers has returned before the arm begins moving.
Great! Thanks for the quick action on that. I will integrate these changes to day and test things out in the next day or so and let you know, but those fixes seem totally reasonable at first glance.
Just merged the code - the code in the arm.py seems correct, but in core.py, there are still several calls to self.executor.spin_once_until_future_complete(). Is that intentional?
That will be a separate change once we finish testing those other functions locally.
Great. In that case, I think this takes care of the bugs in the arm module I spotted. Thanks!
What happened?
I was noticing some odd behaviour on a robot I was working on that is derived from the Interbotix arms. Specifically, when using the joystick control / launch file with a PS4 controller (xsarm_joy.launch.py), it appeared that the velocity and acceleration overrides for waking and sleeping the robot only worked sometimes. The intended scenario seems to be that the joystick configuration sets default velocity and acceleration to 200/100, but the wake and sleep calls override those values to 1500/750 to avoid moving too fast.
When the service call to set the vel/accel fails, the robot arm attempts to perform the wake/sleep at the 200/100 rate which is really fast, and sometimes causes motor faults.
Robot Model
Custom Dynamixel robot - similar to WindowX
Operating System
Ubuntu 22.04
ROS Distro
ROS 2 Humble
Steps To Reproduce
On my ROS2 Humble install, I found that just waking and sleeping the robot a number of times with the game controller after launching the xsarm_joy.launch.py script would ultimately surface the bug. Sometimes, it is subtle because it may be the case that the velocity registers are written, and then the position, and then acceleration, so you don't notice it as badly. It's the worst case if both velocity and and acceleration register writes are deferred until after the position command though because the robot will move very quickly to/from home/sleep.
Relevant log output
Additional Info
I believe the culprit is the usage of self.core.executor.spin_once_until_future_complete() in the code below. I believe this function actually literally only does spin once, and thus, does not guarantee completion of the future. It's very poorly named in ROS, but I believe it is normally used internally by spin_until_future_complete(), which does ensure completion. I did a few tests, by placing an assert in the code afterward to ensure the future was done, and the code tripped the assert (
assert future_moving_time.done()
after the spin_once_until_future_complete() call).It's easy enough to fix this up, but there are a few other concerns here as well - the timeouts are another one. It's also theoretically possible for the calls to the register updates to timeout and fail, but the calling function won't know that they failed and will try to execute subsequent commands which may be undesirable.
Also, there seems to be a fair amount of usage of spin_once_until_future_complete in core.py, which might have similar issues. So, probably worth reviewing the other potential usage of it as well to make sure none of those cases are critical.