heuristicus / spot_ros

ROS driver for controlling Boston Dynamics' Spot robot
https://heuristicus.github.io/spot_ros/

Failing to take commands after about 5 minutes - getting "ExpiredError" #74

Open dberrett opened 2 years ago

dberrett commented 2 years ago

We're hitting an issue where, after about 5 minutes, Spot starts jittering/stumbling and eventually refuses to move. The velocity messages are making it to the driver, but it appears that the clocks are getting out of sync: I get progressively more "ExpiredError" failures until eventually every message is rejected. Restarting the driver script fixes the issue for another 5 minutes. Is there more info on how the clock sync works, and why we might be hitting this? Running ROS Noetic on Windows 11 WSL2, connected and controlling over Wi-Fi.

The error: (ExpiredError): The command was received after its max_duration had already passed.

heuristicus commented 2 years ago

The clock skew between the robot and the driver is provided by a call to the robot's time sync endpoint (self._robot.time_sync.endpoint.clock_skew), and is used in various places when sending commands.

https://github.com/clearpathrobotics/spot_ros/blob/a09e6add6b0ed192ffd60b19c64055d08505d96d/spot_driver/src/spot_driver/spot_wrapper.py#L382-L404
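To make the role of the skew concrete, here is a minimal sketch of how a locally computed end time might be shifted into the robot's clock. It does not use the SDK: `SkewEstimate`, `local_to_robot_secs` and `command_end_time_robot` are hypothetical names, and the sign convention (robot time = local time + skew) is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SkewEstimate:
    """Hypothetical stand-in for the driver's time-sync estimate."""

    # Assumed convention: robot clock minus local clock, in seconds.
    clock_skew_secs: float


def local_to_robot_secs(local_secs: float, skew: SkewEstimate) -> float:
    """Shift a local timestamp into the robot's clock using the estimate."""
    return local_secs + skew.clock_skew_secs


def command_end_time_robot(
    now_local_secs: float, cmd_duration_secs: float, skew: SkewEstimate
) -> float:
    """End time of a command, expressed in the robot's clock."""
    return local_to_robot_secs(now_local_secs + cmd_duration_secs, skew)
```

If the estimate is wrong by more than the command duration in the unfavourable direction, the end time seen by the robot is already in the past when the command arrives.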

You're talking about velocity messages, so I assume you mean the velocity command, which is at

https://github.com/clearpathrobotics/spot_ros/blob/a09e6add6b0ed192ffd60b19c64055d08505d96d/spot_driver/src/spot_driver/spot_wrapper.py#L569-L572

When the command is generated, the end time and the timesync endpoint are passed to

https://github.com/boston-dynamics/spot-sdk/blob/1c3be3f006a4d0233dbe0946a0e53d139774cb9b/python/bosdyn-client/src/bosdyn/client/robot_command.py#L443-L475
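One possible mechanism for the symptom described in this thread, not confirmed here, is that the skew estimate used for the end time goes stale while the two clocks keep drifting apart. The following toy simulation illustrates that idea only. All names and numbers are hypothetical, the skew is assumed to be estimated once and never refreshed, and the robot clock is assumed to run ahead of the local clock at a constant rate.

```python
from typing import Optional


def is_expired(end_time_robot_secs: float, robot_now_secs: float) -> bool:
    """A command is treated as expired once the robot's clock passes its end time."""
    return robot_now_secs > end_time_robot_secs


def first_expired_second(
    drift_secs_per_sec: float, cmd_duration_secs: float, horizon_secs: int
) -> Optional[int]:
    """Return the first whole second at which a command arrives expired.

    Returns None if no command expires within the horizon.
    """
    cached_skew = 0.0  # assumption: estimated at start-up, never updated
    for t in range(horizon_secs + 1):
        local_now = float(t)
        true_skew = drift_secs_per_sec * t  # robot clock runs ahead by this much
        robot_now = local_now + true_skew
        # End time as the driver computes it, using the stale estimate.
        end_time_robot = local_now + cmd_duration_secs + cached_skew
        if is_expired(end_time_robot, robot_now):
            return t
    return None
```

In this model the time to the first rejected command is roughly `cmd_duration_secs / drift_secs_per_sec`, so a short command duration leaves little margin for an unrefreshed estimate. Network latency is ignored, and it would shrink the margin further.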

Are you able to replicate the expired error with commands other than the velocity command? The trajectory command also explicitly sets an end time, but does not pass the endpoint. If one command works but not the other, then perhaps this is a useful comparison to have. I haven't personally used the velocity command at all, but I use the trajectory command all the time, and haven't seen this error come up.

I wonder if there is also potential for WSL to be causing some issues, but I don't know how that functions. Are you able to test with a linux install as well? Might be useful for comparison purposes.

FabianEP11 commented 1 year ago

Hi, I'm having the same issue as dberrett described. The velocity command works for less than 5 minutes and then the robot just refuses to move, with the error: Unable to execute robot command: bosdyn.api.RobotCommandResponse (ExpiredError): The command was received after its max_duration had already passed. The /go_to_pose topic works fine though.

heuristicus commented 1 year ago

What setup are you using to control the robot? I wonder if it's possible that network latency or something causes clocks to drift out of synchronisation. Is it only the velocity command which does not function?
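One way to check whether the clocks really are drifting would be to log the reported skew at intervals and fit a line through the samples. The sketch below assumes you already have `(elapsed_secs, skew_secs)` pairs from somewhere, for example by periodically printing the skew value mentioned earlier in this thread. `drift_rate` is a hypothetical helper, not part of the driver or the SDK.

```python
from typing import Sequence, Tuple


def drift_rate(samples: Sequence[Tuple[float, float]]) -> float:
    """Least-squares slope of skew against elapsed time, in seconds per second.

    Each sample is an (elapsed_secs, skew_secs) pair.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_s = sum(s for _, s in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one point in time")
    cov = sum((t - mean_t) * (s - mean_s) for t, s in samples)
    return cov / var_t
```

A slope close to zero would suggest the estimate is stable and the cause lies elsewhere. A steadily growing skew would point towards the clocks themselves, which makes the comparison between WSL and a native Linux install more interesting.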

FabianEP11 commented 1 year ago

Hey, I'm using the 3.2.3 version... Basically, I just ssh into the robot, run the driver, claim the lease, power on, stand the robot and send cmd_vel (from the terminal). It works for a while and then the ExpiredError appears. For now, I'm only testing the /cmd_vel and /go_to_pose topics, so I'm not sure whether the problem is also present in any other functionality.