Open ahmeda14960 opened 2 years ago
In the libfranka error message, cartesian_reflex
refers to the robot end-effector exerting more force than the safety limits allow for. Common causes are the gripper bumping too hard into objects, or the user commanding the arm to move too fast.
I would need some more context of your experiment to know exactly what's happening, but in large-scale data collection processes this is quite hard to prevent, especially if the arm is interacting with objects.
A common workaround is to use is_running_policy
every time before sending pose updates. This checks if the robot has terminated the controller due to safety reasons, and if so you can recover by resetting and calling start_cartesian_impedance
or start_joint_impedance
again.
Hi,
Thanks for responding! I actually found a way around this and it was hidden within polymetis the whole time.
If anyone else runs into this issue instead of the launch robot command above, you should run the script at ~/fairo/polymetis/polymetis/python/scripts/persis_server.sh. This will make it so the server will be restarted if it gets killed, I ran this and was able to get the franka running for 14+ hours without issues!
Hello,
Sorry to re-open this issue, but it turns out I was wrong. I switched to a new task where the robot must pick up a cube through reinforcement learning (before we were doing goal reaching so no need for contact), and the error came up again.
Maybe for more context, the error gets thrown in the following function in my library. What's strange is that we are checking if the controller is not running and then starting impedance control but I'm guessing this is too naive an implementation. Do you have any thoughts on how to make this more robust, such that it never tries the update joint command without the cartesian controller running?
def update_pose(self, pos, angle):
'''Expect [x,y,z], [yaw, pitch, roll]'''
desired_pos = torch.Tensor(pos)
desired_quat = torch.Tensor(euler_to_quat(angle))
desired_qpos, success = self._ik_solver.compute(desired_pos, desired_quat)
feasible_pos, feasible_quat = self._robot.robot_model.forward_kinematics(desired_qpos)
feasible_pos, feasible_angle = feasible_pos.numpy(), quat_to_euler(feasible_quat.numpy())
if not self._robot.is_running_policy():
self._robot.start_cartesian_impedance()
self._robot.update_desired_joint_positions(desired_qpos)
return feasible_pos, feasible_angle
To clarify, the issue you were running into happens as follows:
libfranka: Move command aborted: motion aborted by reflex! ["cartesian_reflex"]
)[info] Terminating custom controller, switching to default controller.
)[warning] Tried to perform a controller update with no controller running.
)Theoretically you should not be updating without a controller running anymore, since you are now always checking to see if a controller is running. Can you confirm whether you are still seeing the error message in 3.?
The source of the issue, which I explained in the previous comment, cannot be removed programmatically since it has to do with the limits of the hardware itself. You can try to make your policy less aggressive to prevent the issue, but I would advise to wrap another loop around your script to reset the robot and maybe throw away the data collected in the current episode as a recovery mechanism to ensure that your experiment continues running.
So I checked the server and I did get the error message in 3. I reran a training loop while printing is_running_policy()
and the issue is that sometimes it returns true when the impedance controller is not running, which causes a crash. I believe this is a bug but I'm not sure how to fix it.
For anyone curious, my workaround is that I added a try-except clause to catch this failure mode, and then restart the server. Now I can run the script for many hours without it crashing due to the error above (although it's not ideal)
try:
self._robot.update_desired_joint_positions(desired_qpos)
except:
print('impedance controller failed, restarting it')
self._controller_restart += 1
print(f'controller reset tracker: {self._controller_restart}\n')
self._robot.start_cartesian_impedance()
self._robot.update_desired_joint_positions(desired_qpos)
Type of Issue
Select the type of issue:
Description
Hi all,
I'm currently running polymetis with a Franka robot and the Robotiq gripper. Everything is working as intended, except for when we run polymetis for longer periods of time (~2 hrs). We'd like to be able to run it continuously for many hours or even days as we are testing reinforcement learning algorithms
The issue seems to occur within the server / robot client with the following command
launch_robot.py robot_client=franka_hardware robot_client.executable_cfg.robot_ip=172.16.0.2
And everything will work as intended for two hours but eventually crashCurrent Behavior
When my training script crashes, I see the following output from the polymetis server
Expected Behavior
Not entirely sure what's to be expected, but when the experiment is running well, and the server is able to start the controller / switch to the default controller. For example, all I had to do with restart my training script and the server (with the same log as above) was able to continue working as follows:
Steps to reproduce
Not sure how to reproduce this other than maybe running a random policy for ~2hrs?