Trying to move my Panda with franka_ros (installed version 0.5.0-0xenial-20180724-052826-0800 from aptitude) I wanted to start with some examples and go from there. After tutorializing myself with actionlib and moveit, I decided to try your examples. At first I got a missing dependency error from PyAssimp, which I upgraded with pip. Afterwards, I have been stuck with this console output:

ap@ap-HP-630-PC:~$ roslaunch franka_example_controllers move_to_start.launch robot_ip:=172.16.1.101 load_gripper:=true
... logging to /home/ap/.ros/log/f343ac20-958b-11e8-ad71-009c021669e3/roslaunch-ap-HP-630-PC-18801.log
Checking log directory for disk usage. This may take awhile.
Press Ctrl-C to interrupt
Done checking log file disk usage. Usage is <1GB.

started roslaunch server http://ap-HP-630-PC:45795/

SUMMARY
========

PARAMETERS
 * /effort_joint_trajectory_controller/constraints/goal_time: 0.5

[...]

 * /robot_description_semantic: <?xml version="1....
 * /rosdistro: kinetic
 * /rosversion: 1.12.13

NODES
  /
    controller_spawner (controller_manager/spawner)
    franka_control (franka_control/franka_control_node)
    joint_state_publisher (joint_state_publisher/joint_state_publisher)
    move_group (moveit_ros_move_group/move_group)
    move_to_start (franka_example_controllers/move_to_start.py)
    robot_state_publisher (robot_state_publisher/robot_state_publisher)
    state_controller_spawner (controller_manager/spawner)

ROS_MASTER_URI=http://localhost:11311

process[franka_control-1]: started with pid [18824]
process[state_controller_spawner-2]: started with pid [18825]
process[robot_state_publisher-3]: started with pid [18826]
process[joint_state_publisher-4]: started with pid [18827]
process[controller_spawner-5]: started with pid [18830]
process[move_group-6]: started with pid [18831]
process[move_to_start-7]: started with pid [18832]
INFO: cannot create a symlink to latest log directory: [Errno 2] No such file or directory: '/home/ap/.ros/log/latest'
[INFO] [1533133292.965529]: Controller Spawner: Waiting for service controller_manager/load_controller
[INFO] [1533133293.008041]: Controller Spawner: Waiting for service controller_manager/load_controller
[ INFO] [1533133310.864421603]: Loading robot model 'panda'...
[ INFO] [1533133310.864582181]: No root/virtual joint specified in SRDF. Assuming fixed joint
[ INFO] [1533133311.434593432]: Loading robot model 'panda'...
[ INFO] [1533133311.434670318]: No root/virtual joint specified in SRDF. Assuming fixed joint
[ INFO] [1533133311.813252843]: Publishing maintained planning scene on 'monitored_planning_scene'
[ INFO] [1533133311.833453881]: MoveGroup debug mode is OFF
Starting context monitors...
[ INFO] [1533133311.964146375]: Starting scene monitor
[ INFO] [1533133311.969234875]: Listening to '/planning_scene'
[ INFO] [1533133311.969279904]: Starting world geometry monitor
[ INFO] [1533133311.974056849]: Listening to '/collision_object' using message notifier with target frame '/panda_link0 '
[ INFO] [1533133311.978706678]: Listening to '/planning_scene_world' for planning scene world geometry
[ INFO] [1533133311.992981585]: Listening to '/attached_collision_object' for attached collision objects
Context monitors started.
[ INFO] [1533133314.337135187]: Initializing OMPL interface using ROS parameters
[ INFO] [1533133314.375821047]: Using planning interface 'OMPL'
[ INFO] [1533133314.609941620]: Param 'default_workspace_bounds' was not set. Using default value: 10
[ INFO] [1533133314.610955241]: Param 'start_state_max_bounds_error' was set to 0.1
[ INFO] [1533133314.611712517]: Param 'start_state_max_dt' was not set. Using default value: 0.5
[ INFO] [1533133314.612581480]: Param 'start_state_max_dt' was not set. Using default value: 0.5
[ INFO] [1533133314.613295782]: Param 'jiggle_fraction' was set to 0.05
[ INFO] [1533133314.614061370]: Param 'max_sampling_attempts' was not set. Using default value: 100
[ INFO] [1533133314.614154158]: Using planning request adapter 'Add Time Parameterization'
[ INFO] [1533133314.614195411]: Using planning request adapter 'Fix Workspace Bounds'
[ INFO] [1533133314.614237036]: Using planning request adapter 'Fix Start State Bounds'
[ INFO] [1533133314.614278020]: Using planning request adapter 'Fix Start State In Collision'
[ INFO] [1533133314.614320265]: Using planning request adapter 'Fix Start State Path Constraints'
[ WARN] [1533133320.067844261]: Waiting for position_joint_trajectory_controller/follow_joint_trajectory to come up
[WARN] [1533133323.058336]: Controller Spawner couldn't find the expected controller_manager ROS interface.
[WARN] [1533133323.058337]: Controller Spawner couldn't find the expected controller_manager ROS interface.
[state_controller_spawner-2] process has finished cleanly
log file: /home/ap/.ros/log/f343ac20-958b-11e8-ad71-009c021669e3/state_controller_spawner-2*.log
[controller_spawner-5] process has finished cleanly
log file: /home/ap/.ros/log/f343ac20-958b-11e8-ad71-009c021669e3/controller_spawner-5*.log
[ WARN] [1533133326.068081795]: Waiting for position_joint_trajectory_controller/follow_joint_trajectory to come up
[ERROR] [1533133332.068512233]: Action client not connected: position_joint_trajectory_controller/follow_joint_trajectory
[ INFO] [1533133332.077124121]: Returned 0 controllers in list
[ INFO] [1533133332.097130356]: Trajectory execution is managing controllers
Loading 'move_group/ApplyPlanningSceneService'...
Loading 'move_group/ClearOctomapService'...
Loading 'move_group/MoveGroupCartesianPathService'...
Loading 'move_group/MoveGroupExecuteTrajectoryAction'...
Loading 'move_group/MoveGroupGetPlanningSceneService'...
Loading 'move_group/MoveGroupKinematicsService'...
Loading 'move_group/MoveGroupMoveAction'...
Loading 'move_group/MoveGroupPickPlaceAction'...
[ INFO] [1533133333.911268075]: Loading robot model 'panda'...
[ INFO] [1533133333.911336463]: No root/virtual joint specified in SRDF. Assuming fixed joint
Loading 'move_group/MoveGroupPlanService'...
Loading 'move_group/MoveGroupQueryPlannersService'...
Loading 'move_group/MoveGroupStateValidationService'...
[ INFO] [1533133334.262345647]: 

********************************************************
* MoveGroup using: 
*     - ApplyPlanningSceneService
*     - ClearOctomapService
*     - CartesianPathService
*     - ExecuteTrajectoryAction
*     - GetPlanningSceneService
*     - KinematicsService
*     - MoveAction
*     - PickPlaceAction
*     - MotionPlanService
*     - QueryPlannersService
*     - StateValidationService
********************************************************

[ INFO] [1533133334.262459784]: MoveGroup context using planning plugin ompl_interface/OMPLPlanner
[ INFO] [1533133334.262507076]: MoveGroup context initialization complete

You can start planning now!

[ INFO] [1533133334.464601255]: Loading robot model 'panda'...
[ INFO] [1533133334.464661115]: No root/virtual joint specified in SRDF. Assuming fixed joint
[ INFO] [1533133335.469617650]: Ready to take commands for planning group panda_arm.
[ INFO] [1533133335.482903448]: Combined planning and execution request received for MoveGroup action. Forwarding to planning and execution pipeline.
[ INFO] [1533133335.549885407]: Planning attempt 1 of at most 1
[ INFO] [1533133335.587650571]: Planner configuration 'panda_arm' will use planner 'geometric::RRTConnect'. Additional configuration parameters will be set when the planner is constructed.
[ INFO] [1533133335.720734117]: RRTConnect: Starting planning with 1 states already in datastructure
[ INFO] [1533133335.759458588]: RRTConnect: Created 4 states (2 start + 2 goal)
[ INFO] [1533133335.759551813]: Solution found in 0.039023 seconds
[ INFO] [1533133335.766524302]: SimpleSetup: Path simplification took 0.006710 seconds and changed from 3 to 2 states
[ INFO] [1533133335.788275683]: Returned 0 controllers in list
[ERROR] [1533133335.788402835]: Unable to identify any set of controllers that can actuate the specified joints: [ panda_joint1 panda_joint2 panda_joint3 panda_joint4 panda_joint5 panda_joint6 panda_joint7 ]
[ERROR] [1533133335.788452422]: Known controllers and their joints:

[ERROR] [1533133335.788538943]: Apparently trajectory initialization failed
[ INFO] [1533133335.789098467]: ABORTED: Solution found but controller failed during execution
================================================================================REQUIRED process [move_to_start-7] has died!
process has finished cleanly
log file: /home/ap/.ros/log/f343ac20-958b-11e8-ad71-009c021669e3/move_to_start-7*.log
Initiating shutdown!
================================================================================
[move_to_start-7] killing on exit
[move_group-6] killing on exit
[joint_state_publisher-4] killing on exit
[robot_state_publisher-3] killing on exit
[franka_control-1] killing on exit
[franka_control-1] escalating to SIGTERM
shutting down processing monitor...
... shutting down processing monitor complete
done

The almost exact same thing happens if I try to roslaunch franka_control franka_control.launch robot_ip:=172.16.1.101 load_gripper:=true, where the warning from Controller Spawner seems to kill the node.

I tried to rospack find controller_manager and I didn't find it, so I sudo apt-get install ros-kinetic-ros-control, but it didn't solve it. I checked both logs and they just repeat the same errors and warnings, I checked the launch files to try and see which things are not correctly loading and found where the exception from the warning is raised but the connected try block seems like it shouldn't give issues... Moreover, Panda has the external activation activated and the joints unlocked (worked for them). Environment variables seem on point, setup files are sourced.

Is it a namespace problem, where the controller is loaded but not where it should be? Or is it something else I don't yet understand? Should I build from source like suggested in this comment?

Hi,

please try sudo apt-get install ros-kinetic-controller-manager

"it's already the newest version (0.13.3-0xenial-20180426-054905-0800)"

Earlier this week I had a similar problem. From the dpkg log I can read that I installed ros-kinetic-moveit-commander that day. Maybe this can help you. Sadly I don't have access to the robot right now to test it myself.

Eh, unfortunately also this package is already up to date ("ros-kinetic-moveit-commander is already the newest version (0.9.12-1xenial-20180724-044717-0800)").

From looking into the code, I see controller_spawner is trying to load the controller with:

robot_namespace = ""   # weird?
load_controller_service = controller_manager/load_controller  # since robot namespace is empty...

and timing out on the line rospy.wait_for_service(...). So reading the launch file I see the problem lies in loading position_joint_trajectory_controller, which in fact I can't find yet. Trying a rosrun controller_manager load controller_manager position_joint_trajectory_controller gives the exact same result, so am I missing said controller?

Google doesn't give me that specific controller but it does show "joint_trajectory_controller" which means I tried sudo apt-get install ros-kinetic-ros-controllers and still failed.

Googling kind of helped when I tried with only "position_joint_trajectory_controller" and it resulted in giving me this thread. The thread discusses a similar problem but with a simulated environment. I feel like there's a link between my issue and theirs, but still, the launch file of move_to_start finds on its own the panda_moveit_config files, which means it should be finding (if I go find /opt/ros/ -name panda_moveit_config -print the result is only that one) the one from the moveit tutorials themselves, which is the same "updated" code @christgau is talking about.

EDIT: I never used the ones from the git repository, I realized, I used the ones from aptitude and the franka_ros package. Thus I tried to get the git panda_moveit_config and build it, the result is now if I execute roslaunch franka_example_controllers move_to_start.launch robot_ip:=172.16.2.101 I get the error unused args [load_gripper] for include of [/home/ap/catkin_ws/src/panda_moveit_config/launch/move_group.launch], meaning the current launch file is incompatible without edits.

I deleted the catkin_ws and made a brand new one to build from source both libfranka and franka_ros as per this. After also updating the firmware of the controller (I built libfranka version 3, controller was on 2), it seems like the problem with the controller seems solved, but Panda still doesn't want to move and gives me this error:

[ERROR] [1533279509.105783637]: libfranka: Move command rejected: command not possible in the current mode!

I tried already the normal procedure to no avail. After the error, MoveGroups are started, and then:

You can start planning now!

[ INFO] [1533280073.533212783]: Loading robot model 'panda'...
[ INFO] [1533280073.533296439]: No root/virtual joint specified in SRDF. Assuming fixed joint
[ INFO] [1533280073.664434618]: Loading robot model 'panda'...
[ INFO] [1533280073.664499532]: No root/virtual joint specified in SRDF. Assuming fixed joint
[ INFO] [1533280074.794115132]: Ready to take commands for planning group panda_arm.
[ INFO] [1533280074.811005253]: Combined planning and execution request received for MoveGroup action. Forwarding to planning and execution pipeline.
[ INFO] [1533280074.811424363]: Planning attempt 1 of at most 1
[ INFO] [1533280074.814041853]: Planner configuration 'panda_arm' will use planner 'geometric::RRTConnect'. Additional configuration parameters will be set when the planner is constructed.
[ INFO] [1533280074.815060771]: RRTConnect: Starting planning with 1 states already in datastructure
[ INFO] [1533280074.827086437]: RRTConnect: Created 5 states (2 start + 3 goal)
[ INFO] [1533280074.827190102]: Solution found in 0.012383 seconds
[ INFO] [1533280074.836727831]: SimpleSetup: Path simplification took 0.009322 seconds and changed from 4 to 2 states
> [ WARN] [1533280074.845116910]: Dropping first 1 trajectory point(s) out of 10, as they occur before the current time.
> First valid point will be reached in 0.428s.
> [ WARN] [1533280077.196332434]: Controller position_joint_trajectory_controller failed with error code GOAL_TOLERANCE_VIOLATED
> [ WARN] [1533280077.196555486]: Controller handle position_joint_trajectory_controller reports status ABORTED
[ INFO] [1533280077.277483551]: Completed trajectory execution with status ABORTED ...
[ INFO] [1533280077.439731474]: ABORTED: Solution found but controller failed during execution
================================================================================REQUIRED process [move_to_start-7] has died!
process has finished cleanly
log file: /home/ap/.ros/log/90eae6c8-96ea-11e8-b6c2-009c021669e3/move_to_start-7*.log
Initiating shutdown!
================================================================================
[move_to_start-7] killing on exit
[move_group-6] killing on exit
[controller_spawner-5] killing on exit
[joint_state_publisher-4] killing on exit
[robot_state_publisher-3] killing on exit
[state_controller_spawner-2] killing on exit
[franka_control-1] killing on exit
[INFO] [1533280078.606341]: Shutting down spawner. Stopping and unloading controllers...
[INFO] [1533280078.607032]: Stopping all controllers...
[INFO] [1533280078.608018]: Shutting down spawner. Stopping and unloading controllers...
[INFO] [1533280078.609073]: Stopping all controllers...
[WARN] [1533280078.966768]: Controller Spawner error while taking down controllers: unable to connect to service: connection from sender terminated before handshake header received. 0 bytes were received. Please check sender for additional details.
[WARN] [1533280078.985987]: Controller Spawner error while taking down controllers: unable to connect to service: [Errno 104] Connection reset by peer
shutting down processing monitor...
... shutting down processing monitor complete
done

So it looked like there's a slow connection or something. But the network load test returns:

--- 172.16.2.101 ping statistics ---
10000 packets transmitted, 10000 received, 0% packet loss, time 10068ms
rtt min/avg/max/mdev = 0.226/0.492/0.666/0.047 ms

...which means I'm under the milli and it should be fine? CPU is on performance mode as well.

EDIT: I ran echo_robot_state from libfranka examples since I now have it available and, after re-publishing

rostopic pub -1 /franka_control/error_recovery/goal franka_control/ErrorRecoveryActionGoal "{}"

I get :

{
"O_T_EE": [0.992893,0.0871198,-0.0809555,0,0.0871759,-0.996179,-0.00284745,0,-0.0808958,-0.00423023,-0.996714,0,0.524795,0.00722957,0.294793,1], 
"O_T_EE_d": [0.992913,0.0874344,-0.0803735,0,0.0874943,-0.996151,-0.00278373,0,-0.0803091,-0.00426831,-0.996761,0,0.524888,0.00738594,0.295042,1], 
"F_T_EE": [0.7071,-0.7071,0,0,0.7071,0.7071,0,0,0,0,1,0,0,0,0.1034,1], 
"EE_T_K": [1,0,0,0,0,1,0,0,0,0,1,0,0,0,0,1], 
"m_ee": 0.73, 
"F_x_Cee": [-0.01,0,0.03], 
"I_ee": [0.001,0,0,0,0.0025,0,0,0,0.0017], 
"m_load": 0, 
"F_x_Cload": [0,0,0], 
"I_load": [0,0,0,0,0,0,0,0,0], 
"m_total": 0.73, 
"F_x_Ctotal": [-0.01,0,0.03], 
"I_total": [0.001,0,0,0,0.0025,0,0,0,0.0017], 
"elbow": [0.008815,-1], 
"elbow_d": [0.0089711,-1], 
"tau_J": [0.397578,-29.7917,0.315179,22.0217,0.493995,2.21358,-0.0449972], 
"tau_J_d": [0,0,0,0,0,0,0], 
"dtau_J": [53.9998,-87.6941,27.9825,-21.1357,8.50055,8.72733,42.5273], 
"q": [0.00593191,0.0106994,0.008815,-2.1266,-0.0035399,2.05626,0.714524], 
"dq": [0.000317997,0.00111618,-0.000285925,0.000521088,-0.000269897,-0.000209827,-0.00152636], 
"q_d": [0.00607774,0.0103849,0.0089711,-2.12642,-0.00356418,2.05635,0.714521], 
"dq_d": [0,0,0,0,0,0,0], 
"joint_contact": [0,0,0,0,0,0,0], 
"cartesian_contact": [0,0,0,0,0,0], 
"joint_collision": [0,0,0,0,0,0,0], 
"cartesian_collision": [0,0,0,0,0,0], 
"tau_ext_hat_filtered": [0.336855,-0.339606,0.224505,0.304036,-0.158596,0.0804715,-0.0700333], 
"O_F_ext_hat_K": [0.047224,0.373813,0.587859,-0.303382,-0.324877,0.281773], 
"K_F_ext_hat_K": [0.0318651,-0.369949,-0.591328,-0.205633,0.0127268,-0.0695569], 
"O_dP_EE_d": [0,0,0,0,0,0], 
"theta": [0.00596015,0.00860947,0.00883637,-2.12505,-0.00348501,2.0565,0.714519], 
"dtheta": [0,0,0,0,0,0,0], 
"current_errors": [], 
"last_motion_errors": ["communication_constraints_violation"], 
"control_command_success_rate": 0, 
"robot_mode": "Reflex", 
"time": 2976499}

Hi,

happy to see that building from source solved some problems.

Your robot seems to be in an error state and stopped due to a reflex (e.g. self collision). If you can't recover the error via /franka_control/error_recovery/ please try it via the Franka desk UI.

Most errors can be solved with calling /franka_control/error_recovery/goal. From my experience there are some errors where the web interface is needed for resolving them, like severe self collisions.

Hi Chris,

I am in fact continually trying to recover from errors, with

rostopic pub -1 /franka_control/error_recovery/goal franka_control/ErrorRecoveryActionGoal "{}"

and I get in return (right after executing the command)

[ INFO] [1533290850.740965283]: Recovered from error
[ERROR] [1533290851.074941978]: libfranka: Move command aborted: motion aborted by reflex! ["communication_constraints_violation"]
control_command_success_rate: 0.43

I did change some cables (upgraded to a Cat. 6 ethernet cable between pc and control), and the result is that now it's slightly moving but immediately stopping due to a Cartesian Reflex.

Desk is not letting me recover from errors sometimes, and it is losing the gripper after some of the tests I'm doing, so I have to shutdown and restart at the moment. This happens frequently with a moveit example using rviz. It would help if there was a js command that forced Desk to perform error recovery even if it doesn't seem available.

By the way, no self collisions are happening, or collisions at all for that matter: the arm is barely moving at all.

For example, right now this happened:

everything is completely shut down;
I start the controller unit so the arm boots up;
I unlock the brakes and open the external activation device afterwards;

I open 5 terminal tabs, and I launch these 5 commands one per terminal:

roscore
roslaunch franka_control franka_control.launch robot_ip:=172.16.2.101 load_gripper:=true
roslaunch franka_gripper franka_gripper.launch robot_ip:=172.16.2.101
roslaunch panda_moveit_config panda_moveit.launch controller:=position load_gripper:=true
roslaunch panda_moveit_config moveit_rviz.launch

everything launches without warnings or errors at all;
10-15 seconds later, without me doing anything I hear an "error recovery" sound, I go check Desk and the gripper's light became red, and the franka_control terminal showed [ERROR] [1533292611.025196268]: libfranka: Move command aborted: motion aborted by reflex! ["communication_constraints_violation"] control_command_success_rate: 0.42;
now I have to restart everything again from scratch, might as well reboot the pc this time.

Apart from this crazy event though, it seems like progress is being made, but I think I'm missing something crucial.

EDIT: As crazy as that sounds, this happens consistently even after rebooting. I had not noticed but the rviz tab is also warning me that: [ WARN] [1533295604.046500154]: Interactive marker 'EE:goal_panda_hand' contains unnormalized quaternions. This warning will only be output once but may be true for others; enable DEBUG messages for ros.rviz.quaternions to see more details.

I tried to call the service to set_logger_level "logger: 'ros.rviz.quaternions' level: 'debug'" and the get_loggers shows it's set correctly, but nothing shows up in the terminal tab.

The unnormalized quaternions aren't the problem. The communication_constraints_violation is.

Can you move the robot arm and use the hand via Desk?

Also, lets try moveit without the arm

roscore
roslaunch franka_control franka_control.launch robot_ip:=172.16.2.101 load_gripper:=false
roslaunch panda_moveit_config panda_moveit.launch controller:=position load_gripper:=false
roslaunch panda_moveit_config moveit_rviz.launch

Yes I can, I made a simple pick and place task that I run every time I change the hardware configuration, and so far no issues there.

MoveIt without the arm doesn't load without giving me the usual [ERROR] [1533301052.932503680]: libfranka: Move command aborted: motion aborted by reflex! ["communication_constraints_violation"] control_command_success_rate: 0.28 and rviz the warning on the unnormalized quaternions (on link8 this time...makes sense since no gripper was loaded).

While the warning happens while loading rviz, the error happens systematically but after a random amount of time: the milli threshold is crossed when all nodes are up but not necessarily right away.

In regards to the suggestions given by Franka Emika, I am 100% on N°1, 2, 3 (actually no, see next paragraph), and 5. N° 4 seems more like a statement than a thing I can change, I don't get it. For N°6 I tried robot.setFilters(50, 50, 50, 50, 50) right after robot is instantiated in franka_control_node, and it lead to execution errors. I don't really know where I would edit in the filters thresholds since I haven't yet written my own package and I'm just trying the examples.

I retried the bandwidth test with all the nodes active and the issue is right there:

--- 172.16.2.101 ping statistics ---
10000 packets transmitted, 10000 received, 0% packet loss, time 10251ms
rtt min/avg/max/mdev = 0.115/0.357/2.499/0.071 ms

This is after a cable change, so that's the reason why the min and avg are lowered. The max with all nodes up and running is significantly higher. I am however out of options, am I not? Is there something I'm missing?

Some parts of libfranka are running with 1kHz and need to communicate with the FCI, therefore you need to be below 1ms rtt.

Check if you are running a realtime linux kernel: uname -r should show rt as part of the output, e.g. 4.13.12-rt5
disable CPU frequency scaling
always connect your workstation PC directly to the FCI (without switches, routers, ... in between)

If you're saying that with just the arm ready to move (e.g. brakes open ext.act. open, ready in Desk) and nothing else going on, I need a max rtt<1ms, I do have it:

$ sudo ping 172.16.2.101 -i 0.001 -D -c 10000 -s 1200

[...]

--- 172.16.2.101 ping statistics ---
10000 packets transmitted, 10000 received, 0% packet loss, time 10017ms
rtt min/avg/max/mdev = 0.150/0.412/0.543/0.047 ms

and then:

$ uname -r
4.14.12-rt10
$ cpufreq-info -p
933000 2533000 performance

Finally I only have a 2m cat. 6 RJ-45 cable separating this pc and 172.16.2.101.

Where else could the problem be? I will check if my issue has become a duplicate of this issue or this issue), otherwise I welcome any other suggestion.

The 1ms constraints applies to more than just the RRT, you'll find more details here. The results of the Network bandwidth, delay and jitter test should not be significantly higher than rtt min/avg/max/mdev = 0.147/0.240/0.502/0.038 ms. What type of network card are you using? I've experienced a similar high avg ping time with an USB NIC.

Please try out the ping-test with an other PC and post the results.

First, some data about my network setup:

$ lspci | awk '/[Nn]et/ {print $1}' | xargs -i% lspci -ks %
01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
    DeviceName: Hanksville Gbe Lan Connection
    Subsystem: Hewlett-Packard Company RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
    Kernel driver in use: r8169
    Kernel modules: r8169
02:00.0 Network controller: Qualcomm Atheros AR9285 Wireless Network Adapter (PCI-Express) (rev 01)
    DeviceName: Atheros AR3011 Bluetooth 2.1+EDR  Adapter 
    Subsystem: Hewlett-Packard Company AR9285 Wireless Network Adapter (PCI-Express)
    Kernel driver in use: ath9k
    Kernel modules: ath9k

Ok, so the 1ms constraint is relative to the sum of: 1) Round trip time (RTT) between the workstation PC and FCI; 2) Execution time of your motion generator or control loop; 3) Time needed by the robot to process your data and step the internal controller.

Regarding 1, my worst RTT seems to be 0.543 ms. The average trip is much higher than yours, but the rest of the data doesn't seem too different. Let's see the rest. For 2, I see a [ INFO] [1533539326.993379637]: Solution found in 0.009804 seconds and a [ INFO] [1533539326.999867420]: SimpleSetup: Path simplification took 0.006222 seconds and changed from 4 to 2 states, so that seems to be already a problem. Moreover, the combo with 3 is also killing it, as the warning says [ WARN] [1533539327.026003833]: Dropping first 1 trajectory point(s) out of 10, as they occur before the current time. First valid point will be reached in 0.427s.

This puts the sum at 443,569 ms, if I'm not mistaken! My network setup is fine, it's the CPU that is much too slow in calculating a solution. Could you confirm this is the bottleneck and I didn't make mistakes in my assumptions and calculations? Could you maybe show me your output of these 3 lines, to confirm I need to buy and setup a more powerful pc?

EDIT: My CPU stats:

$ cat /proc/cpuinfo  | grep 'name'| uniq
model name  : Intel(R) Core(TM) i3 CPU       M 380  @ 2.53GHz
$ cat /proc/cpuinfo  | grep process| wc -l
4

Trying to understand better the problem, I find I'm not getting unreasonable calculation times (point 2 of your 1ms list):

this comment says 15 ms or 400 ms depending on the platform used;
this comment shows 30ms to 2000-15000ms planning times depending on if he uses a simulation or a real robot;
this presentation's slide 25 shows average times always above 500ms;
this video shows at 10:12 the solving time to be 54.227ms and simplification time being 7.439ms, and again at 10:53 solving takes 168.032ms and simplifying takes 129.958ms.

With the caveat of these being at least 1 year old, I'm bound to think I'm looking at the wrong numbers or misunderstanding something. I also found a real time motion planning algorithm from 2 years ago: is a real-time motion planning the one I'm supposed to use? But then it would not be visualizable in RViz/deal with obstacles (source), which the launch file is launching/the other simulation is planning around.

From my understanding, the time that MoveIt takes to create a motion plan does not count into the 1ms constraint mentioned before. The calculated (and simplified) plan is executed by the actual robot controller (or fake controller in case of simulation) and this component along with the involved network must not violate the 1ms constraint. Looking at the article and the graphics in https://frankaemika.github.io/docs/libfranka.html note that they deal with libfranka and MoveIt/franka_ros is only shown in the graphics for sake of a better understanding.

I see what you mean, I can definitely disregard the 9.804+6.222ms. Do we have any means to see the time the controller is taking to execute the plan? It seems the 427ms could be a combination of 2+3, by what the warning says. Any ways I can act on that that I'm not seeing through node/topic/service exploration?

I am not aware of any means for instrumenting controllers. If you built everything from source you could enable debug symbols and attach perf to the appropriate processes and check what's going on. In addition you could add your own profiling code. But be aware: Adding code, especially I/O stuff (like printfs), can easily break the hard realtime requirements.

Anyways, when having access to the robot I do not observe the problems with the execution of trajectories as you do - at least after the robot configuration has been fixed;-)

Franka Emika updated yesterday the libfranka version. After rebuilding and running the libfranka/examples/communication_test.cpp it seems clear (to franka emika support) that I need to upgrade my notebook's ethernet NIC as it seems my Realtek one is the fault. I'll update the issue once I have a solution in this sense.

The hardware upgrade worked for the communications test, which now starts, (which is already an improvement) and returns:

Finished test, shutting down example

#######################################################
The control loop did not get executed 35 times in the
last 10000 milliseconds! (lost 35 robot states)

Control command success rate of 9965 samples: 
Max: 1.00
Avg: 0.97
Min: 0.81
#######################################################

However the lost states are never 0, and I found they range between 35 and 194 (based on some runs in 2 days).

Moreover, trying roslaunch franka_example_controllers move_to_start.launch robot_ip:=172.16.2.101 load_gripper:=false doesn't work and, again, gives problems with the controllers.

EDIT: added the error

[INFO] [1536678512.176718]: Controller Spawner: Waiting for service controller_manager/load_controller
[INFO] [1536678512.269210]: Controller Spawner: Waiting for service controller_manager/load_controller
[ WARN] [1536678517.079633385]: Waiting for position_joint_trajectory_controller/follow_joint_trajectory to come up
[ WARN] [1536678523.079815712]: Waiting for position_joint_trajectory_controller/follow_joint_trajectory to come up
[ERROR] [1536678529.080127158]: Action client not connected: position_joint_trajectory_controller/follow_joint_trajectory

I went directly with a source install of libfranka and franka_ros, and I installed everything I had installed following my own comments (ros-kinetic-ros-controllers, ros-kinetic-moveit-commander, ros-kinetic-moveit, ros-kinetic-ros-control, ros-kinetic-joint-state-controller, ros-kinetic-controller-manager, and even ros-kinetic-gazebo-ros-pkgs, ros-kinetic-gazebo-ros-control and ros-kinetic-control-msgs).

The only different thing was PyAssimp again was giving troubles but eventually it seemed to work in the same way as last time. This time I had it installed with pip (so I have version 4.1.3) but sudo apt-get install ros-kinetic-moveit-commander was still complaining and installing it via aptitude, so I patched that one (as per this) (Actually while writing the post I went and looked in the old pc and the configuration is the same, with both versions installed, except the one in the old pc isn't patched)

I searched the old pc for potential differences and installed all the packages that were installed over there. The only remaining thing is I have a newer kernel: I had 4.14.12, I have 4.16.18, and this is due to the fact that I need compatibility with the Intel RealSense D435, which doesn't support 4.14.xx; therefore I looked in the patchable kernels in the list and installed a patchable one that is also supported by intel (4.[4,8,10,13,15,16]] from here)

Could it be a namespace problem when the controller spawn? Could it be a kernel issue? Do you know for a fact of other kernels that work with libfranka and are in the intel list?

I can confirm it was the kernel: today I built 4.14.12 (the one you use as default in the installation instructions and I tried both the communication_test and the move_to_start.launch:

the communication_test yields much better results, with 19 to 42 lost states in four tests I ran.
move_to_start worked perfectly with only a couple warnings:
```
You can start planning now!
```

[ INFO] [1536744422.884976038]: Loading robot model 'panda'... [ INFO] [1536744422.885035751]: No root/virtual joint specified in SRDF. Assuming fixed joint [ INFO] [1536744422.950890060]: Loading robot model 'panda'... [ INFO] [1536744422.950962544]: No root/virtual joint specified in SRDF. Assuming fixed joint [ INFO] [1536744424.065693131]: Ready to take commands for planning group panda_arm. [ INFO] [1536744424.092800395]: Combined planning and execution request received for MoveGroup action. Forwarding to planning and execution pipeline. [ INFO] [1536744424.093080716]: Planning attempt 1 of at most 1 [ INFO] [1536744424.094276926]: Planner configuration 'panda_arm' will use planner 'geometric::RRTConnect'. Additional configuration parameters will be set when the planner is constructed. [ INFO] [1536744424.094979827]: RRTConnect: Starting planning with 1 states already in datastructure [ INFO] [1536744424.106195303]: RRTConnect: Created 4 states (2 start + 2 goal) [ INFO] [1536744424.106227555]: Solution found in 0.011566 seconds [ INFO] [1536744424.108601791]: SimpleSetup: Path simplification took 0.002272 seconds and changed from 3 to 2 states [ WARN] [1536744424.126490107]: Dropping first 1 trajectory point(s) out of 10, as they occur before the current time. First valid point will be reached in 0.345s. [ INFO] [1536744425.759154881]: Completed trajectory execution with status SUCCEEDED ... ================================================================================REQUIRED process [move_to_start-8] has died! process has finished cleanly log file: /home/ap/.ros/log/3be1fc3c-b66d-11e8-b60a-d45ddf1a2746/move_to_start-8*.log Initiating shutdown!

[move_to_start-8] killing on exit [move_group-7] killing on exit [controller_spawner-6] killing on exit [joint_state_desired_publisher-5] killing on exit [joint_state_publisher-4] killing on exit [robot_state_publisher-3] killing on exit [state_controller_spawner-2] killing on exit [franka_control-1] killing on exit [INFO] [1536744426.971122]: Shutting down spawner. Stopping and unloading controllers... [INFO] [1536744426.971438]: Stopping all controllers... [INFO] [1536744426.972181]: Shutting down spawner. Stopping and unloading controllers... [INFO] [1536744426.972547]: Stopping all controllers... [WARN] [1536744427.133722]: Controller Spawner error while taking down controllers: unable to connect to service: [Errno 104] Connection reset by peer [WARN] [1536744427.134245]: Controller Spawner error while taking down controllers: unable to connect to service: [Errno 104] Connection reset by peer shutting down processing monitor... ... shutting down processing monitor complete done

Seems to me that I can safely ignore them, but feel free to tell if this is not the case.

Thank you and support for your help in trying to figure out the issue was due to a necessary hardware upgrade.

I didn't know kernel choice could be an issue, and at this point I reiterate my interest in knowing your experience with linux kernels and if you know of older compatible kernels or if you previously had discovered which options can configure it for optimal use beside the "> 5. Fully Preemptible Kernel (RT) (PREEMPT_RT_FULL) (NEW)".

It could be beneficial to understand/test which kernels actually work since, apart from the default one from your instructions, users are presented with a rather wide choice of preemptible kernels on the website, and nothing is currently stopping them from choosing another one and not having a working build afterwards. Your "We recommend choosing the version closest to the one you currently use." might actually be misleading.

I'd edit installation.rst:

First, install the necessary dependencies:
apt-get install build-essential bc curl ca-certificates fakeroot gnupg2 libssl-dev lsb-release

apt-get install build-essential bc curl ca-certificates fakeroot gnupg2 libssl-dev lsb-release bison flex

since bison and flex were always needed when building the kernel in my experience.

I'd add a disclaimer about the choice of the kernel or list the kernels actually tested to be compatible, right under this dependencies section.

Moreover franka_example_controllers (or franka_control) should have an exec_depend for pyassimp, ros-kinetic-ros-controllers, ros-kinetic-ros-control and maybe ros-kinetic-moveit-commander so that both with apt-get install franka_ros and via rosdep after installing franka_ros from source, the package works out of the box. The thing with pyassimp is, theoretically ros-kinetic-panda-moveit-config already depends (both in apt and in package.xml) on ros-kinetic-moveit-commander so pyassimp should already be there...except it wasn't.

@aPonza wrote:

Moreover franka_example_controllers (or franka_control) should have an exec_depend for pyassimp, ros-kinetic-ros-controllers, ros-kinetic-ros-control

Actually, ros-kinetic-ros-controllers and ros-kinetic-ros-control are metapackages, and those should not be used for dependency declaration (as you'll depend only transitivily on the pkgs that you actually need, and if the metapackage ever changes, your dependency declaration will be broken).

To update the thread with the (pretty unscientific) testing I have done:

4.4.148 doesn't support my wifi card (8086:24fd is supported from 4.6 onward). Prior to realizing this, I ran the communication test and it gave the best results out of all the kernels I tried with consistently low packet losses (between 11 and 38 in 10 runs), consistently high mins (might have been luck, but no min was under 0.90). I also had a bugged run which I'll address in a separate issue. I didn't go further to test librealsense with it.
4.8.15 gave the second best results, with packet losses between 7 and 50, just once a .48 min but other than that consistently high). It is a viable option to work with both panda and the realsense stereo cameras.
4.13.13 gave the fourth best results, with packet losses between 20 and 42, high averages, but frequent runs with a low min (lowest was .39 which triggered the warning on the setup). It is technically a viable option to work with both panda and a realsense stereo camera.
4.16.18 actually works...no idea what I installed/modified but it works and gives the third best results (6 to 65, lowest min was a .62). It is technically a viable option to work with both panda and a realsense stereo camera.

I think I will use 4.8, as it's a significant enough improvement over 4.16 with respect to the consistency of high averages and mins.

Apart from the minor edits I suggested in the very end of my last comment, this can be closed.

frankaemika / franka_ros

Launching franka_example_controllers examples (e.g. move_to_start.launch) #23