frankaemika / libfranka

C++ library for Franka research robots
https://frankaemika.github.io
Apache License 2.0
221 stars 147 forks source link

Timing (per wireshark) indicates that FCI comms may have some unexpected jitter? #120

Open EricCousineau-TRI opened 1 year ago

EricCousineau-TRI commented 1 year ago

See comment below, https://github.com/frankaemika/libfranka/issues/120#issuecomment-1206789601, for updated results.


While doing some timing, I noticed there was some odd jitter from libfranka when using Robot::read() and Robot::control(). When digging in just a bit, it looked like most of the time was spent reading status during RobotImpl::receiveRobotState, more specifically in Network::udpBlockingReceive. I noticed same results with and without running a RT_PREEMPT-patched kernel.

I checked the ping rate per troubleshooting docs, and it seemed fine:

sudo ping <ip> -i 0.001 -D -c 10000 -s 1200
...
--- <ip> ping statistics ---
10000 packets transmitted, 10000 received, 0% packet loss, time 9999ms
rtt min/avg/max/mdev = 0.074/0.114/0.214/0.009 ms

We then inspected the "raw" UDP receive timing with Wireshark when running Robot::read(), and noticed the same large amount of jitter. (The following is on Ubuntu 20.04, not on a RT_PREEMPT machine - we do not believe it matters, as is shown below) image (^^ real robot)

We have tested this on 3 arms - one arm running System Version 3.2.0, and two other arms running System Version 4.2.1. We see about the same behavior.

Then, we tried using a simple read_robot_state on host + RobotMockServer on a separate machine using this code: https://github.com/EricCousineau-TRI/libfranka/commit/ba30c2cd6b0 The only difference between the tests is that we unplugged the cable from the Panda control box and plugged it into the separate machine (w/ network config to match the Panda), and then reran the testing, and we got a much "nicer" looking plot: image (^^ RobotMockServer on separate machine)

This seems very odd.

Is there someway to configure the arm to reduce this jitter? Is there something else we're missing? Or in the off chance of a small bug, is there a patch that can be released to fix this?


For Wireshark, we identified network interface for the related arm, ran the libfranka driver, then captures about 10s worth of traffic, and exported it using File > Export Packet Dissections > As CSV... Then we plotted it using simple code like this:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

timing_file = "/path/to/data.csv"
df = pd.read_csv(timing_file)
udp = df["Protocol"] == "UDP"
ts = np.asarray(df[udp]["Time"])
ts = ts[ts <= 10.0]
plt.plot(ts[1:], np.diff(ts))
plt.ylabel("dt (sec)")
plt.xlabel("time (sec)")
plt.xlim(0, 10.0);
plt.savefig(f"{timing_file}.png")

Both machines are running Ubuntu 20.04, and not using RT_PREEMPT. As shown above, we do not believe RT_PREEMPT is necessary on either machine to show this discrepancy. Consider using taskset and chrt -r 20 with approrpriate permissions for both driver and fake server.

EDIT: For completeness, though, will try to reproduce above experiment on RT_PREEMPT kernel, with actual minimal code.

EricCousineau-TRI commented 1 year ago

First codified the experiment procedure here: https://github.com/EricCousineau-TRI/franka_panda_udp_timing/tree/7a5dd626cee49b2842c38b3f0ff664396d764183

Then ran with three machine setups (localhost, soft, realtime) x two "robot" setups (robot_fake, robot_real). see YAML files in repo for full explanation

These are still the results: image

jc211 commented 8 months ago

@EricCousineau-TRI Were you able to find any way of reducing the jitter?