yujinrobot / kobuki_desktop

Visualisation and simulation tools for Kobuki
http://www.ros.org/wiki/kobuki_desktop
37 stars 58 forks source link

Add update_rate parameter for the gazebo_ros_kobuki plugin #61

Closed meyerj closed 4 years ago

meyerj commented 4 years ago

By default Gazebo runs the simulation at a rate of 1000Hz and also executes the update callback of all plugins in each cycle. Without any rate control within the plugin code, all publishers of the gazebo_ros_kobuki have been invoked at the full simulation rate. This does not only cause high CPU load for the simulation itself, but especially for each single TransformListener instance that subscribes to tf, given that publish_tf is enabled.

The real-world Kobuki platform publishes wheel odometry at 20 Hz, a much more reasonable publishing rate. I did not verify, but I assume other sensors (IMU, bumpers, cliff sensors) are processed synchronously. The update rate also affects how fast velocity commands are applied for the simulation, which keeps running at the full rate. So the patch introduces an additional worst-case latency of 1 / update_rate.

With update_rate set to 0 (or if the parameter is not present in the plugin configuration) the behavior is the same as before this patch.

Many plugins in package gazebo_plugins also implement rate control and some code snippets have been copied from there (e.g. gazebo_ros_p3d.cpp).

Some comparative results

tf's TransformListener (or the underlying Buffer) is a CPU burner. It builds the transform tree in each listener and keeps it up-to-date with each update, even though most listeners are only ever interested in a single transform (a single path in the tree), which might even be static.

The buffer_server executable in package tf2_ros is basically a single TransformListener and Buffer combined with an action server to lookup transforms. As long as no action client uses it, its CPU usage is representative for a single CPU and therefore it can serve as a benchmark. I also quickly checked with a minimal node that instantiates a TransformListener and the results are similar.

Tests have been done on an Intel(R) Core(TM) i7-6820HQ CPU @ 2.70GHz, with the Kobuki Gazebo simulation running in the background and the latest ROS melodic release (roscpp 1.14.5).

No active tf publisher: => 4% CPU

$ timeout -s INT 10 time --verbose /opt/ros/melodic/lib/tf2_ros/buffer_server
        Command being timed: "/opt/ros/melodic/lib/tf2_ros/buffer_server"
        User time (seconds): 0.34
        System time (seconds): 0.07
        Percent of CPU this job got: 4%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:10.14
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 10600
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 488
        Voluntary context switches: 4520
        Involuntary context switches: 4
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

Even a trivial node roscpp without any publishers, subscribers or services and a single spinner consumes approx. 3-4% CPU (background threads?).

Publishing odom tf at 1000 Hz plus wheel joint states tf at 30 Hz (update_rate = 0) => 30% CPU

30 Hz is the default maximum publish_frequency of robot_state_publisher, even though joint states might come in faster than that. The robot_state_publisher itself also consumes a considerable amount of CPU.

$ timeout -s INT 10 time --verbose /opt/ros/melodic/lib/tf2_ros/buffer_server
        Command being timed: "/opt/ros/melodic/lib/tf2_ros/buffer_server"
        User time (seconds): 1.63
        System time (seconds): 1.40
        Percent of CPU this job got: 29%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:10.17
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 11432
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 726
        Voluntary context switches: 49819
        Involuntary context switches: 1492
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

Almost 30% CPU load per TransformListener that does nothing! That is a lot for a real application.

Publishing odom tf at 50 Hz, plus wheel joint states tf at 25 Hz (update_rate = 50) => 20% CPU

$ timeout -s INT 10 time --verbose /opt/ros/melodic/lib/tf2_ros/buffer_server
        Command being timed: "/opt/ros/melodic/lib/tf2_ros/buffer_server"
        User time (seconds): 1.08
        System time (seconds): 0.97
        Percent of CPU this job got: 20%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:10.10
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 10508
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 528
        Voluntary context switches: 32631
        Involuntary context switches: 1027
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

Not good, but at least we are down to 20% CPU usage...

Publishing odom and wheel joint states tf at 20 Hz (update_rate = 20) => 18% CPU

$ timeout -s INT 10 time --verbose /opt/ros/melodic/lib/tf2_ros/buffer_server
        Command being timed: "/opt/ros/melodic/lib/tf2_ros/buffer_server"
        User time (seconds): 0.94
        System time (seconds): 0.88
        Percent of CPU this job got: 18%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:10.11
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 10704
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 523
        Voluntary context switches: 32331
        Involuntary context switches: 1425
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
corot commented 4 years ago

Nice!