ros2 / rcl_logging

Logging implementations for ROS 2.
Apache License 2.0
22 stars 35 forks source link

change flushing behavior for spdlog log files, and add env var to use old style (no explicit flushing) #95

Closed wjwwood closed 2 years ago

wjwwood commented 2 years ago

While working on a project we noticed some applications have missing log messages or no log file at all when it crashes. This isn't a good experience for users when debugging, so this pr changes the flushing behavior of spdlog to flush every five seconds as well as anytime the error log level is used.

A new environment variable is introduced to allow users to opt-out of this back to the old behavior (no explicit configuration) and is meant to be temporary until we support full config files (planned work).

The new env var is called RCL_LOGGING_SPDLOG_EXPERIMENTAL_OLD_FLUSHING_BEHAVIOR and setting it to 1 will get you the old behavior. I'm very open to other names, but the "EXPERIMENTAL" is in there in part to indicate we would like to remove this in the future in favor of the config files, i.e. this is experimental and you cannot rely on it in the next version of ROS.

wjwwood commented 2 years ago

I also ran the (micro)benchmark tests with and without this change, but didn't see a difference:

william@markook-ubuntu-2204:~/ros2_ws$ ./build/rcl_logging_spdlog/benchmark_logging_interface
2022-11-08T18:00:15-08:00
Running ./build/rcl_logging_spdlog/benchmark_logging_interface
Run on (8 X 48 MHz CPU s)
Load Average: 0.05, 0.20, 0.27
***WARNING*** Library was built as DEBUG. Timings may be affected.
--------------------------------------------------------------------------------------
Benchmark                                            Time             CPU   Iterations
--------------------------------------------------------------------------------------
LoggingBenchmarkPerformance/log_level_hit         2486 ns         2485 ns       385499
LoggingBenchmarkPerformance/log_level_miss        79.4 ns         79.4 ns      8576005
PerformanceTest/logging_reinitialize              22.7 ns         22.7 ns     30958001
PerformanceTest/logging_initialize_shutdown      82145 ns        57918 ns        11809

<rebuilt on rolling here>

william@markook-ubuntu-2204:~/ros2_ws$ ./build/rcl_logging_spdlog/benchmark_logging_interface
2022-11-08T18:06:02-08:00
Running ./build/rcl_logging_spdlog/benchmark_logging_interface
Run on (8 X 48 MHz CPU s)
Load Average: 0.00, 0.07, 0.18
***WARNING*** Library was built as DEBUG. Timings may be affected.
--------------------------------------------------------------------------------------
Benchmark                                            Time             CPU   Iterations
--------------------------------------------------------------------------------------
LoggingBenchmarkPerformance/log_level_hit         2626 ns         2625 ns       293706
LoggingBenchmarkPerformance/log_level_miss        80.6 ns         80.6 ns      8780122
PerformanceTest/logging_reinitialize              23.0 ns         23.0 ns     30313922
PerformanceTest/logging_initialize_shutdown      13809 ns        13773 ns        51663

Though it was with a debug build and I'm not 100% sure the benchmarks cover all cases, or cases where this would impact performance (rapid error messages being logged).

wjwwood commented 2 years ago

CI:

wjwwood commented 2 years ago

Working on the doc changes (release notes, etc).

clalancette commented 2 years ago

With the latest changes in place, this is functioning like I expect. So I decided to do some testing.

In particular, I went back and repeated the test where I modified the talker to publish and print every 1 millisecond.

To start with, I wanted to measure the overhead of running the timer and publisher at this frequency, without any overhead from the RCLCPP_INFO call. So I ran the talker like this:

ros2 run demo_nodes_cpp talker --ros-args --disable-rosout-logs --disable-stdout-logs --disable-external-lib-logs

(that is, disable output to /rosout, stdout, and the external library which is spdlog in this case). With that setup, the talker takes about 38% CPU.

I then reenabled just the external library (the spdlog logger), by running:

ros2 run demo_nodes_cpp talker --ros-args --disable-rosout-logs --disable-stdout-logs

Now the talker is taking about 41% CPU.

Next I went back and checked out rcl_logging before this PR. I ran the same experiment without this PR in place, and the numbers I got were 39% CPU for no output, and 42% CPU with just the external library output enabled.

I should note that all of this was on my Raspberry Pi 4 with a slow SD card.

Thus, I'm more confident that this change should have minimal overhead. As far as I'm concerned, this is good to go in once @scpeters concerns have been addressed.

wjwwood commented 2 years ago

Thanks for the extra testing @clalancette!

wjwwood commented 2 years ago

CI:

wjwwood commented 2 years ago

Docs update: https://github.com/ros2/ros2_documentation/pull/3193

wjwwood commented 2 years ago

I made a release for this: https://github.com/ros/rosdistro/pull/35370