ethz-asl / wavemap

Fast, efficient and accurate multi-resolution, multi-sensor 3D occupancy mapping
https://ethz-asl.github.io/wavemap/
BSD 3-Clause "New" or "Revised" License

Optimize measurement integrator and ROS node performance #19

Closed victorreijgwart closed 1 year ago

victorreijgwart commented 1 year ago

Description

This PR makes wavemap 20% faster for LiDAR inputs and adds support for Tracy Profiler.

Type of change

Detailed summary

New features

Notes on Tracy

Being a hybrid profiler, Tracy combines the strengths of frame and sampling-based profiling. It also runs cross-platform, which makes it a good general starting point for performance optimization. Its results can be complemented by tools like Intel's VTune or Nvidia's Nsight when more detailed, platform-specific insights are desired.

To simplify the use of Tracy with ROS, we provide a wrapper that makes its library available as a catkin package. The application being profiled must link against this library. The package also includes scripts to easily build and run Tracy's GUI in Docker.

By default, Tracy is disabled and introduces zero overhead. To enable profiling with Tracy, add -DTRACY_ENABLE=ON to catkin's cmake-args and rebuild wavemap_all. To disable Tracy later, change the cmake argument to -DTRACY_ENABLE=OFF. It is also possible to remove the argument, but then you have to clean the catkin workspace (catkin clean -y) to make sure the argument doesn't stay in CMake's caches.
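To illustrate how a compile-time flag can make profiling code disappear entirely, here is a minimal, self-contained sketch. It does not use Tracy itself: the flag DEMO_PROFILING_ENABLE, the macro DEMO_ZONE, and the helpers ScopedZone, completedZones and sumUpTo are all hypothetical names chosen for this example. Tracy's own macros (such as ZoneScoped) are understood to follow a similar pattern, expanding to nothing when TRACY_ENABLE is not defined.

```cpp
#include <string>
#include <vector>

// Log of completed zones. It stays empty when profiling is compiled out.
inline std::vector<std::string>& completedZones() {
  static std::vector<std::string> zones;
  return zones;
}

#ifdef DEMO_PROFILING_ENABLE  // hypothetical flag, analogous to TRACY_ENABLE
// RAII guard that records the zone name when the enclosing scope ends.
class ScopedZone {
 public:
  explicit ScopedZone(const char* name) : name_(name) {}
  ~ScopedZone() { completedZones().emplace_back(name_); }

 private:
  const char* name_;
};
#define DEMO_ZONE(name) ScopedZone demo_zone_guard(name)
#else
// Profiling disabled: the macro expands to a no-op, so no guard object
// is constructed and no profiling code ends up in the binary.
#define DEMO_ZONE(name) ((void)0)
#endif

// Example function instrumented with a profiling zone.
inline int sumUpTo(int n) {
  DEMO_ZONE("sumUpTo");
  int sum = 0;
  for (int i = 0; i < n; ++i) sum += i;
  return sum;
}
```

Compiled without -DDEMO_PROFILING_ENABLE, calling sumUpTo leaves completedZones() empty. Compiled with the flag, each call appends one entry when the function returns.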

Necessary upgrade steps

For catkin users, the tracy_catkin package should be added to catkin_ws/src, either with

The new code can then be pulled and built as usual, with catkin build wavemap_all.

For Docker users, the only step is to pull the new image once the PR is merged and the CD pipeline publishes the new release. The image will already contain the tracy_catkin dependency.

Testing

Unit tests verify that the approximate atan2 deviates from std::atan2 by at most 0.0001 degrees in the worst case.
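A worst-case error check of this kind can be sketched as follows. This is an illustration only, not wavemap's implementation: the polynomial coefficients are a commonly cited odd-polynomial fit for atan on [0, 1], assumed here for demonstration, and the names approxAtanUnit, approxAtan2 and worstCaseErrorDegrees are hypothetical. The sketch sweeps angles around the unit circle and reports the largest deviation from std::atan2 in degrees.

```cpp
#include <algorithm>
#include <cmath>

constexpr double kPi = 3.14159265358979323846;

// Polynomial approximation of atan(x) for x in [0, 1].
// Assumption: coefficients are illustrative, not taken from wavemap.
inline double approxAtanUnit(double x) {
  const double x2 = x * x;
  return x * (0.99997726 +
              x2 * (-0.33262347 +
                    x2 * (0.19354346 +
                          x2 * (-0.11643287 +
                                x2 * (0.05265332 + x2 * (-0.01172120))))));
}

// Extends the [0, 1] approximation to all quadrants.
inline double approxAtan2(double y, double x) {
  const double ax = std::abs(x);
  const double ay = std::abs(y);
  const double hi = std::max(ax, ay);
  const double lo = std::min(ax, ay);
  if (hi == 0.0) return 0.0;                // origin: return 0 by convention
  double angle = approxAtanUnit(lo / hi);   // angle in [0, pi/4]
  if (ay > ax) angle = kPi / 2.0 - angle;   // reflect into [pi/4, pi/2]
  if (x < 0.0) angle = kPi - angle;         // left half-plane
  return (y < 0.0) ? -angle : angle;        // lower half-plane
}

// Largest absolute deviation from std::atan2, in degrees, over
// num_samples directions spread evenly around the unit circle.
inline double worstCaseErrorDegrees(int num_samples) {
  double worst = 0.0;
  for (int i = 0; i < num_samples; ++i) {
    const double theta = -kPi + 2.0 * kPi * (i + 0.5) / num_samples;
    const double y = std::sin(theta);
    const double x = std::cos(theta);
    worst = std::max(worst, std::abs(approxAtan2(y, x) - std::atan2(y, x)));
  }
  return worst * 180.0 / kPi;
}
```

A unit test would then compare the value returned by worstCaseErrorDegrees against the required tolerance. The bound that this particular polynomial can satisfy depends on its coefficients and on the floating-point precision used.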

We manually validated that this PR improves performance without degrading accuracy. The accuracy actually improves slightly. Since we already have unit tests that ensure that all integration methods and data structures yield equivalent results, we only manually evaluated the change for the hashed chunked wavelet octree data structure and integrator.

| Atan2       | ROC AUC | RAM usage | Wall time |
|-------------|---------|-----------|-----------|
| std::atan2  | 0.968   | 797.79    | 5.42      |
| approximate | 0.970   | 798.34    | 4.19      |

Checklist:

victorreijgwart commented 1 year ago

Looks good to me. One general question about Tracy: is it always active, or is it turned on/off at compile time?

Tracy has to be enabled with a compile time CMake flag and is off by default, in which case it fully disappears.

victorreijgwart commented 1 year ago

/prepare-release minor