gazebosim / gz-sim

Open source robotics simulator. The latest version of Gazebo.
https://gazebosim.org
Apache License 2.0
621 stars 251 forks source link

Current performance of ignition robotic libraries #20

Open osrf-migration opened 5 years ago

osrf-migration commented 5 years ago

Original report (archived issue) by Diego Ferigo (Bitbucket: dgferigo).

The original report had attachments: cachegrind.png, callgrind_2.png, callgrind_5.png


Summary

The overhead introduced by ignition gazebo seems significant. The time spent to compute physics is a small fraction of the total time.

Description

First off, thanks for all the work you're doing! Despite their early development status, ignition libraries are amazing.

I am currently developing an application based on ignition robotics. Starting by saying that it is still at a very early stage of development, I tried to evaluate its performance in order to catch initial bottlenecks. The test I did is a simple cart pole simulation with ignition configured in such a way that it tries to go as fast as it can. Currently, there's only one thread, and I can see that the CPU goes to 100% when the application is executing.

Referring to the attached cachegrind image, I am surprised to see that only the ~11% of the time passed inside the ignition libraries (the entry point is Server::Run) is actually spent in the physics engine (currently, dart). This means that the remaining 89% is computation overhead.

Is this behavior normal to your eyes? I saw that you performed in the past some profiling, maybe you have some insights on it.

Note: this is a headless simulation, I do not use the ign-gazebo-gui process

osrf-migration commented 5 years ago

Original comment by Diego Ferigo (Bitbucket: dgferigo).


osrf-migration commented 5 years ago

Original comment by Diego Ferigo (Bitbucket: dgferigo).


Updated the cachegrind file with a longer acquisition

osrf-migration commented 5 years ago

Original comment by Michael Carroll (Bitbucket: Michael Carroll, GitHub: mjcarroll).


Thanks for reporting. We're currently in the process of doing some more detailed analysis of performance and eliminating bottlenecks, so this is useful information.

Just a few questions:

One thing you can try since you are not using the GUI, you can disable the SceneBroadcaster system plugin, since that is only used for the GUI updates.

osrf-migration commented 5 years ago

Original comment by Diego Ferigo (Bitbucket: dgferigo).


Hi Michael, thanks for the prompt response! The project is open source, you can find it here: https://github.com/robotology/gym-ignition.

To confirm, you have disabled all sleep mechanisms in the Run loop so that it iterates as fast as possible?

I didn't add / remove any sleep wrt upstream. Did you mean anything in particular?

For my benefit can you post the world that you are using somewhere?

CartPole models, CartPole.world

Are you using any of the new level functionality currently?

I think I disabled it here

One thing you can try since you are not using the GUI, you can disable the SceneBroadcaster system plugin, since that is only used for the GUI updates.

Thank for the tip. I can actually enable / disable the GUI, and I thought that it only introduces overhead to report this issue. I can provide you another cachegrind analysis without it if needed.

osrf-migration commented 5 years ago

Original comment by Michael Carroll (Bitbucket: Michael Carroll, GitHub: mjcarroll).


Also, based on your longer log, it looks like you may be calling CreateComponent frequently in your plugin, causing the cached views to get invalidated (and potentially causing issues with physics). Can you post a snippet of your pre-update?

osrf-migration commented 5 years ago

Original comment by Diego Ferigo (Bitbucket: dgferigo).


You are right, I call it many times in here (function body IgnitionRobot::Impl::getOrCreateComponent), and a component is removed every step in PhysicsPrivate::UpdateSim. I saw the component cache in the codebase, and I wrongly thought that the component itself was somehow cached.

I'll try to optimize it e.g. by setting the force to 0 in the UpdateSim stage instead of removing its component. I'll report soon updates, thanks for your help so far.

osrf-migration commented 5 years ago

Original comment by Diego Ferigo (Bitbucket: dgferigo).


@mjcarroll As suggested I disabled all the GUI-related systems (broadcaster and user-commands) and commented out the GUI plugins (scene 3d, world control and world stats). The two cases represented by removing the component and zeroing its value did not differ significantly. The output of time is the following:

time column Removing the component Zeroing its value
real 0m13.390s 0m12.988s
user 0m6.054s 0m5.465s
sys 0m0.310s 0m0.300s

I did not do any statistical significant analysis, more or less every run has similar values.

However, this time the callgrind analysis is slightly better. I attached the 5% and 2% graphs. Now the CreateComponent call is not shown anymore and the cost of the physic engine became ~15.5% (data reads, cache misses, etc.). A little bit better but the overall overhead seems still significant.

osrf-migration commented 5 years ago

Original comment by Diego Ferigo (Bitbucket: dgferigo).


Added callgrind files after disabling the GUI plugins and preventing the component to be removed every step.

osrf-migration commented 5 years ago

Original comment by Diego Ferigo (Bitbucket: dgferigo).


Cropped the images

chapulina commented 3 years ago

I'd like to have a completion criterion for this issue. Performance is quite broad: in theory, things can always go faster, and Gazebo has lots of features that we could focus on. The conversation seems to be focused on physics, so I propose reducing the scope of this issue to answering:

Is the overhead added by Ignition on top of a physics engine unacceptable?

This question is hard enough to answer with minimal functionality, so I propose the following assumptions to get to test the core functionality, without optional features:

Here's what shapes.sdf looks like with only the physics system using DART:

image

WorldForwardStep is basically just calling the physics engine's step function, which takes 0.111ms / 0.147ms. The overhead is 24.5% (0.036ms).

Here's what it looks like with TPE, which is much faster:

image

WorldForwardStep is much faster and takes 0.011ms / 0.066ms. The overhead is 83.3% (0.055ms).

stefanbuettner commented 3 years ago

Hey there,

recently I came across this cppcon talk Performance Matters by Emery Berger who talks about performance evaluation and something he calls "causal profiling". Since the 2021 Q2 focus is on performance and this issue (at least from the title) is about performance of the robotic libraries, I wanted to ask what you think about profiling Gazebo with coz to identify improvements possibilities.

I also wanted to ask how you plan to evaluate the changes of this Q2 roadmap item and if it might be worth considering Stabilizer to do so.

Cheers, Stefan

chapulina commented 3 years ago

causal profiling

Thank you for the tips, @stefanbuettner. A causal profiler looks really interesting to help us identify where to best focus our efforts.

We've been using Ignition Common's profiler wrapper around Remotery, that's what I used on the comment above. Our profiler API is meant to be abstract enough so we can plug other profilers, but it looks to me that the nature of a causal profiler is different enough that we shouldn't just wrap COZ_PROGRESS macros with the IGN_PROFILE macros that we already have across the libraries. We'd probably need to inject new macros into different places.

how you plan to evaluate the changes

We've been using the Remotery profiler as well as real time factor to evaluate changes as we make them. Stabilizer seems to offer a more rigorous approach that could be interesting.

francocipollone commented 2 years ago

We've been working on setting up coz profiler within ignition's workspace in order to analyze which advantages using this profiler could give us.

Context

coz profiler is based on a different paradigm in comparison with regular profilers. This profiler instead of giving information about the time that several parts of a code take to be executed, it gives information about "which" code is needed to be speeded up in order to speed up the entire execution. In order to achieve this, because of the way that coz's api is used and how the profiler is set up, the current structure we have in ignition-common to allow multiple Profilers implementations isn't compatible with coz.

So, before embarking into an implementation that is able to live in tandem with the ign-common::Profiler machinery, the idea is to run an ignition simulation with coz and evaluate how beneficial could be this tool and also, to examine the code and understand where are the bottlenecks during a simulation.

Setting up coz

After struggling a bit with the installation of coz and correctly linking against the coz library from ignition libraries I see value on commenting here the steps we found that are more straight forward in order to get it working.

  1. If you are using a docker container, mind running the docker run command adding CAP_SYS_ADMIN privilege. To do so you just have to add --cap-add=CAP_SYS_ADMIN as an argument to the docker run command.

Note: In case it is helpful we tried this in a container using bionic.

  1. git clone coz into your colcon workspace where your ignition source files are.
    cd colcon_ws/src
    git clone git@github.com:plasma-umass/coz.git

Note: coz provides binaries that can be installed using apt install, however we abandoned that path mainly because:

  1. Install coz dependencies:

    sudo apt-get install build-essential cmake docutils-common libelfin-dev nodejs npm python3
  2. Compile coz package

    colcon build --packages-select coz
  3. Use find_package(or the custom macro from ign-cmake) to find coz from another package, for example, from ign-gazebo package.

    find_package(coz-profiler REQUIRED HINTS "${CMAKE_INSTALL_PREFIX}/lib/cmake")

    Note1: even though coz's cmake project is called coz the generated .cmake files calls it coz-profiler, that's why the package to find is coz-profiler and not coz. This also "confuses" colcon when checking the dependencies between the packages in the src folder.

Note2: A hint must be passed to find the coz-profiler-config.cmake files given that the CMakeFiles from coz doesn't install the .cmake files under install/lib/cmake/coz as expected.

  1. A flag needs to be added to the ignition compiler in order to avoid having a DWARF error when running a simulation with the coz profiler.

So tipically:

set(CMAKE_CXX_FLAGS  "${CMAKE_CXX_FLAGS} -gdwarf-3")

Note: There isn't clear information about this issue in the coz repository, but comments of people through issues where different things are suggested to fix that. It seems that it is a bug that comes from one of the coz dependencies: libelfin

  1. When using coz within a compilation unit the header file should be included to the target. The coz.h header file is installed at "install/include" folder and not at "install/include/coz" so the directory that needs to be included is ${CMAKE_INSTALL_PREFIX}/include> example:
    
    target_include_directories(${PROJECT_LIBRARY_TARGET_NAME}
    PUBLIC
    ...
    $<BUILD_INTERFACE:${CMAKE_INSTALL_PREFIX}/include>
    )

8. Now `#include <coz.h>` can be added to your code, program, etc, as well as the `coz` macros like `COZ_PROGRESS`

9. After compiling your package under analysis(like ign-gazebo). You can run coz doing:
```bash
#remember to source install to also get the coz command
source install/setup.bash
coz run --- ./<path_to_your_executable/program/simulation>

#use sudo and the entire path to coz if needed.eg.:
sudo ./install/bin/coz run --- ./<path_to_your_executable/program/simulation>

a profiler.coz file is generated that can be loaded from https://plasma-umass.org/coz/ in order to visualize the output of the profiler.

Note: As you may noticed it, we haven't linked against coz because when doing that the executable is automatically run using coz and then you miss the chance to run your executable without coz.


Current/next steps:

chapulina commented 2 years ago

Great work, @francocipollone ! It's unfortunate that you had to jump through so many hoops to get started using coz. Looking forward to seeing what the results can tell us about where to focus in the future!

francocipollone commented 2 years ago

Some updates on this front

Setting Progress points

coz profiler lets you select one or several "progress points", these are points in the code that you would like to improve and when you run the profiler, coz inspects the code and provides information about which line or entity you have to improve in order to get better performance at the "progress points" before selected.

To start, we set a COZ_PROGRESS macro (progress point)(defined at <coz.h> header file is mandatory) at the end of the "SimulationRunner::Step" method, in order to get information about how it performs and what should improve to smooth its execution (including <coz.h> header file). The idea was to tune or adjust where to set progress points as we got more used to using the tool.

Click to see ign-gazebo/src/CMakeLists.txt ```cmake target_link_libraries(${PROJECT_LIBRARY_TARGET_NAME} PUBLIC ignition-math${IGN_MATH_VER} ignition-plugin${IGN_PLUGIN_VER}::core ignition-common${IGN_COMMON_VER}::ignition-common${IGN_COMMON_VER} ignition-common${IGN_COMMON_VER}::profiler ignition-fuel_tools${IGN_FUEL_TOOLS_VER}::ignition-fuel_tools${IGN_FUEL_TOOLS_VER} ignition-gui${IGN_GUI_VER}::ignition-gui${IGN_GUI_VER} ignition-transport${IGN_TRANSPORT_VER}::ignition-transport${IGN_TRANSPORT_VER} sdformat${SDF_VER}::sdformat${SDF_VER} protobuf::libprotobuf coz::coz dl PRIVATE ignition-plugin${IGN_PLUGIN_VER}::loader ) ```

Building

As coz documentation suggests, sources must be compiled as Debug, so we followed that recommendation when compiling ignition gazebo.(And its dependencies if needed)

Running coz

We created an executable that in essence set up a gazebo::Server, loads a sdf and run the simulation a finite number of iterations and then the program ends. Quite simple.

Then we ran coz doing:

sudo ./install/bin/coz run --- ./install/bin/simulation

While running, information is being dumped to a profiler.coz file, that later on, you can load as explained in item 9. in the previous post.

When we plot the dumped information no interesting results are given because no information about ignition files are provided. It is like if coz isn't capable of traverse ignition files.

image Only some references related to the cpp stl

During this run, coz logs information. Here we can see that coz/libcoz/inspect.cpp:509 is inspecting the dependencies files that will be used to perform the analysis,

Click to see coz logging ``` [/ws/src/coz/libcoz/libcoz.cpp:100] bootstrapping coz [/ws/src/coz/libcoz/libcoz.cpp:128] Including MAIN, which is /ws/install/bin/sensors [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/alloc_traits.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/allocated_ptr.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/allocator.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/basic_string.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/basic_string.tcc [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/char_traits.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/cpp_type_traits.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/deque.tcc [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/ios_base.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/locale_classes.tcc [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/locale_facets.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/move.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/predefined_ops.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/ptr_traits.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/regex.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/regex.tcc [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/regex_automaton.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/regex_automaton.tcc [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/regex_compiler.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/regex_compiler.tcc [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/regex_constants.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/regex_error.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/regex_scanner.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/regex_scanner.tcc [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/shared_ptr.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/shared_ptr_base.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/std_abs.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/std_function.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/stl_algo.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/stl_algobase.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/stl_construct.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/stl_deque.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/stl_function.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/stl_heap.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/stl_iterator.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/stl_iterator_base_funcs.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/stl_iterator_base_types.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/stl_map.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/stl_pair.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/stl_stack.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/stl_tree.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/stl_uninitialized.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/stl_vector.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bits/vector.tcc [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/bitset [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/ext/aligned_buffer.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/ext/alloc_traits.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/ext/atomicity.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/ext/new_allocator.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/ext/type_traits.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/iostream [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/new [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/tuple [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/c++/8/typeinfo [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/ignition/math6/ignition/math/Helpers.hh [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/ignition/math6/ignition/math/Pose3.hh [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/ignition/math6/ignition/math/Quaternion.hh [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/ignition/math6/ignition/math/Vector3.hh [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/x86_64-linux-gnu/c++/8/bits/ctype_inline.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /usr/include/x86_64-linux-gnu/c++/8/bits/gthr-default.h [/ws/src/coz/libcoz/inspect.cpp:509] Included source file /ws/src/ign-gazebo/src/simulations/sensors.cc [/ws/src/coz/libcoz/inspect.cpp:316] Including lines from executable /ws/install/bin/sensors [/ws/src/coz/libcoz/profiler.cpp:75] Starting profiler thread [Msg] Loading SDF world file[/usr/share/ignition/ignition-gazebo3/worlds/sensors.sdf]. ```

Effectively, there are no ignition-related files being inspected (but some ignition math files )

Trying an alternative.

I've checked the coz/libcoz/inspect.cc code in charge of including source file to the analysis and it seems that it is including files referred in the compilation unit.

So tried another thing. Instead of creating an executable that links to ignition-gazebo core library I added all the sources of ignition-gazebo to the executable and created the executable again. I know that this doesn't scale and it isn't straightforward to do, but it was worth trying in order to keep digging into the tool. And this time files were inspected: (I show just some lines to avoid too much noise)

[/ws/src/coz/libcoz/inspect.cpp:509] Included source file /ws/src/ign-gazebo/include/ignition/gazebo/EventManager.hh
[/ws/src/coz/libcoz/inspect.cpp:509] Included source file /ws/src/ign-gazebo/include/ignition/gazebo/components/Component.hh
[/ws/src/coz/libcoz/inspect.cpp:509] Included source file /ws/src/ign-gazebo/include/ignition/gazebo/components/Factory.hh
[/ws/src/coz/libcoz/inspect.cpp:509] Included source file /ws/src/ign-gazebo/include/ignition/gazebo/components/ParentEntity.hh
[/ws/src/coz/libcoz/inspect.cpp:509] Included source file /ws/src/ign-gazebo/include/ignition/gazebo/detail/ComponentStorageBase.hh
[/ws/src/coz/libcoz/inspect.cpp:509] Included source file /ws/src/ign-gazebo/include/ignition/gazebo/detail/EntityComponentManager.hh
[/ws/src/coz/libcoz/inspect.cpp:316] Including lines from executable /ws/install/bin/simulation

However, this run ended up aborted so the profiler.coz file wasn't created.

[libprotobuf ERROR google/protobuf/descriptor_database.cc:57] File already exists in database: peer_info.proto
[libprotobuf FATAL google/protobuf/descriptor.cc:1164] CHECK failed: generated_database_->Add(encoded_file_descriptor, size): 
terminate called after throwing an instance of 'google::protobuf::FatalException'
  what():  CHECK failed: generated_database_->Add(encoded_file_descriptor, size): 
Aborted!

Some comments

  1. Inspection of files. Including all the sources when creating the executable doesn't scale. It was an alternative just to try ignition-gazebo but of course, we would like to have, e.g., rendering and physics being part of the analysis. So at this point, I wonder if there is something I am missing when building or related to coz in order to have more symbols exported that can be analyzed by coz.

  2. The run ended up with a failure at libprotobuf, I think this is caused by coz given that coz performs the analysis by running it several times and creating deadtimes in between at different parts of the code.


I will try to dig a bit deeper into this to see if I can get a real profile.