gazebosim / gazebo-classic

Gazebo classic. For the latest version, see https://github.com/gazebosim/gz-sim
http://classic.gazebosim.org/

Flaky segfault in UNIT_DataLogger_TEST #1934

Open osrf-migration opened 8 years ago

osrf-migration commented 8 years ago

Original report (archived issue) by Steve Peters (Bitbucket: Steven Peters, GitHub: scpeters).


I've seen an identical backtrace in UNIT_DataLogger_TEST two nights in a row, though with slightly different console output (full details here). Both failures were on the default branch at commit cec393b3dbdbf38236cdc5969590cddc09282e65.

The test fails during the StressTest, while randomly toggling the record button (DataLogger_TEST.cc:188). The segfault comes from the LogWorker thread at World.cc:2585-2586, which is calling WorldState::operator-:

#4  0x00002afb2791f804 in qFatal(char const*, ...) () from /usr/lib/x86_64-linux-gnu/libQtCore.so.4
#5  <signal handler called>
#6  0x00002afb2a0cfb63 in std::_Rb_tree_increment(std::_Rb_tree_node_base const*) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00002afb27feff4c in operator++ (this=<synthetic pointer>) at /usr/include/c++/4.8/bits/stl_tree.h:270
#8  gazebo::physics::WorldState::operator- (this=0x2afb54bb6678, _state=...) at gazebo/physics/WorldState.cc:392
#9  0x00002afb27fde165 in gazebo::physics::World::LogWorker (this=0x2afb54710c40) at gazebo/physics/World.cc:2586

osrf-migration commented 8 years ago

Original comment by Steve Peters (Bitbucket: Steven Peters, GitHub: scpeters).


The following patch, which adds a 1 ms sleep inside four loops in WorldState.cc, makes this test fail every time:

diff -r 5e1b4aafa300d6d92a2b1bd2a71bf2b6273c566c gazebo/physics/WorldState.cc
--- a/gazebo/physics/WorldState.cc  Thu Aug 25 15:55:30 2016 -0700
+++ b/gazebo/physics/WorldState.cc  Wed Sep 07 18:07:57 2016 -0700
@@ -26,6 +26,7 @@

 #include "gazebo/common/Console.hh"
 #include "gazebo/common/Exception.hh"
+#include "gazebo/common/Time.hh"
 #include "gazebo/physics/World.hh"
 #include "gazebo/physics/Model.hh"
 #include "gazebo/physics/Light.hh"
@@ -353,6 +354,7 @@
   for (ModelState_M::const_iterator iter =
        _state.modelStates.begin(); iter != _state.modelStates.end(); ++iter)
   {
+    common::Time::MSleep(1);
     if (this->HasModelState(iter->second.GetName()))
     {
       ModelState state = this->GetModelState(iter->second.GetName()) -
@@ -372,6 +374,7 @@
   // Subtract the light states.
   for (const auto &light : _state.lightStates)
   {
+    common::Time::MSleep(1);
     if (this->HasLightState(light.second.GetName()))
     {
       LightState state = this->GetLightState(light.second.GetName()) -
@@ -392,6 +395,7 @@
   for (ModelState_M::const_iterator iter =
        this->modelStates.begin(); iter != this->modelStates.end(); ++iter)
   {
+    common::Time::MSleep(1);
     if (!_state.HasModelState(iter->second.GetName()) && this->world)
     {
       ModelPtr model = this->world->GetModel(iter->second.GetName());
@@ -403,6 +407,7 @@
   // Add in the new light states
   for (const auto &light : this->lightStates)
   {
+    common::Time::MSleep(1);
     if (!_state.HasLightState(light.second.GetName()) && this->world)
     {
       LightPtr lightPtr = this->world->Light(light.second.GetName());
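
The sleeps in this patch appear to work by widening the race window: each loop iteration now takes at least 1 ms, so another thread has much more time to change the shared state mid-iteration. Below is a minimal, self-contained sketch of that technique. It is not Gazebo code: `SharedLog`, `ClearLog`, `WalkLog`, and `RunOnce` are hypothetical names, and an atomic generation counter stands in for the state map so that the unlocked case stays well-defined instead of crashing.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the shared log state; not a Gazebo type.
struct SharedLog
{
  std::mutex mutex;
  // Bumped on every simulated "clear". Atomic, so unlocked access is
  // still well-defined in this sketch.
  std::atomic<int> generation{0};
};

// Writer side: simulates clearing the states, optionally under the lock.
inline void ClearLog(SharedLog &_log, bool _useLock)
{
  std::unique_lock<std::mutex> lock(_log.mutex, std::defer_lock);
  if (_useLock)
    lock.lock();
  ++_log.generation;
}

// Reader side: walks _steps simulated elements, sleeping 1 ms per step to
// widen the race window. Returns how many steps saw a clear that happened
// after the walk started.
inline int WalkLog(SharedLog &_log, int _steps, bool _useLock)
{
  assert(_steps >= 0);
  std::unique_lock<std::mutex> lock(_log.mutex, std::defer_lock);
  if (_useLock)
    lock.lock();

  const int startGeneration = _log.generation.load();
  int interference = 0;
  for (int i = 0; i < _steps; ++i)
  {
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
    if (_log.generation.load() != startGeneration)
      ++interference;
  }
  return interference;
}

// Runs one reader and one writer concurrently and reports the reader's
// interference count.
inline int RunOnce(bool _useLock)
{
  SharedLog log;
  int interference = 0;
  std::thread reader([&]() { interference = WalkLog(log, 20, _useLock); });
  std::thread writer([&]() {
    std::this_thread::sleep_for(std::chrono::milliseconds(5));
    ClearLog(log, _useLock);
  });
  reader.join();
  writer.join();
  return interference;
}
```

With `_useLock` set to true, `RunOnce` returns 0 because the writer cannot bump the counter while the reader holds the mutex. With it set to false, the reader will usually report a nonzero count, although that depends on scheduling.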

osrf-migration commented 8 years ago

Original comment by Steve Peters (Bitbucket: Steven Peters, GitHub: scpeters).


The following seems to help, but doesn't completely fix it:

diff -r 5e1b4aafa300d6d92a2b1bd2a71bf2b6273c566c gazebo/physics/World.cc
--- a/gazebo/physics/World.cc   Thu Aug 25 15:55:30 2016 -0700
+++ b/gazebo/physics/World.cc   Wed Sep 07 18:12:41 2016 -0700
@@ -2397,6 +2397,7 @@
     }

     // Clear everything.
+    boost::mutex::scoped_lock lockStates(this->dataPtr->logMutex);
     this->dataPtr->states[0].clear();
     this->dataPtr->states[1].clear();
     this->dataPtr->stateToggle = 0;
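
A lock around the clear only helps if the code that iterates the states takes the same mutex for as long as it reads them. This report does not show whether LogWorker does that, so the following is only a sketch of one possible arrangement, not the actual fix. All names (`LogBuffer`, `ClearStates`, `DiffStates`, `StateMap`) are hypothetical, the states are reduced to a map of doubles, and entries missing from the previous state are treated as zero for simplicity.

```cpp
#include <map>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical, simplified stand-in for a map of model states.
using StateMap = std::map<std::string, double>;

// Hypothetical buffer; `mutex` plays the role of logMutex in the patch.
struct LogBuffer
{
  std::mutex mutex;
  StateMap current;
  StateMap previous;
};

// Writer side: clear both maps while holding the mutex.
inline void ClearStates(LogBuffer &_buf)
{
  std::lock_guard<std::mutex> lock(_buf.mutex);
  _buf.current.clear();
  _buf.previous.clear();
}

// Reader side: copy both maps under the same mutex, then compute the
// difference on the private copies, so no iterator ever walks a map that
// another thread can modify.
inline StateMap DiffStates(LogBuffer &_buf)
{
  StateMap cur;
  StateMap prev;
  {
    std::lock_guard<std::mutex> lock(_buf.mutex);
    cur = _buf.current;
    prev = _buf.previous;
  }

  StateMap diff;
  for (const auto &entry : cur)
  {
    const auto it = prev.find(entry.first);
    // Simplification: a state absent from `prev` is subtracted as zero.
    const double before = (it != prev.end()) ? it->second : 0.0;
    diff[entry.first] = entry.second - before;
  }
  return diff;
}
```

Copying under the lock keeps the critical section short at the cost of an extra copy. Holding the lock for the whole subtraction would also work, but it would block the clearing thread for longer.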

osrf-migration commented 8 years ago

Original comment by Nate Koenig (Bitbucket: Nathan Koenig).