spencer-project / spencer_people_tracking

Multi-modal ROS-based people detection and tracking framework for mobile robots developed within the context of the EU FP7 project SPENCER.
http://www.spencer.eu/

srl_nearest_neighbor_tracker crashes #19

Closed. kotaweav closed this issue 7 years ago

kotaweav commented 7 years ago

When trying to fuse RGBD and LIDAR, srl_nearest_neighbor_tracker crashes:

nnt_node: /usr/include/boost/smart_ptr/shared_ptr.hpp:648: typename boost::detail::sp_member_access<T>::type boost::shared_ptr<T>::operator->() const [with T = const spencer_tracking_msgs::DetectedPersons_<std::allocator<void> >; typename boost::detail::sp_member_access<T>::type = const spencer_tracking_msgs::DetectedPersons_<std::allocator<void> >*]: Assertion `px != 0' failed.
[spencer/perception_internal/people_tracking/srl_nearest_neighbor_tracker-55] process has died [pid 3075, exit code -6, cmd /home/kota/spencer_kinetic_ws/devel/lib/srl_nearest_neighbor_tracker/nnt_node __name:=srl_nearest_neighbor_tracker __log:=/home/kota/.ros/log/b2c33872-034d-11e7-bd4e-64006a58bea8/spencer-perception_internal-people_tracking-srl_nearest_neighbor_tracker-55.log].
log file: /home/kota/.ros/log/b2c33872-034d-11e7-bd4e-64006a58bea8/spencer-perception_internal-people_tracking-srl_nearest_neighbor_tracker-55*.log
[spencer/perception_internal/people_detection/rgbd_front_top/upper_body_detector-31] process has died [pid 2727, exit code -11, cmd /home/kota/spencer_kinetic_ws/devel/lib/rwth_upper_body_detector/upper_body_detector __name:=upper_body_detector __log:=/home/kota/.ros/log/b2c33872-034d-11e7-bd4e-64006a58bea8/spencer-perception_internal-people_detection-rgbd_front_top-upper_body_detector-31.log].
log file: /home/kota/.ros/log/b2c33872-034d-11e7-bd4e-64006a58bea8/spencer-perception_internal-people_detection-rgbd_front_top-upper_body_detector-31*.log

The log file is empty. The current workaround I have been using is to start up srl_nearest_neighbor_tracker manually:

rosrun srl_nearest_neighbor_tracker nnt_node __name:=srl_nearest_neighbor_tracker

I'm on Ubuntu 16.04 x86_64 with ROS Kinetic. Thanks

tlind commented 7 years ago

Could you provide a GDB stack trace, e.g. by adding launch-prefix="xterm -e gdb -ex run --args" to the node in srl_nearest_neighbor_tracker/launch/nnt.launch? It's hard to pinpoint the problem otherwise.

kotaweav commented 7 years ago

Oh man, sorry about that, I totally forgot to post the backtrace, didn't I?

Here is the output of the program up to the crash:

[ INFO] [1489078593.943810111]: Deleting 0 duplicated tracks.
[ INFO] [1489078593.943936021]: Publishing 2 tracked persons!
[ INFO] [1489078593.974179852]: Received 35 observations.
[ INFO] [1489078593.975275551]: Occlusion manager returned 0 tracks
[ INFO] [1489078593.981312140]: Number of accepted initation candidates 75

[ INFO] [1489078593.981382595]: Deleting 0 duplicated tracks.
[ INFO] [1489078593.981484812]: Publishing 2 tracked persons!
[ INFO] [1489078593.984951106]: Received 36 observations.
[ INFO] [1489078593.985936310]: Occlusion manager returned 0 tracks
[ INFO] [1489078593.993183581]: Number of accepted initation candidates 67

[ INFO] [1489078593.993285936]: Deleting 0 duplicated tracks.
[ INFO] [1489078593.993390776]: Publishing 2 tracked persons!
[ INFO] [1489078594.018606311]: Received 38 observations.
[ INFO] [1489078594.019371343]: Occlusion manager returned 0 tracks
[ INFO] [1489078594.026633078]: Number of accepted initation candidates 71

[ INFO] [1489078594.026734576]: Deleting 0 duplicated tracks.
[ INFO] [1489078594.026824553]: Publishing 2 tracked persons!
[ INFO] [1489078594.062235149]: Received 34 observations.
[ INFO] [1489078594.063062375]: Occlusion manager returned 0 tracks
[ INFO] [1489078594.069363759]: Number of accepted initation candidates 69

[ INFO] [1489078594.069461929]: Deleting 0 duplicated tracks.
[ INFO] [1489078594.069567768]: Publishing 2 tracked persons!
[ INFO] [1489078594.073102667]: Received 36 observations.
[ INFO] [1489078594.073883891]: Occlusion manager returned 0 tracks
[ INFO] [1489078594.080420101]: Number of accepted initation candidates 70

[ INFO] [1489078594.080515548]: Deleting 0 duplicated tracks.
[ INFO] [1489078594.080630265]: Publishing 2 tracked persons!
[ INFO] [1489078594.096228224]: Received 34 observations.
[ INFO] [1489078594.097061389]: Occlusion manager returned 0 tracks
[ INFO] [1489078594.102842329]: Number of accepted initation candidates 69

[ INFO] [1489078594.102918925]: Deleting 0 duplicated tracks.
[ INFO] [1489078594.103053065]: Publishing 2 tracked persons!
[ INFO] [1489078594.130234309]: Received 33 observations.
[ INFO] [1489078594.131170087]: Occlusion manager returned 0 tracks
[ INFO] [1489078594.137249535]: Number of accepted initation candidates 65

[ INFO] [1489078594.137321892]: Deleting 0 duplicated tracks.
[ INFO] [1489078594.137419605]: Publishing 2 tracked persons!
[ INFO] [1489078594.190050428]: Received 39 observations.
[ INFO] [1489078594.191098808]: Occlusion manager returned 0 tracks
[ INFO] [1489078594.198209911]: Number of accepted initation candidates 68

[ INFO] [1489078594.198300735]: Deleting 0 duplicated tracks.
[ INFO] [1489078594.198399955]: Publishing 2 tracked persons!
[ INFO] [1489078594.201453986]: Received 36 observations.
[ INFO] [1489078594.202242961]: Occlusion manager returned 0 tracks
nnt_node: /usr/include/boost/smart_ptr/shared_ptr.hpp:648: typename boost::detail::sp_member_access<T>::type boost::shared_ptr<T>::operator->() const [with T = const spencer_tracking_msgs::DetectedPersons_<std::allocator<void> >; typename boost::detail::sp_member_access<T>::type = const spencer_tracking_msgs::DetectedPersons_<std::allocator<void> >*]: Assertion `px != 0' failed.

Thread 1 "nnt_node" received signal SIGABRT, Aborted.
0x00007ffff5b4d428 in __GI_raise (sig=sig@entry=6)
    at ../sysdeps/unix/sysv/linux/raise.c:54
54      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.

and the backtrace:

Reading symbols from /home/kota/spencer_kinetic_ws/devel/lib/srl_nearest_neighbor_tracker/nnt_node...done.
Starting program: /home/kota/spencer_kinetic_ws/devel/lib/srl_nearest_neighbor_tracker/nnt_node __name:=srl_nearest_neighbor_tracker __log:=/home/kota/.ros/log/f0bc9330-04e7-11e7-bd4e-64006a58bea8/spencer-perception_internal-people_tracking-srl_nearest_neighbor_tracker-55.log
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff06a6700 (LWP 16387)]
[New Thread 0x7fffefea5700 (LWP 16388)]
[New Thread 0x7fffef6a4700 (LWP 16389)]
[New Thread 0x7fffeeea3700 (LWP 16394)]
[New Thread 0x7fffee6a2700 (LWP 16407)]

Thread 1 "nnt_node" received signal SIGABRT, Aborted.
0x00007ffff5b4d428 in __GI_raise (sig=sig@entry=6)
    at ../sysdeps/unix/sysv/linux/raise.c:54
54  ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
#0  0x00007ffff5b4d428 in __GI_raise (sig=sig@entry=6)
    at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007ffff5b4f02a in __GI_abort () at abort.c:89
#2  0x00007ffff5b45bd7 in __assert_fail_base (fmt=<optimized out>, 
    assertion=assertion@entry=0xcf3cac "px != 0", 
    file=file@entry=0xcf3c80 "/usr/include/boost/smart_ptr/shared_ptr.hpp", 
    line=line@entry=648, 
    function=function@entry=0xcf6360 <boost::shared_ptr<spencer_tracking_msgs::DetectedPersons_<std::allocator<void> > const>::operator->() const::__PRETTY_FUNCTION__> "typename boost::detail::sp_member_access<T>::type boost::shared_ptr<T>::operator->() const [with T = const spencer_tracking_msgs::DetectedPersons_<std::allocator<void> >; typename boost::detail::sp_me"...) at assert.c:92
#3  0x00007ffff5b45c82 in __GI___assert_fail (assertion=0xcf3cac "px != 0", 
    file=0xcf3c80 "/usr/include/boost/smart_ptr/shared_ptr.hpp", line=648, 
    function=0xcf6360 <boost::shared_ptr<spencer_tracking_msgs::DetectedPersons_<std::allocator<void> > const>::operator->() const::__PRETTY_FUNCTION__> "typename boost::detail::sp_member_access<T>::type boost::shared_ptr<T>::operator->() const [with T = const spencer_tracking_msgs::DetectedPersons_<std::allocator<void> >; typename boost::detail::sp_me"...) at assert.c:101
#4  0x0000000000a9e9f3 in boost::shared_ptr<spencer_tracking_msgs::DetectedPersons_<std::allocator<void> > const>::operator-> (this=0x11865e8)
    at /usr/include/boost/smart_ptr/shared_ptr.hpp:648
#5  0x0000000000cf0b97 in srl_nnt::LowConfidenceObservationsRecovery::recoverObservations (this=0x1186440, currentTime=1489077995.8630443, 
    unmatchedTracks=std::vector of length 1, capacity 1 = {...}, 
    matchedTracks=std::vector of length 2, capacity 2 = {...}, 
    recoveredObservations=std::vector of length 0, capacity 0)
    at /home/kota/spencer_kinetic_ws/src/spencer_people_tracking/tracking/people/srl_nearest_neighbor_tracker/src/srl_nearest_neighbor_tracker/missed_observation_recovery/low_confidence_observations_recovery.cpp:81
#6  0x0000000000b7dc6f in srl_nnt::NearestNeighborTracker::processCycle (
    this=0x7fffffffd030, currentTime=1489077995.8630443, 
    newObservations=std::vector of length 36, capacity 64 = {...})
    at /home/kota/spencer_kinetic_ws/src/spencer_people_tracking/tracking/people/srl_nearest_neighbor_tracker/src/srl_nearest_neighbor_tracker/nearest_neighbor_tracker.cpp:169
#7  0x0000000000b12c60 in srl_nnt::ROSInterface::incomingObservations (
    this=0x7fffffffd7b0, detectedPersons=...)
    at /home/kota/spencer_kinetic_ws/src/spencer_people_tracking/tracking/people/srl_nearest_neighbor_tracker/src/srl_nearest_neighbor_tracker/ros/ros_interface.cpp:110
#8  0x0000000000b38a4d in boost::_mfi::mf1<void, srl_nnt::ROSInterface, boost::shared_ptr<spencer_tracking_msgs::DetectedPersons_<std::allocator<void> > const> >::operator() (this=0x117c3d8, p=0x7fffffffd7b0, a1=...)
    at /usr/include/boost/bind/mem_fn_template.hpp:165
#9  0x0000000000b36b5a in boost::_bi::list2<boost::_bi::value<srl_nnt::ROSInterface*>, boost::arg<1> >::operator()<boost::_mfi::mf1<void, srl_nnt::ROSInterface, boost::shared_ptr<spencer_tracking_msgs::DetectedPersons_<std::allocator<void> > const> >, boost::_bi::list1<boost::shared_ptr<spencer_tracking_msgs::DetectedPersons_<std::allocator<void> > const> const&> > (this=0x117c3e8, f=..., a=...)
    at /usr/include/boost/bind/bind.hpp:313
#10 0x0000000000b34697 in boost::_bi::bind_t<void, boost::_mfi::mf1<void, srl_nnt::ROSInterface, boost::shared_ptr<spencer_tracking_msgs::DetectedPersons_<std::allocator<void> > const> >, boost::_bi::list2<boost::_bi::value<srl_nnt::ROSInterface*>, boost::arg<1> > >::operator()<boost::shared_ptr<spencer_tracking_msgs::DetectedPersons_<std::allocator<void> > const> > (this=0x117c3d8, a1=...)
    at /usr/include/boost/bind/bind_template.hpp:47
#11 0x0000000000b31a11 in boost::detail::function::void_function_obj_invoker1<boost::_bi::bind_t<void, boost::_mfi::mf1<void, srl_nnt::ROSInterface, boost::shared_ptr<spencer_tracking_msgs::DetectedPersons_<std::allocator<void> > const> >, boost::_bi::list2<boost::_bi::value<srl_nnt::ROSInterface*>, boost::arg<1> > >, void, boost::shared_ptr<spencer_tracking_msgs::DetectedPersons_<std::allocator<void> > const> const&>::invoke (function_obj_ptr=..., a0=...)
    at /usr/include/boost/function/function_template.hpp:159
#12 0x0000000000b3a106 in boost::function1<void, boost::shared_ptr<spencer_tracking_msgs::DetectedPersons_<std::allocator<void> > const> const&>::operator() (
    this=0x117c3d0, a0=...)
    at /usr/include/boost/function/function_template.hpp:773
#13 0x0000000000b38ce1 in boost::detail::function::void_function_obj_invoker1<boost::function<void (boost::shared_ptr<spencer_tracking_msgs::DetectedPersons_<std::allocator<void> > const> const&)>, void, boost::shared_ptr<spencer_tracking_msgs::DetectedPersons_<std::allocator<void> > const> >::invoke(boost::detail::function::function_buffer&, boost::shared_ptr<spencer_tracking_msgs::DetectedPersons_<std::allocator<void> > const>) (function_obj_ptr=..., a0=...)
    at /usr/include/boost/function/function_template.hpp:159
#14 0x0000000000b3eecd in boost::function1<void, boost::shared_ptr<spencer_tracking_msgs::DetectedPersons_<std::allocator<void> > const> >::operator() (
    this=0x117c358, a0=...)
    at /usr/include/boost/function/function_template.hpp:773
#15 0x0000000000b3e419 in ros::SubscriptionCallbackHelperT<boost::shared_ptr<spencer_tracking_msgs::DetectedPersons_<std::allocator<void> > const> const&, void>::call (this=0x117c350, params=...)
    at /opt/ros/kinetic/include/ros/subscription_callback_helper.h:144
#16 0x00007ffff733d5cd in ros::SubscriptionQueue::call() ()
   from /opt/ros/kinetic/lib/libroscpp.so
#17 0x00007ffff72e7cf0 in ros::CallbackQueue::callOneCB(ros::CallbackQueue::TLS*) () from /opt/ros/kinetic/lib/libroscpp.so
#18 0x00007ffff72e90f3 in ros::CallbackQueue::callAvailable(ros::WallDuration)
    () from /opt/ros/kinetic/lib/libroscpp.so
#19 0x00007ffff7341691 in ros::SingleThreadedSpinner::spin(ros::CallbackQueue*)
    () from /opt/ros/kinetic/lib/libroscpp.so
#20 0x00007ffff732672b in ros::spin() () from /opt/ros/kinetic/lib/libroscpp.so
#21 0x0000000000b12a25 in srl_nnt::ROSInterface::spin (this=0x7fffffffd7b0)
    at /home/kota/spencer_kinetic_ws/src/spencer_people_tracking/tracking/people/srl_nearest_neighbor_tracker/src/srl_nearest_neighbor_tracker/ros/ros_interface.cpp:85
#22 0x0000000000a93d3c in main (argc=1, argv=0x7fffffffe0e8)
    at /home/kota/spencer_kinetic_ws/src/spencer_people_tracking/tracking/people/srl_nearest_neighbor_tracker/src/srl_nearest_neighbor_tracker/ros/nodes/nnt_node.cpp:21

Actually, it looks like m_currentLowConfidenceDetections was being dereferenced while it was still null. I don't know how this impacts the rest of the system, but if I check for null and return early, it no longer crashes, and based on my 20 seconds of testing it also seems to work:

diff --git a/tracking/people/srl_nearest_neighbor_tracker/src/srl_nearest_neighbor_tracker/missed_observation_recovery/low_confidence_observations_recovery.cpp b/tracking/people/srl_nearest_neighbor_tracker/src/srl_nearest_neighbor_tracker/missed_observation_recovery/low_confidence_observations_recovery.cpp
index af3adbe..300619f 100644
--- a/tracking/people/srl_nearest_neighbor_tracker/src/srl_nearest_neighbor_tracker/missed_observation_recovery/low_confidence_observations_recovery.cpp
+++ b/tracking/people/srl_nearest_neighbor_tracker/src/srl_nearest_neighbor_tracker/missed_observation_recovery/low_confidence_observations_recovery.cpp
@@ -78,6 +78,8 @@ void LowConfidenceObservationsRecovery::recoverObservations(const double current
 {
     // Check if normal and low-confidence detections are sufficiently synchronized
     m_numCyclesTotal++;
+
+    if (!m_currentLowConfidenceDetections) return;
     double timestampDelta = currentTime - m_currentLowConfidenceDetections->header.stamp.toSec();
     if(std::abs(timestampDelta) > m_maxTimestampDifference)
     {

tlind commented 7 years ago

Yes, makes sense. What you see is a race condition when the regular detections arrive before the low-confidence detections, or if there are no low-confidence detections at all -- I guess we never tested this after refactoring the functionality into a separate class. Your fix is fine, can you open a PR for it?
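
The guard discussed above can be illustrated in isolation. The following is only a minimal sketch, not the tracker's actual code: `Detections` and `RecoverySketch` are hypothetical stand-ins for the message type and the recovery class, and `std::shared_ptr` is swapped in for `boost::shared_ptr` so the example compiles without ROS or Boost. It shows that a pointer which is only set by a second subscriber callback must be checked before its timestamp is read.

```cpp
#include <cassert>
#include <cmath>
#include <memory>
#include <utility>

// Hypothetical stand-in for spencer_tracking_msgs::DetectedPersons;
// only the header timestamp (in seconds) matters for this sketch.
struct Detections {
    double stampSec;
};

// Hypothetical, heavily simplified stand-in for the recovery class.
class RecoverySketch {
public:
    explicit RecoverySketch(double maxTimestampDifference)
        : m_maxTimestampDifference(maxTimestampDifference)
    {
        assert(maxTimestampDifference >= 0.0);
    }

    // Plays the role of the low-confidence detections callback.
    void setLowConfidenceDetections(std::shared_ptr<const Detections> detections)
    {
        m_currentLowConfidenceDetections = std::move(detections);
    }

    // Returns true only if low-confidence detections exist and are
    // sufficiently synchronized with the current cycle.
    bool canRecover(double currentTime) const
    {
        // Guard: regular detections may arrive before any low-confidence
        // message, in which case the pointer is still null.
        if (!m_currentLowConfidenceDetections) return false;

        double timestampDelta = currentTime - m_currentLowConfidenceDetections->stampSec;
        return std::abs(timestampDelta) <= m_maxTimestampDifference;
    }

private:
    double m_maxTimestampDifference;
    std::shared_ptr<const Detections> m_currentLowConfidenceDetections;
};
```

Without the null check, the first call to `canRecover()` before any low-confidence message has arrived would dereference an empty pointer, which is the situation the `px != 0` assertion reports.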

kotaweav commented 7 years ago

Absolutely. Opened a PR.

By the way, this is not really the place to ask, but I'm not sure where else would be better. I made some changes locally to rwth_upper_body_detector that use C++11 smart pointers (there was a segfault there as well, and the general consensus seems to be to avoid raw pointers). I don't think I saw any C++11/14 anywhere in spencer, so I figure I shouldn't open a pull request? I can switch it to boost smart pointers instead. To make it worse, these changes are on top of https://github.com/spencer-project/spencer_people_tracking/pull/14, which was closed because its changes were not isolated, so I suppose I would need to make PRs to fix some other things that rentt addresses as well.

tlind commented 7 years ago

@femtogram I just read your comment. For me, C++11 would be fine as long as the package still compiles under Ubuntu 14.04 / ROS Indigo.

However, I have a hard time following all the changes rentt made in #14 by looking at the commit history, especially since some commit messages appear twice (not sure if he was undoing and redoing some changes, or what happened). Looking at the changed files, however, it seems that not too many things have been changed. Maybe we can redo these changes in a somewhat more structured manner, without all the whitespace reformatting.

kotaweav commented 7 years ago

@tlind sounds great. I don't have access to the full sensor set right now (or any bag files with all the data), so I'm not in a great place to work on this today, but I'll start going through and make the appropriate changes over the next few days.

kotaweav commented 7 years ago

Actually, @tlind, my changes replace most of those made by rentt. I'll look through to see if there are any relevant pieces that my patch doesn't cover. I just downloaded g++-4.8 (which I believe is the version in Ubuntu 14.04) and made sure that everything I use is supported.
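
As a rough illustration of the kind of change described here (not the actual patch), the sketch below replaces an owning raw pointer with `std::unique_ptr` using only C++11 features. `DepthBuffer` and `createDepthBuffer` are hypothetical names invented for this example. Since `std::make_unique` was only standardized in C++14, the pointer is constructed directly from `new`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical buffer type, standing in for any object the detector
// previously managed through an owning raw pointer.
class DepthBuffer {
public:
    explicit DepthBuffer(std::size_t size) : m_data(size, 0.0f) {}
    std::size_t size() const { return m_data.size(); }

private:
    std::vector<float> m_data;
};

// C++11-only factory: builds the unique_ptr directly instead of using
// std::make_unique, which is a C++14 addition.
std::unique_ptr<DepthBuffer> createDepthBuffer(std::size_t size)
{
    assert(size > 0);
    return std::unique_ptr<DepthBuffer>(new DepthBuffer(size));
}
```

Ownership is then transferred explicitly with `std::move`, and the buffer is released automatically when its last owner goes out of scope, so no manual `delete` is needed.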

kotaweav commented 7 years ago

@tlind With the latest build, I'm having trouble reproducing the pointer-related segfaults I saw earlier. I've submitted a small patch fixing a matrix indexing issue, but I haven't pushed the other changes since I can't test them.