mdk97 / aitrack-linux

6DoF Head tracking software
MIT License

Segfaults when applying configuration #3

Closed mattcaron closed 3 years ago

mattcaron commented 3 years ago

Describe the bug: Clicking "Apply" on the config window causes a segfault.

To Reproduce

  1. Start AITrack
  2. Click "Configuration"
  3. Make no changes
  4. Click "Apply"
  5. Segfault

Expected behavior: I expected it to write a config file and close the window.

Environment (please complete the following information):

Additional context: I'm happy to help debug / hack / etc. on this. I just wanted to get a bug report in to see if you have some ideas as to where to start.

mdk97 commented 3 years ago

Can you run the program through valgrind and post the terminal logs here? The memory checks will probably slow the program down, but it does eventually open. Just run valgrind ./aitrack, go to the configuration window, and click Apply to force the segfault.

mattcaron commented 3 years ago

And valgrind doth spake, saying:

(matt@bluebox) ~/workspace/code/headtracking/aitrack-linux (merge-v0.6.5-alpha-ubuntu)$ LD_LIBRARY_PATH=./onnxruntime-linux-x64-1.4.0/lib valgrind ./aitrack 
==601061== Memcheck, a memory error detector
==601061== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==601061== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==601061== Command: ./aitrack
==601061== 
--601061-- WARNING: unhandled amd64-linux syscall: 315
--601061-- You may be able to write your own handler.
--601061-- Read the file README_MISSING_SYSCALL_OR_IOCTL.
--601061-- Nevertheless we consider this a bug.  Please report
--601061-- it at http://valgrind.org/support/bug_reports.html.
Found ID: 0
[ WARN:0] global ../modules/videoio/src/cap_v4l.cpp (887) open VIDEOIO(V4L2:/dev/video1): can't open camera by index
[ WARN:0] global ../modules/videoio/src/cap_gstreamer.cpp (1758) handleMessage OpenCV | GStreamer warning: Embedded video playback halted; module v4l2src0 reported: Device '/dev/video1' is not a capture device.
[ WARN:0] global ../modules/videoio/src/cap_gstreamer.cpp (888) open OpenCV | GStreamer warning: unable to start pipeline
[ WARN:0] global ../modules/videoio/src/cap_gstreamer.cpp (480) isPipelinePlaying OpenCV | GStreamer warning: GStreamer: pipeline have not been created
[ WARN:0] global ../modules/videoio/src/cap_v4l.cpp (887) open VIDEOIO(V4L2:/dev/video1): can't open camera by index
Not found device1
[ WARN:0] global ../modules/videoio/src/cap_v4l.cpp (887) open VIDEOIO(V4L2:/dev/video2): can't open camera by index
[ WARN:0] global ../modules/videoio/src/cap_gstreamer.cpp (1758) handleMessage OpenCV | GStreamer warning: Embedded video playback halted; module v4l2src1 reported: Cannot identify device '/dev/video2'.
[ WARN:0] global ../modules/videoio/src/cap_gstreamer.cpp (888) open OpenCV | GStreamer warning: unable to start pipeline
[ WARN:0] global ../modules/videoio/src/cap_gstreamer.cpp (480) isPipelinePlaying OpenCV | GStreamer warning: GStreamer: pipeline have not been created
[ WARN:0] global ../modules/videoio/src/cap_v4l.cpp (887) open VIDEOIO(V4L2:/dev/video2): can't open camera by index
Not found device2
[ WARN:0] global ../modules/videoio/src/cap_v4l.cpp (887) open VIDEOIO(V4L2:/dev/video3): can't open camera by index
[ WARN:0] global ../modules/videoio/src/cap_gstreamer.cpp (1758) handleMessage OpenCV | GStreamer warning: Embedded video playback halted; module v4l2src2 reported: Cannot identify device '/dev/video3'.
[ WARN:0] global ../modules/videoio/src/cap_gstreamer.cpp (888) open OpenCV | GStreamer warning: unable to start pipeline
[ WARN:0] global ../modules/videoio/src/cap_gstreamer.cpp (480) isPipelinePlaying OpenCV | GStreamer warning: GStreamer: pipeline have not been created
[ WARN:0] global ../modules/videoio/src/cap_v4l.cpp (887) open VIDEOIO(V4L2:/dev/video3): can't open camera by index
Not found device3
[ WARN:0] global ../modules/videoio/src/cap_v4l.cpp (887) open VIDEOIO(V4L2:/dev/video4): can't open camera by index
[ WARN:0] global ../modules/videoio/src/cap_gstreamer.cpp (1758) handleMessage OpenCV | GStreamer warning: Embedded video playback halted; module v4l2src3 reported: Cannot identify device '/dev/video4'.
[ WARN:0] global ../modules/videoio/src/cap_gstreamer.cpp (888) open OpenCV | GStreamer warning: unable to start pipeline
[ WARN:0] global ../modules/videoio/src/cap_gstreamer.cpp (480) isPipelinePlaying OpenCV | GStreamer warning: GStreamer: pipeline have not been created
[ WARN:0] global ../modules/videoio/src/cap_v4l.cpp (887) open VIDEOIO(V4L2:/dev/video4): can't open camera by index
Not found device4
WARNING: Since openmp is enabled in this build, this API cannot be used to configure intra op num threads. Please use the openmp environment variables to control the number of threads.
   REQUEST   
-1
"v0.6.5-alpha"
Disabling tracking shortcut
==601061== Invalid read of size 8
==601061==    at 0x14EE68: TrackerWrapper::update_distance_param(float) (in /home/matt/workspace/code/headtracking/aitrack-linux/aitrack)
==601061==    by 0x1345D6: Presenter::init_tracker(int) (in /home/matt/workspace/code/headtracking/aitrack-linux/aitrack)
==601061==    by 0x134D78: Presenter::save_prefs(ConfigData const&) (in /home/matt/workspace/code/headtracking/aitrack-linux/aitrack)
==601061==    by 0x154154: WindowMain::onSaveClick() (in /home/matt/workspace/code/headtracking/aitrack-linux/aitrack)
==601061==    by 0x70822FF: QMetaObject::activate(QObject*, int, int, void**) (in /usr/lib/x86_64-linux-gnu/libQt5Core.so.5.12.8)
==601061==    by 0x61DFBD0: ??? (in /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5.12.8)
==601061==    by 0x61E0E5E: ??? (in /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5.12.8)
==601061==    by 0x61E1034: QAbstractButton::mouseReleaseEvent(QMouseEvent*) (in /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5.12.8)
==601061==    by 0x612D2B5: QWidget::event(QEvent*) (in /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5.12.8)
==601061==    by 0x60EAA65: QApplicationPrivate::notify_helper(QObject*, QEvent*) (in /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5.12.8)
==601061==    by 0x60F4342: QApplication::notify(QObject*, QEvent*) (in /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5.12.8)
==601061==    by 0x7056939: QCoreApplication::notifyInternal2(QObject*, QEvent*) (in /usr/lib/x86_64-linux-gnu/libQt5Core.so.5.12.8)
==601061==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==601061== 
==601061== 
==601061== Process terminating with default action of signal 11 (SIGSEGV)
==601061==  Access not within mapped region at address 0x0
==601061==    at 0x14EE68: TrackerWrapper::update_distance_param(float) (in /home/matt/workspace/code/headtracking/aitrack-linux/aitrack)
==601061==    by 0x1345D6: Presenter::init_tracker(int) (in /home/matt/workspace/code/headtracking/aitrack-linux/aitrack)
==601061==    by 0x134D78: Presenter::save_prefs(ConfigData const&) (in /home/matt/workspace/code/headtracking/aitrack-linux/aitrack)
==601061==    by 0x154154: WindowMain::onSaveClick() (in /home/matt/workspace/code/headtracking/aitrack-linux/aitrack)
==601061==    by 0x70822FF: QMetaObject::activate(QObject*, int, int, void**) (in /usr/lib/x86_64-linux-gnu/libQt5Core.so.5.12.8)
==601061==    by 0x61DFBD0: ??? (in /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5.12.8)
==601061==    by 0x61E0E5E: ??? (in /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5.12.8)
==601061==    by 0x61E1034: QAbstractButton::mouseReleaseEvent(QMouseEvent*) (in /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5.12.8)
==601061==    by 0x612D2B5: QWidget::event(QEvent*) (in /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5.12.8)
==601061==    by 0x60EAA65: QApplicationPrivate::notify_helper(QObject*, QEvent*) (in /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5.12.8)
==601061==    by 0x60F4342: QApplication::notify(QObject*, QEvent*) (in /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5.12.8)
==601061==    by 0x7056939: QCoreApplication::notifyInternal2(QObject*, QEvent*) (in /usr/lib/x86_64-linux-gnu/libQt5Core.so.5.12.8)
==601061==  If you believe this happened as a result of a stack
==601061==  overflow in your program's main thread (unlikely but
==601061==  possible), you can try to increase the size of the
==601061==  main thread stack using the --main-stacksize= flag.
==601061==  The main thread stack size used in this run was 8388608.
==601061== 
==601061== HEAP SUMMARY:
==601061==     in use at exit: 16,545,377 bytes in 204,514 blocks
==601061==   total heap usage: 714,227 allocs, 509,713 frees, 74,644,611 bytes allocated
==601061== 
==601061== LEAK SUMMARY:
==601061==    definitely lost: 23,662 bytes in 29 blocks
==601061==    indirectly lost: 3,373 bytes in 137 blocks
==601061==      possibly lost: 73,680 bytes in 618 blocks
==601061==    still reachable: 16,282,654 bytes in 202,839 blocks
==601061==                       of which reachable via heuristic:
==601061==                         stdstring          : 350,118 bytes in 7,135 blocks
==601061==                         length64           : 3,720 bytes in 66 blocks
==601061==                         newarray           : 10,000 bytes in 67 blocks
==601061==                         multipleinheritance: 1,536 bytes in 1 blocks
==601061==         suppressed: 0 bytes in 0 blocks
==601061== Rerun with --leak-check=full to see details of leaked memory
==601061== 
==601061== For lists of detected and suppressed errors, rerun with: -s
==601061== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 2 from 2)
Segmentation fault (core dumped)

Unfortunately, I don't have time to dig into this any more tonight. I'd love to fix it. Maybe this weekend if you don't get to it.

mdk97 commented 3 years ago

"Access not within mapped region at address 0x0" looks like an attempt to access data through a null pointer. The error occurs in TrackerWrapper::update_distance_param, which accesses a pointer to Tracker called model and then a pointer to PositionSolver called solver. I suspect one of these two pointers is null for some reason. Can you confirm that all the models are in the /usr/share/aitrack/models directory? It should contain the following files:

detection.onnx
lm_b.onnx
lm_f.onnx
lm_m.onnx

I ask because the Tracker class constructor depends on these files existing. There is also a log at ~/.local/share/aitrack/log.txt, which may contain a clue. You could also debug the application. Are you familiar with debugging tools such as gdb?
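To illustrate the suspected failure mode, here is a minimal sketch of a guarded update. It is not the actual AITrack code: the struct and class bodies below are hypothetical stand-ins, and only the names (TrackerWrapper, Tracker, PositionSolver, model, solver, update_distance_param) come from the description above. The `distance` field and the `bool` return value are assumptions made for the example.

```cpp
#include <iostream>
#include <memory>

// Hypothetical stand-ins; only the names mirror the discussion above.
struct PositionSolver {
    float distance = 0.0f;  // assumed field, for illustration only
};

struct Tracker {
    std::unique_ptr<PositionSolver> solver = std::make_unique<PositionSolver>();
};

class TrackerWrapper {
public:
    // Stays null if the tracker could not be constructed,
    // for example because the model files could not be loaded.
    std::unique_ptr<Tracker> model;

    // Returns false instead of dereferencing a null pointer.
    bool update_distance_param(float distance) {
        if (!model || !model->solver) {
            std::cerr << "Tracker not initialized; skipping distance update\n";
            return false;
        }
        model->solver->distance = distance;
        return true;
    }
};
```

With a guard like this, calling update_distance_param on a wrapper whose model was never created would log a message and return false rather than reading address 0x0.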

mattcaron commented 3 years ago

That was the conclusion I came to in the 15 minutes I had to look at it.

All the models are in that directory, however, look at this:

(matt@bluebox) /usr/share/aitrack/models$ ls -l
total 26768
-rw------- 1 root root   568302 Sep 16 16:29 detection.onnx
-rw------- 1 root root 13500226 Sep 16 16:29 lm_b.onnx
-rw------- 1 root root  4842329 Sep 16 16:29 lm_f.onnx
-rw------- 1 root root  8494061 Sep 16 16:29 lm_m.onnx

Bet that has something to do with it...

(matt@bluebox) /usr/share/aitrack/models$ sudo chmod a+r *

and it's sorted - works great now.

It's always the foolish permissions. Sorry for the trouble.

If I have a chance, I'll try and improve this so that it complains properly rather than crashing.
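Something along these lines might be a starting point. It is only a sketch, assuming C++17 and a hypothetical helper named unreadable_models that does not exist in AITrack. It tries to open each model file for reading and reports the ones that fail, which covers both missing files and files the current user cannot read.

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: returns the names of model files that cannot be
// opened for reading, whether they are missing or lack read permission.
std::vector<std::string> unreadable_models(const fs::path& model_dir,
                                           const std::vector<std::string>& names) {
    std::vector<std::string> failed;
    for (const auto& name : names) {
        std::ifstream file(model_dir / name, std::ios::binary);
        if (!file.is_open()) {
            failed.push_back(name);
        }
    }
    return failed;
}
```

The caller could pass /usr/share/aitrack/models and the four .onnx names before creating the tracker, then show an error listing any returned names instead of continuing with an uninitialized tracker.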

For what it's worth, here's my resume: https://www.mattcaron.net/resume/

I'm not unskilled, just busy.

Thanks for your help.