stereolabs / zed-ros-wrapper

ROS wrapper for the ZED SDK
https://www.stereolabs.com/docs/ros/
MIT License

Segmentation fault when writing my own zed interfaced code #109

Closed gochaudhari closed 6 years ago

gochaudhari commented 7 years ago

Hi,

This issue is not strictly about zed-ros-wrapper, but it is closely related. I use the ZED SDK in another ROS catkin workspace, and when I initialize the ZED camera with the init function, I get a segmentation fault while the GPU memory is being allocated. I captured the error with valgrind; the dump below shows the issue in more detail.

ubuntu@tegra-ubuntu:~/catkin_ws$ rosrun --prefix 'valgrind --track-origins=yes' opencv_project opencv_project_node /home/ubuntu/catkin_ws/src/opencv_project/images/face.jpg kernel_window
==13012== Memcheck, a memory error detector
==13012== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==13012== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==13012== Command: /home/ubuntu/catkin_ws/devel/lib/opencv_project/opencv_project_node /home/ubuntu/catkin_ws/src/opencv_project/images/face.jpg kernel_window
==13012==
--13012-- WARNING: unhandled arm64-linux syscall: 125
--13012-- You may be able to write your own handler.
--13012-- Read the file README_MISSING_SYSCALL_OR_IOCTL.
--13012-- Nevertheless we consider this a bug. Please report
--13012-- it at http://valgrind.org/support/bug_reports.html.
--13012-- WARNING: unhandled arm64-linux syscall: 126
--13012-- You may be able to write your own handler.
--13012-- Read the file README_MISSING_SYSCALL_OR_IOCTL.
--13012-- Nevertheless we consider this a bug. Please report
--13012-- it at http://valgrind.org/support/bug_reports.html.
==13012== Use of uninitialised value of size 8
==13012==    at 0x549FFD8: ??? (in /usr/lib/aarch64-linux-gnu/libjpeg.so.8.0.2)
==13012==  Uninitialised value was created by a stack allocation
==13012==    at 0x549FFA4: ??? (in /usr/lib/aarch64-linux-gnu/libjpeg.so.8.0.2)
==13012==
Done Init Vars
==13012== Warning: set address range perms: large range [0x100000000, 0x17cd80000) (noaccess)
==13012== Warning: set address range perms: large range [0x5c0000000, 0x63cd80000) (noaccess)
Camera Init Done.
ZED SDK >> (Init) Best GPU Found : NVIDIA Tegra X1 , ID : 0
ZED SDK >> (Init) Disparity mode has been set to PERFORMANCE
ZED SDK >> (Init) Creating ZED GPU mem...
==13012== Use of uninitialised value of size 8
==13012==    at 0x712CA50: cv::stereoRectify(cv::_InputArray const&, cv::_InputArray const&, cv::_InputArray const&, cv::_InputArray const&, cv::Size_<int>, cv::_InputArray const&, cv::_InputArray const&, cv::_OutputArray const&, cv::_OutputArray const&, cv::_OutputArray const&, cv::_OutputArray const&, cv::_OutputArray const&, int, double, cv::Size_<int>, cv::Rect_<int>*, cv::Rect_<int>*) (in /usr/lib/libopencv_calib3d.so.2.4.13)
==13012==    by 0x4F2E01F: sl::zed::Analyser::CreateMxMyWithOriginalValues() (in /usr/local/zed/lib/libsl_zed.so)
==13012==    by 0x4F2CAAB: sl::zed::Rectifier_gpu::Rectifier_gpu(NppiSize, NppiSize, sl::zed::StereoParameters&, int, bool, bool) (in /usr/local/zed/lib/libsl_zed.so)
==13012==    by 0x4F3B4BF: sl::zed::Camera::init(sl::zed::InitParams&) (in /usr/local/zed/lib/libsl_zed.so)
==13012==    by 0x405683: main (opencv_project_node.cpp:101)
==13012==  Uninitialised value was created by a stack allocation
==13012==    at 0x4F2DC14: sl::zed::Analyser::CreateMxMyWithOriginalValues() (in /usr/local/zed/lib/libsl_zed.so)
==13012==
==13012== Invalid read of size 8
==13012==    at 0x712CA50: cv::stereoRectify(cv::_InputArray const&, cv::_InputArray const&, cv::_InputArray const&, cv::_InputArray const&, cv::Size_<int>, cv::_InputArray const&, cv::_InputArray const&, cv::_OutputArray const&, cv::_OutputArray const&, cv::_OutputArray const&, cv::_OutputArray const&, cv::_OutputArray const&, int, double, cv::Size_<int>, cv::Rect_<int>*, cv::Rect_<int>*) (in /usr/lib/libopencv_calib3d.so.2.4.13)
==13012==    by 0x4F2E01F: sl::zed::Analyser::CreateMxMyWithOriginalValues() (in /usr/local/zed/lib/libsl_zed.so)
==13012==    by 0x4F2CAAB: sl::zed::Rectifier_gpu::Rectifier_gpu(NppiSize, NppiSize, sl::zed::StereoParameters&, int, bool, bool) (in /usr/local/zed/lib/libsl_zed.so)
==13012==    by 0x4F3B4BF: sl::zed::Camera::init(sl::zed::InitParams&) (in /usr/local/zed/lib/libsl_zed.so)
==13012==    by 0x405683: main (opencv_project_node.cpp:101)
==13012==  Address 0x1010000 is not stack'd, malloc'd or (recently) free'd
==13012==
==13012==
==13012== Process terminating with default action of signal 11 (SIGSEGV)
==13012==  Access not within mapped region at address 0x1010000
==13012==    at 0x712CA50: cv::stereoRectify(cv::_InputArray const&, cv::_InputArray const&, cv::_InputArray const&, cv::_InputArray const&, cv::Size_<int>, cv::_InputArray const&, cv::_InputArray const&, cv::_OutputArray const&, cv::_OutputArray const&, cv::_OutputArray const&, cv::_OutputArray const&, cv::_OutputArray const&, int, double, cv::Size_<int>, cv::Rect_<int>*, cv::Rect_<int>*) (in /usr/lib/libopencv_calib3d.so.2.4.13)
==13012==    by 0x4F2E01F: sl::zed::Analyser::CreateMxMyWithOriginalValues() (in /usr/local/zed/lib/libsl_zed.so)
==13012==    by 0x4F2CAAB: sl::zed::Rectifier_gpu::Rectifier_gpu(NppiSize, NppiSize, sl::zed::StereoParameters&, int, bool, bool) (in /usr/local/zed/lib/libsl_zed.so)
==13012==    by 0x4F3B4BF: sl::zed::Camera::init(sl::zed::InitParams&) (in /usr/local/zed/lib/libsl_zed.so)
==13012==    by 0x405683: main (opencv_project_node.cpp:101)
==13012==  If you believe this happened as a result of a stack
==13012==  overflow in your program's main thread (unlikely but
==13012==  possible), you can try to increase the size of the
==13012==  main thread stack using the --main-stacksize= flag.
==13012==  The main thread stack size used in this run was 8388608.
==13012==
==13012== HEAP SUMMARY:
==13012==     in use at exit: 230,803,397 bytes in 293,641 blocks
==13012==   total heap usage: 362,035 allocs, 68,394 frees, 1,300,715,308 bytes allocated
==13012==
==13012== LEAK SUMMARY:
==13012==    definitely lost: 3,040 bytes in 148 blocks
==13012==    indirectly lost: 0 bytes in 0 blocks
==13012==      possibly lost: 69,730,827 bytes in 29,920 blocks
==13012==    still reachable: 161,069,530 bytes in 263,573 blocks
==13012==                       of which reachable via heuristic:
==13012==                         newarray           : 1,536 bytes in 16 blocks
==13012==         suppressed: 0 bytes in 0 blocks
==13012== Rerun with --leak-check=full to see details of leaked memory
==13012==
==13012== For counts of detected and suppressed errors, rerun with: -v
==13012== ERROR SUMMARY: 514 errors from 3 contexts (suppressed: 0 from 0)
Killed

jinsooihm commented 7 years ago

I get the same error when zed->init() is called. This is on a Jetson TX1, and I think the issue occurs when the package is built with the image_geometry catkin package as a dependency, even when it is not included in any .h or .cpp file. If I remove image_geometry from CMakeLists.txt and package.xml, the ZED initializes with no problem and everything else works.

I also tried adding image_geometry as a dependency of zed-ros-wrapper without including it in any .cpp file, and the same error occurs.

dmesg gives this output:
[17549.801104] tegradc tegradc.1: blank - powerdown
[17581.835559] Unhandled fault: alignment fault (0x92000021) at 0x00000000eeeeeeee

And valgrind gives this output:
==29877== Memcheck, a memory error detector
==29877== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==29877== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==29877== Command: /home/ubuntu/catkin_ws/devel/lib/zed_to_laserscan/example
==29877==
--29877-- WARNING: unhandled arm64-linux syscall: 125
--29877-- You may be able to write your own handler.
--29877-- Read the file README_MISSING_SYSCALL_OR_IOCTL.
--29877-- Nevertheless we consider this a bug. Please report
--29877-- it at http://valgrind.org/support/bug_reports.html.
--29877-- WARNING: unhandled arm64-linux syscall: 126
--29877-- You may be able to write your own handler.
--29877-- Read the file README_MISSING_SYSCALL_OR_IOCTL.
--29877-- Nevertheless we consider this a bug. Please report
--29877-- it at http://valgrind.org/support/bug_reports.html.
==29877== Warning: set address range perms: large range [0x100000000, 0x17cd80000) (noaccess)
==29877== Warning: set address range perms: large range [0x5c0000000, 0x63cd80000) (noaccess)
ZED SDK >> (Init) Best GPU Found : NVIDIA Tegra X1 , ID : 0
ZED SDK >> (Init) Disparity mode has been set to PERFORMANCE
ZED SDK >> (Init) Creating ZED GPU mem...
==29877== Conditional jump or move depends on uninitialised value(s)
==29877==    at 0xC41AE78: cv::initUndistortRectifyMap(cv::_InputArray const&, cv::_InputArray const&, cv::_InputArray const&, cv::_InputArray const&, cv::Size_<int>, int, cv::_OutputArray const&, cv::_OutputArray const&) (in /usr/lib/aarch64-linux-gnu/libopencv_imgproc.so.2.4.9)
==29877==    by 0x4CA720B: sl::zed::Analyser::CreateMxMyWithOriginalValues() (in /usr/local/zed/lib/libsl_zed.so)
==29877==    by 0x4CA5AAB: sl::zed::Rectifier_gpu::Rectifier_gpu(NppiSize, NppiSize, sl::zed::StereoParameters&, int, bool, bool) (in /usr/local/zed/lib/libsl_zed.so)
==29877==    by 0x4CB44BF: sl::zed::Camera::init(sl::zed::InitParams&) (in /usr/local/zed/lib/libsl_zed.so)
==29877==    by 0x405047: main (example.cpp:35)
==29877==  Uninitialised value was created by a heap allocation
==29877==    at 0x4844B88: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
==29877==
==29877== Invalid read of size 4
==29877==    at 0xC41AFD0: cv::initUndistortRectifyMap(cv::_InputArray const&, cv::_InputArray const&, cv::_InputArray const&, cv::_InputArray const&, cv::Size_<int>, int, cv::_OutputArray const&, cv::_OutputArray const&) (in /usr/lib/aarch64-linux-gnu/libopencv_imgproc.so.2.4.9)
==29877==    by 0x4CA720B: sl::zed::Analyser::CreateMxMyWithOriginalValues() (in /usr/local/zed/lib/libsl_zed.so)
==29877==    by 0x4CA5AAB: sl::zed::Rectifier_gpu::Rectifier_gpu(NppiSize, NppiSize, sl::zed::StereoParameters&, int, bool, bool) (in /usr/local/zed/lib/libsl_zed.so)
==29877==    by 0x4CB44BF: sl::zed::Camera::init(sl::zed::InitParams&) (in /usr/local/zed/lib/libsl_zed.so)
==29877==    by 0x405047: main (example.cpp:35)
==29877==  Address 0xeeeeeeee is not stack'd, malloc'd or (recently) free'd
==29877==
==29877==
==29877== Process terminating with default action of signal 7 (SIGBUS)
==29877==  Invalid address alignment at address 0xEEEEEEEE
==29877==    at 0xC41AFD0: cv::initUndistortRectifyMap(cv::_InputArray const&, cv::_InputArray const&, cv::_InputArray const&, cv::_InputArray const&, cv::Size_<int>, int, cv::_OutputArray const&, cv::_OutputArray const&) (in /usr/lib/aarch64-linux-gnu/libopencv_imgproc.so.2.4.9)
==29877==    by 0x4CA720B: sl::zed::Analyser::CreateMxMyWithOriginalValues() (in /usr/local/zed/lib/libsl_zed.so)
==29877==    by 0x4CA5AAB: sl::zed::Rectifier_gpu::Rectifier_gpu(NppiSize, NppiSize, sl::zed::StereoParameters&, int, bool, bool) (in /usr/local/zed/lib/libsl_zed.so)
==29877==    by 0x4CB44BF: sl::zed::Camera::init(sl::zed::InitParams&) (in /usr/local/zed/lib/libsl_zed.so)
==29877==    by 0x405047: main (example.cpp:35)
==29877==
==29877== HEAP SUMMARY:
==29877==     in use at exit: 277,938,596 bytes in 290,979 blocks
==29877==   total heap usage: 361,016 allocs, 70,037 frees, 1,935,064,789 bytes allocated
==29877==
==29877== LEAK SUMMARY:
==29877==    definitely lost: 3,040 bytes in 148 blocks
==29877==    indirectly lost: 0 bytes in 0 blocks
==29877==      possibly lost: 82,831,641 bytes in 29,716 blocks
==29877==    still reachable: 195,103,915 bytes in 261,115 blocks
==29877==                       of which reachable via heuristic:
==29877==                         newarray           : 1,536 bytes in 16 blocks
==29877==         suppressed: 0 bytes in 0 blocks
==29877== Rerun with --leak-check=full to see details of leaked memory
==29877==
==29877== For counts of detected and suppressed errors, rerun with: -v
==29877== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
Killed

gochaudhari commented 7 years ago

Hi @jinsoo960. I solved this problem a different way. My valgrind dump points to a mismatch in the definition of one of the OpenCV functions.

I was using opencv and cv_bridge as dependencies. When I removed cv_bridge as a dependency, all the errors were resolved, because the call made somewhere inside the init function then resolved to the function in the opencv dependency.

Maybe one of your functions depends on image_geometry.
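One way to check for this kind of mismatch is to look at which OpenCV shared libraries the node actually links against, for example by feeding it the lines printed by `ldd` on the built executable. The sketch below is only an illustration, not part of the ZED SDK or ROS: the helper names `opencvMajorVersion` and `opencvMajorVersions` are made up here, and it assumes library file names of the form `libopencv_<module>.so.<major>.<minor>`, as seen in the valgrind traces above.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: returns the OpenCV major version named in one line of
// `ldd` output (e.g. "2" for "libopencv_calib3d.so.2.4"), or an empty string
// if the line does not mention an OpenCV library.
std::string opencvMajorVersion(const std::string& line) {
    const std::string prefix = "libopencv_";
    const std::string marker = ".so.";
    const std::size_t lib = line.find(prefix);
    if (lib == std::string::npos) return "";
    const std::size_t so = line.find(marker, lib);
    if (so == std::string::npos) return "";
    const std::size_t begin = so + marker.size();
    std::size_t end = begin;
    // Read digits up to the first '.' that separates major from minor.
    while (end < line.size() &&
           std::isdigit(static_cast<unsigned char>(line[end]))) {
        ++end;
    }
    return line.substr(begin, end - begin);
}

// Hypothetical helper: collects every distinct OpenCV major version found in
// the given lines. More than one entry suggests that two OpenCV builds end up
// in the same process.
std::set<std::string> opencvMajorVersions(const std::vector<std::string>& lines) {
    std::set<std::string> majors;
    for (const auto& line : lines) {
        const std::string major = opencvMajorVersion(line);
        if (!major.empty()) majors.insert(major);
    }
    return majors;
}
```

If `opencvMajorVersions` returns more than one entry for a node, two OpenCV builds are loaded side by side, which would be consistent with a crash inside an OpenCV call made from `sl::zed::Camera::init`.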

jinsooihm commented 7 years ago

Hello @gochaudhari, thank you for your reply. I found out that the issue was that cv_bridge and image_geometry both declare opencv3 as their dependency, while the ZED SDK on the TX1 uses opencv2. So I downloaded the source code of image_geometry from its repository (which is actually the same repository as cv_bridge), changed the dependency from opencv3 to opencv in the package.xml file, and now it works.
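To find which of a package's direct dependencies pulls in opencv3, the declared dependencies can be walked as a graph. The following is a minimal sketch under stated assumptions: the `DependencyGraph` type and the functions `dependsOn` and `culprits` are hypothetical names, and the graph contents would have to be filled in by hand or from a tool such as `rospack depends`.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical representation: package name -> packages it declares as dependencies.
using DependencyGraph = std::map<std::string, std::vector<std::string>>;

// Depth-first search: true if `target` is `package` itself or is reachable
// from it through declared dependencies. `visited` guards against cycles.
bool dependsOn(const DependencyGraph& graph, const std::string& package,
               const std::string& target, std::set<std::string>& visited) {
    if (package == target) return true;
    if (!visited.insert(package).second) return false;  // already explored
    const auto it = graph.find(package);
    if (it == graph.end()) return false;  // no declared dependencies known
    for (const auto& dep : it->second) {
        if (dependsOn(graph, dep, target, visited)) return true;
    }
    return false;
}

// Lists the direct dependencies of `package` that lead to `target`
// (a direct dependency equal to `target` is reported as well).
std::vector<std::string> culprits(const DependencyGraph& graph,
                                  const std::string& package,
                                  const std::string& target) {
    std::vector<std::string> result;
    const auto it = graph.find(package);
    if (it == graph.end()) return result;
    for (const auto& dep : it->second) {
        std::set<std::string> visited;
        if (dependsOn(graph, dep, target, visited)) result.push_back(dep);
    }
    return result;
}
```

With a graph in which a node depends on roscpp, image_geometry and opencv, and image_geometry depends on opencv3, `culprits(graph, node, "opencv3")` returns only image_geometry, which is the package whose package.xml was changed above.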