luxonis / depthai-core

DepthAI C++ Library
MIT License

AddressSanitizer: heap-use-after-free #222

Open Wallbraker opened 2 years ago

Wallbraker commented 2 years ago

I'm running into an intermittent issue that ASan found. I'm pretty sure I'm not at fault here, since ASan doesn't complain about anything else. The issue seems to happen only with the RGB camera, not with the mono cameras.

This is with 2.10.0 compiled by myself.

=================================================================
==48930==ERROR: AddressSanitizer: heap-use-after-free on address 0x7fbb8369d800 at pc 0x000000466ded bp 0x7fbb86ad7a90 sp 0x7fbb86ad7250
READ of size 6220800 at 0x7fbb8369d800 thread T54
    #0 0x466dec in memmove (/home/jakob/XR/build-Monado-CMake/src/xrt/targets/gui/monado-gui+0x466dec)
    #1 0x76bd10 in unsigned char* std::__copy_move<false, true, std::random_access_iterator_tag>::__copy_m<unsigned char>(unsigned char const*, unsigned char const*, unsigned char*) (/home/jakob/XR/build-Monado-CMake/src/xrt/targets/gui/monado-gui+0x76bd10)
    #2 0x7fbc565e6c9b in unsigned char* std::__copy_move_a2<false, unsigned char*, unsigned char*>(unsigned char*, unsigned char*, unsigned char*) /usr/include/c++/10/bits/stl_algobase.h:472:30
    #3 0x7fbc565e4107 in unsigned char* std::__copy_move_a1<false, unsigned char*, unsigned char*>(unsigned char*, unsigned char*, unsigned char*) /usr/include/c++/10/bits/stl_algobase.h:506:42
    #4 0x7fbc5661b044 in unsigned char* std::__copy_move_a<false, unsigned char*, unsigned char*>(unsigned char*, unsigned char*, unsigned char*) /usr/include/c++/10/bits/stl_algobase.h:513:31
    #5 0x7fbc56616557 in unsigned char* std::copy<unsigned char*, unsigned char*>(unsigned char*, unsigned char*, unsigned char*) /usr/include/c++/10/bits/stl_algobase.h:569:7
    #6 0x7fbc5661d94c in unsigned char* std::__uninitialized_copy<true>::__uninit_copy<unsigned char*, unsigned char*>(unsigned char*, unsigned char*, unsigned char*) /usr/include/c++/10/bits/stl_uninitialized.h:109:27
    #7 0x7fbc5661b092 in unsigned char* std::uninitialized_copy<unsigned char*, unsigned char*>(unsigned char*, unsigned char*, unsigned char*) /usr/include/c++/10/bits/stl_uninitialized.h:150:15
    #8 0x7fbc56616590 in unsigned char* std::__uninitialized_copy_a<unsigned char*, unsigned char*, unsigned char>(unsigned char*, unsigned char*, unsigned char*, std::allocator<unsigned char>&) /usr/include/c++/10/bits/stl_uninitialized.h:325:37
    #9 0x7fbc566c3b15 in void std::vector<unsigned char, std::allocator<unsigned char> >::_M_range_initialize<unsigned char*>(unsigned char*, unsigned char*, std::forward_iterator_tag) /usr/include/c++/10/bits/stl_vector.h:1585:33
    #10 0x7fbc566c04db in std::vector<unsigned char, std::allocator<unsigned char> >::vector<unsigned char*, void>(unsigned char*, unsigned char*, std::allocator<unsigned char> const&) /usr/include/c++/10/bits/stl_vector.h:657:23
    #11 0x7fbc566b3f5f in dai::StreamMessageParser::parseMessageToADatatype(streamPacketDesc_t*) /home/jakob/XR/Depth/depthai-core/build/../src/pipeline/datatype/StreamMessageParser.cpp:176:72
    #12 0x7fbc565fd5b3 in dai::DataOutputQueue::DataOutputQueue(std::shared_ptr<dai::XLinkConnection> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned int, bool)::'lambda'()::operator()() /home/jakob/XR/Depth/depthai-core/build/../src/device/DataQueue.cpp:38:80
    #13 0x7fbc56601011 in void std::__invoke_impl<void, dai::DataOutputQueue::DataOutputQueue(std::shared_ptr<dai::XLinkConnection> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned int, bool)::'lambda'()>(std::__invoke_other, dai::DataOutputQueue::DataOutputQueue(std::shared_ptr<dai::XLinkConnection> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned int, bool)::'lambda'()&&) /usr/include/c++/10/bits/invoke.h:60:36
    #14 0x7fbc56600f7b in std::__invoke_result<dai::DataOutputQueue::DataOutputQueue(std::shared_ptr<dai::XLinkConnection> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned int, bool)::'lambda'()>::type std::__invoke<dai::DataOutputQueue::DataOutputQueue(std::shared_ptr<dai::XLinkConnection> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned int, bool)::'lambda'()>(dai::DataOutputQueue::DataOutputQueue(std::shared_ptr<dai::XLinkConnection> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned int, bool)::'lambda'()&&) /usr/include/c++/10/bits/invoke.h:95:40
    #15 0x7fbc56600ed5 in void std::thread::_Invoker<std::tuple<dai::DataOutputQueue::DataOutputQueue(std::shared_ptr<dai::XLinkConnection> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned int, bool)::'lambda'()> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) /usr/include/c++/10/thread:264:26
    #16 0x7fbc56600e7d in std::thread::_Invoker<std::tuple<dai::DataOutputQueue::DataOutputQueue(std::shared_ptr<dai::XLinkConnection> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned int, bool)::'lambda'()> >::operator()() /usr/include/c++/10/thread:271:20
    #17 0x7fbc56600e45 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<dai::DataOutputQueue::DataOutputQueue(std::shared_ptr<dai::XLinkConnection> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned int, bool)::'lambda'()> > >::_M_run() /usr/include/c++/10/thread:215:20
    #18 0x7fbc5178dde3  (/usr/lib/x86_64-linux-gnu/libstdc++.so.6+0xd6de3)
    #19 0x7fbc537e258f in start_thread nptl/pthread_create.c:463:8
    #20 0x7fbc51469222 in clone misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:95

0x7fbb8369d800 is located 0 bytes inside of 6220992-byte region [0x7fbb8369d800,0x7fbb83c8c4c0)
freed by thread T49 here:
    #0 0x4c896d in free (/home/jakob/XR/build-Monado-CMake/src/xrt/targets/gui/monado-gui+0x4c896d)
    #1 0x7fbc56713f0f in dispatcherCloseLink (/home/jakob/XR/install/lib/libdepthai-core.so+0x65af0f)

previously allocated by thread T50 here:
    #0 0x4c9687 in posix_memalign (/home/jakob/XR/build-Monado-CMake/src/xrt/targets/gui/monado-gui+0x4c9687)
    #1 0x7fbc567159c3 in XLinkPlatformAllocateData (/home/jakob/XR/install/lib/libdepthai-core.so+0x65c9c3)

Thread T54 created by T0 here:
    #0 0x4b399a in pthread_create (/home/jakob/XR/build-Monado-CMake/src/xrt/targets/gui/monado-gui+0x4b399a)
    #1 0x7fbc5178e0a8 in std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) (/usr/lib/x86_64-linux-gnu/libstdc++.so.6+0xd70a8)
    #2 0x7fbc565fde12 in dai::DataOutputQueue::DataOutputQueue(std::shared_ptr<dai::XLinkConnection> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned int, bool) /home/jakob/XR/Depth/depthai-core/build/../src/device/DataQueue.cpp:27:26
    #3 0x7fbc565787fd in void __gnu_cxx::new_allocator<dai::DataOutputQueue>::construct<dai::DataOutputQueue, std::shared_ptr<dai::XLinkConnection>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>(dai::DataOutputQueue*, std::shared_ptr<dai::XLinkConnection>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) /usr/include/c++/10/ext/new_allocator.h:150:4
    #4 0x7fbc565780f1 in void std::allocator_traits<std::allocator<dai::DataOutputQueue> >::construct<dai::DataOutputQueue, std::shared_ptr<dai::XLinkConnection>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>(std::allocator<dai::DataOutputQueue>&, dai::DataOutputQueue*, std::shared_ptr<dai::XLinkConnection>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) /usr/include/c++/10/bits/alloc_traits.h:512:17
    #5 0x7fbc56577358 in std::_Sp_counted_ptr_inplace<dai::DataOutputQueue, std::allocator<dai::DataOutputQueue>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<std::shared_ptr<dai::XLinkConnection>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>(std::allocator<dai::DataOutputQueue>, std::shared_ptr<dai::XLinkConnection>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) /usr/include/c++/10/bits/shared_ptr_base.h:551:39
    #6 0x7fbc56575d48 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<dai::DataOutputQueue, std::allocator<dai::DataOutputQueue>, std::shared_ptr<dai::XLinkConnection>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>(dai::DataOutputQueue*&, std::_Sp_alloc_shared_tag<std::allocator<dai::DataOutputQueue> >, std::shared_ptr<dai::XLinkConnection>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) /usr/include/c++/10/bits/shared_ptr_base.h:682:16
    #7 0x7fbc5657401f in std::__shared_ptr<dai::DataOutputQueue, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<dai::DataOutputQueue>, std::shared_ptr<dai::XLinkConnection>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>(std::_Sp_alloc_shared_tag<std::allocator<dai::DataOutputQueue> >, std::shared_ptr<dai::XLinkConnection>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) /usr/include/c++/10/bits/shared_ptr_base.h:1371:71
    #8 0x7fbc56571738 in std::shared_ptr<dai::DataOutputQueue>::shared_ptr<std::allocator<dai::DataOutputQueue>, std::shared_ptr<dai::XLinkConnection>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>(std::_Sp_alloc_shared_tag<std::allocator<dai::DataOutputQueue> >, std::shared_ptr<dai::XLinkConnection>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) /usr/include/c++/10/bits/shared_ptr.h:408:59
    #9 0x7fbc5656eea8 in std::shared_ptr<dai::DataOutputQueue> std::allocate_shared<dai::DataOutputQueue, std::allocator<dai::DataOutputQueue>, std::shared_ptr<dai::XLinkConnection>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>(std::allocator<dai::DataOutputQueue> const&, std::shared_ptr<dai::XLinkConnection>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) /usr/include/c++/10/bits/shared_ptr.h:860:39
    #10 0x7fbc5656c485 in std::shared_ptr<dai::DataOutputQueue> std::make_shared<dai::DataOutputQueue, std::shared_ptr<dai::XLinkConnection>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>(std::shared_ptr<dai::XLinkConnection>&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) /usr/include/c++/10/bits/shared_ptr.h:876:42
    #11 0x7fbc565691ff in dai::Device::startPipelineImpl(dai::Pipeline const&) /home/jakob/XR/Depth/depthai-core/build/../src/device/Device.cpp:292:71
    #12 0x7fbc5657f278 in dai::DeviceBase::startPipeline(dai::Pipeline const&) /home/jakob/XR/Depth/depthai-core/build/../src/device/DeviceBase.cpp:818:29
    #13 0x6a0bca in depthai_setup_single_pipeline(depthai_fs*, depthai_camera_type) /home/jakob/XR/Monado/src/xrt/drivers/depthai/depthai_driver.cpp:396:19
    #14 0x6a0bca in depthai_fs_single_rgb /home/jakob/XR/Monado/src/xrt/drivers/depthai/depthai_driver.cpp:621:2
    #15 0x50ec49 in create_depthai /home/jakob/XR/Monado/src/xrt/state_trackers/gui/gui_scene_record.c:175:19
    #16 0x50ec49 in gui_scene_record /home/jakob/XR/Monado/src/xrt/state_trackers/gui/gui_scene_record.c:400:3
    #17 0x4fbe20 in main /home/jakob/XR/Monado/src/xrt/targets/gui/gui_sdl2_main.c:49:3
    #18 0x7fbc51378cb1 in __libc_start_main csu/../csu/libc-start.c:314:16

Thread T49 created by T0 here:
    #0 0x4b399a in pthread_create (/home/jakob/XR/build-Monado-CMake/src/xrt/targets/gui/monado-gui+0x4b399a)
    #1 0x7fbc56710797 in DispatcherStart (/home/jakob/XR/install/lib/libdepthai-core.so+0x657797)

Thread T50 created by T49 here:
    #0 0x4b399a in pthread_create (/home/jakob/XR/build-Monado-CMake/src/xrt/targets/gui/monado-gui+0x4b399a)
    #1 0x7fbc56712583 in eventSchedulerRun (/home/jakob/XR/install/lib/libdepthai-core.so+0x659583)
    #2 0x7fbc537e258f in start_thread nptl/pthread_create.c:463:8

SUMMARY: AddressSanitizer: heap-use-after-free (/home/jakob/XR/build-Monado-CMake/src/xrt/targets/gui/monado-gui+0x466dec) in memmove
Shadow bytes around the buggy address:
  0x0ff7f06cbab0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ff7f06cbac0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ff7f06cbad0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ff7f06cbae0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ff7f06cbaf0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0ff7f06cbb00:[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0ff7f06cbb10: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0ff7f06cbb20: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0ff7f06cbb30: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0ff7f06cbb40: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0ff7f06cbb50: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==48930==ABORTING
themarpe commented 2 years ago

@Wallbraker thanks for the report. Does this issue happen when the application exits, or even before reading the first message? By the looks of it, the message-reading part is okay: it frees the data only after it has been copied into the message.

In this case, however, it seems that XLink gets a disconnect request, which frees the incoming data (see "freed by thread T49 here:" in the trace).

How do you manage the returned Device object? It encapsulates the connection to the device and must be kept alive for the duration of the communication between the host & device.

Wallbraker commented 2 years ago

@themarpe Thank you for your reply.

I looked into it and while we use a thread to get the images, we stop the thread before deleting the device.

If I add an explicit close call on the queue just before deleting the device, the problem goes away. I would like to point out that none of your examples call close on the queue, so I did not expect this to be required.
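
The shutdown order described above (stop the reader, then tear down whatever owns the buffers) can be sketched in plain C++. This is only a model of the ordering, not depthai-core code: `Link` and `ReaderQueue` are hypothetical stand-ins invented for illustration, and the real library may behave differently internally.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical stand-in for a connection that owns the incoming frame buffer.
class Link {
public:
    Link() : frame_(new std::vector<std::uint8_t>(64, 7)) {}
    // Releases the buffer; any reader still using it would hit a use-after-free.
    void close() { frame_.reset(); }
    const std::vector<std::uint8_t>* frame() const { return frame_.get(); }

private:
    std::unique_ptr<std::vector<std::uint8_t>> frame_;
};

// Hypothetical stand-in for a queue whose background thread copies frames.
class ReaderQueue {
public:
    explicit ReaderQueue(Link& link) : link_(link), running_(true), thread_([this] { loop(); }) {}
    ~ReaderQueue() { close(); }

    // Stops the reader and waits for it; safe to call more than once.
    void close() {
        running_.store(false);
        if (thread_.joinable()) thread_.join();
    }

    std::size_t bytesCopied() const { return copied_.load(); }

private:
    void loop() {
        // do-while so that at least one frame is copied before the stop flag is checked.
        do {
            const std::vector<std::uint8_t>* frame = link_.frame();
            std::vector<std::uint8_t> copy(frame->begin(), frame->end());
            copied_ += copy.size();
            std::this_thread::yield();
        } while (running_.load());
    }

    Link& link_;
    std::atomic<bool> running_;
    std::atomic<std::size_t> copied_{0};
    std::thread thread_;  // declared last so the other members exist before the thread starts
};

// Runs the reader briefly, then shuts down in the safe order.
std::size_t runAndShutDown() {
    Link link;
    ReaderQueue queue(link);
    queue.close();  // 1. stop and join the reader thread
    link.close();   // 2. only now release the buffer it was reading
    return queue.bytesCopied();
}
```

Swapping the two calls at the end of `runAndShutDown()` would let `loop()` dereference a released buffer, which is the same shape of failure as the trace in the OP.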

themarpe commented 2 years ago

@Wallbraker sorry for going around here - I think that this actually relates to a use case "bug" which was fixed in the latest develop. Can you retest without an explicit close on the latest develop branch? (We'll be releasing 2.11.0 soon as well.)

diablodale commented 2 years ago

Hello. I've isolated scenarios where parseMessageToADatatype() definitely, and parseMessage() possibly, are given a non-null streamPacketDesc_t* packet whose struct is {null, 0}. This causes readIntLE() to return nonsense values (in my case it always returns 171), which then incorrectly pass the "bad packet" test. The code then proceeds to index into unallocated memory and to do other things, such as trying to initialize a std::vector from a nullptr.

The fix is to validate parameters. At the top of both of those parse functions, put:

if (!packet->data || !packet->length) {
    throw std::runtime_error("Bad packet, couldn't parse");
}

I've isolated and reproduced this in MSVC, and the stack in the OP matches it as well.
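
A self-contained sketch of that check, for anyone who wants to try it outside the library. `PacketDesc` is a hypothetical stand-in for streamPacketDesc_t that models only the two fields mentioned above, and `requireUsablePacket` and `copyPayload` are names made up for this example. The extra null test on the descriptor pointer itself is an addition beyond the check proposed above.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for streamPacketDesc_t; only data and length are modelled.
struct PacketDesc {
    std::uint8_t* data;
    std::uint32_t length;
};

// Rejects a null descriptor, a null payload, or a zero-length payload
// before any byte of the packet is read.
void requireUsablePacket(const PacketDesc* packet) {
    if (packet == nullptr || packet->data == nullptr || packet->length == 0) {
        throw std::runtime_error("Bad packet, couldn't parse");
    }
}

// Copies the payload into a vector only after validation has passed,
// so the vector is never constructed from a null pointer.
std::vector<std::uint8_t> copyPayload(const PacketDesc* packet) {
    requireUsablePacket(packet);
    return std::vector<std::uint8_t>(packet->data, packet->data + packet->length);
}
```

With this in place, a `{nullptr, 0}` descriptor produces an exception at the top of the function instead of a read from memory the packet does not own.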

diablodale commented 2 years ago

@Wallbraker there are bugs in depthai regarding threads and ownership. I've found the bugs with Connection and you are right to question Device. https://github.com/luxonis/depthai-core/issues/257

Wallbraker commented 2 years ago

@themarpe So far I have not gotten this error with v2.13.3 after removing the explicit queue->close(); call. I will do some more experiments to see if I can catch it, and close this issue if not.

Wallbraker commented 2 years ago

Spoke too soon, just got one. Basically the same backtraces.

Wallbraker commented 2 years ago

I'm running the camera at 118 FPS, if that might help you trigger it.

diablodale commented 2 years ago

I've already found the cause of this. It is issue #257

The failure that ASan lists in the OP

#11 0x7fbc566b3f5f in dai::StreamMessageParser::parseMessageToADatatype

is a direct result of the bugs in issue #257. There is no fix other than to resolve that. You can work around the problems using the branch I list in the issue, but it will only work on Windows because Linux doesn't have a way to catch SEGFAULTs while also maintaining C++ stack unwinds.

Wallbraker commented 2 years ago

Ah, thank you for the information. I'm currently working around it by closing the queue.