AravisProject / aravis

A vision library for genicam based cameras
GNU Lesser General Public License v2.1

Device enumeration hangs #905

Closed Bidski closed 2 months ago

Bidski commented 5 months ago

Under certain networking conditions a call to either arv_get_n_devices or arv_get_device_serial_nbr may hang indefinitely. I am not sure which one is actually hanging, but I suspect it is arv_get_n_devices.

I have 5 cameras in my system, all of them GigEVision cameras. My rough setup is:

                                     /-> 1Gbps link -> GigEVision camera
                                    /-> 1Gbps link -> GigEVision camera
PC -> 10Gbps SFP link -> PoE Switch -> 1Gbps link -> GigEVision camera
                                    \-> 1Gbps link -> GigEVision camera
                                     \-> 1Gbps link -> GigEVision camera

The exact conditions and timing that cause this are still unclear to me, as I am still debugging it. The root cause may be a hardware issue with either the switch or the cables, but we feel that our software should be resilient to these sorts of hardware faults.

Is there any sort of timeout mechanism currently built into either of these functions?
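
For illustration only, here is a minimal sketch of an application-side timeout around a blocking call, assuming the library call itself cannot be interrupted. The helper name `call_with_timeout` is hypothetical and is not part of Aravis. In the application, the callable would wrap something like `arv_get_n_devices()`. This bounds how long the caller waits. It does not cancel the call, so a hung worker thread stays hung in the background, and any lock it holds stays held.

```cpp
#include <chrono>
#include <exception>
#include <future>
#include <memory>
#include <optional>
#include <thread>
#include <utility>

// Hypothetical helper (not an Aravis API): runs `blocking_call` on a detached
// worker thread and waits at most `timeout` for its result.
// Returns std::nullopt if the call has not finished in time.
// The callable must return a non-void value.
template <typename Fn>
auto call_with_timeout(Fn blocking_call, std::chrono::milliseconds timeout)
    -> std::optional<decltype(blocking_call())> {
    using Result = decltype(blocking_call());

    // The promise is shared so that it outlives this function if we time out.
    auto promise = std::make_shared<std::promise<Result>>();
    std::future<Result> future = promise->get_future();

    std::thread([promise, call = std::move(blocking_call)]() mutable {
        try {
            promise->set_value(call());
        } catch (...) {
            promise->set_exception(std::current_exception());
        }
    }).detach();

    if (future.wait_for(timeout) == std::future_status::ready) {
        return future.get();  // rethrows if the call threw
    }
    return std::nullopt;  // the worker keeps running; it is not cancelled
}
```

A caller could then treat `std::nullopt` as "enumeration did not complete" and report a fault instead of blocking its own thread indefinitely.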

Platform description:

EmmanuelP commented 5 months ago

Hi,

The discovery mechanism should not take longer than 1 second. When it hangs, please try to attach gdb to the process and get a backtrace:

Bidski commented 5 months ago

The threads have been sitting in this state for approximately 15 hours

#0  0x00007f9cb7b0b30d in syscall () from /usr/lib/libc.so.6
No symbol table info available.
#1  0x00007f9cbde42c4c in ?? () from /usr/local/lib/libglib-2.0.so.0
No symbol table info available.
#2  0x00007f9cbe0dc386 in arv_get_n_devices () from /usr/local/lib/libaravis-0.8.so.0
No symbol table info available.
#3  0x00007f9cbf27c5df in module::input::configure_camera(extension::Configuration const&, module::input::Camera&)::{lambda(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)#1}::operator()(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const [clone .isra.0] () from lib/libinputCamera.so
No symbol table info available.
#4  0x00007f9cbf27cdc9 in module::input::configure_camera(extension::Configuration const&, module::input::Camera&) () from lib/libinputCamera.so
No symbol table info available.
#5  0x00007f9cbf28006f in module::input::reset_camera(module::input::CameraContext&) () from lib/libinputCamera.so
No symbol table info available.
#6  0x00007f9cbf28053a in std::_Function_handler<void (NUClear::threading::Task<NUClear::threading::Reaction>&), NUClear::util::CallbackGenerator<NUClear::dsl::Parse<NUClear::dsl::word::Watchdog<module::input::Camera, 1, std::chrono::duration<long, std::ratio<1l, 1l> > >, NUClear::dsl::word::Single>, module::input::Camera::Parse(std::unique_ptr<NUClear::Environment, std::default_delete<NUClear::Environment> >)::{lambda(extension::Configuration const&)#2}::operator()(extension::Configuration const&) const::{lambda()#1}>::operator()(NUClear::threading::Reaction&)::{lambda(NUClear::threading::Task<NUClear::threading::Reaction>&)#1}>::_M_invoke(std::_Any_data const&, NUClear::threading::Task<NUClear::threading::Reaction>&) () from lib/libinputCamera.so
No symbol table info available.
#7  0x00007f9cbfb4758f in std::_Function_handler<void (), NUClear::PowerPlant::submit(std::unique_ptr<NUClear::threading::Task<NUClear::threading::Reaction>, std::default_delete<NUClear::threading::Task<NUClear::threading::Reaction> > >&&, bool const&)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from lib/libsupportSignalCatcher.so
No symbol table info available.
#8  0x00007f9cbfb4c69c in NUClear::threading::TaskScheduler::run_task(NUClear::threading::TaskScheduler::Task&&) () from lib/libsupportSignalCatcher.so
No symbol table info available.
#9  0x00007f9cbfb4eefb in NUClear::threading::TaskScheduler::pool_func(std::shared_ptr<NUClear::threading::TaskScheduler::PoolQueue>) () from lib/libsupportSignalCatcher.so
No symbol table info available.
#10 0x00007f9cbfb4f4f5 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (NUClear::threading::TaskScheduler::*)(std::shared_ptr<NUClear::threading::TaskScheduler::PoolQueue>), NUClear::threading::TaskScheduler*, std::shared_ptr<NUClear::threading::TaskScheduler::PoolQueue> > > >::_M_run() () from lib/libsupportSignalCatcher.so
No symbol table info available.
#11 0x00007f9cb7ed6183 in std::execute_native_thread_routine (__p=0x55e1fce66b60) at /usr/src/debug/gcc/libstdc++-v3/src/c++11/thread.cc:82
        __t = <optimized out>
#12 0x00007f9cb7a8c54d in ?? () from /usr/lib/libc.so.6
No symbol table info available.
#13 0x00007f9cb7b11874 in clone () from /usr/lib/libc.so.6
No symbol table info available.

Thread 9 (Thread 0x7f9c819f6640 (LWP 3941) "data_recording"):
#0  0x00007f9cb7b0b30d in syscall () from /usr/lib/libc.so.6
No symbol table info available.
#1  0x00007f9cbde42c4c in ?? () from /usr/local/lib/libglib-2.0.so.0
No symbol table info available.
#2  0x00007f9cbe0dc386 in arv_get_n_devices () from /usr/local/lib/libaravis-0.8.so.0
No symbol table info available.
#3  0x00007f9cbf27c5df in module::input::configure_camera(extension::Configuration const&, module::input::Camera&)::{lambda(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)#1}::operator()(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const [clone .isra.0] () from lib/libinputCamera.so
No symbol table info available.
#4  0x00007f9cbf27cdc9 in module::input::configure_camera(extension::Configuration const&, module::input::Camera&) () from lib/libinputCamera.so
No symbol table info available.
#5  0x00007f9cbf28006f in module::input::reset_camera(module::input::CameraContext&) () from lib/libinputCamera.so
No symbol table info available.
#6  0x00007f9cbf28053a in std::_Function_handler<void (NUClear::threading::Task<NUClear::threading::Reaction>&), NUClear::util::CallbackGenerator<NUClear::dsl::Parse<NUClear::dsl::word::Watchdog<module::input::Camera, 1, std::chrono::duration<long, std::ratio<1l, 1l> > >, NUClear::dsl::word::Single>, module::input::Camera::Parse(std::unique_ptr<NUClear::Environment, std::default_delete<NUClear::Environment> >)::{lambda(extension::Configuration const&)#2}::operator()(extension::Configuration const&) const::{lambda()#1}>::operator()(NUClear::threading::Reaction&)::{lambda(NUClear::threading::Task<NUClear::threading::Reaction>&)#1}>::_M_invoke(std::_Any_data const&, NUClear::threading::Task<NUClear::threading::Reaction>&) () from lib/libinputCamera.so
No symbol table info available.
#7  0x00007f9cbfb4758f in std::_Function_handler<void (), NUClear::PowerPlant::submit(std::unique_ptr<NUClear::threading::Task<NUClear::threading::Reaction>, std::default_delete<NUClear::threading::Task<NUClear::threading::Reaction> > >&&, bool const&)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from lib/libsupportSignalCatcher.so
No symbol table info available.
#8  0x00007f9cbfb4c69c in NUClear::threading::TaskScheduler::run_task(NUClear::threading::TaskScheduler::Task&&) () from lib/libsupportSignalCatcher.so
No symbol table info available.
#9  0x00007f9cbfb4eefb in NUClear::threading::TaskScheduler::pool_func(std::shared_ptr<NUClear::threading::TaskScheduler::PoolQueue>) () from lib/libsupportSignalCatcher.so
No symbol table info available.
#10 0x00007f9cbfb4f4f5 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (NUClear::threading::TaskScheduler::*)(std::shared_ptr<NUClear::threading::TaskScheduler::PoolQueue>), NUClear::threading::TaskScheduler*, std::shared_ptr<NUClear::threading::TaskScheduler::PoolQueue> > > >::_M_run() () from lib/libsupportSignalCatcher.so
No symbol table info available.
#11 0x00007f9cb7ed6183 in std::execute_native_thread_routine (__p=0x55e1fce662a0) at /usr/src/debug/gcc/libstdc++-v3/src/c++11/thread.cc:82
        __t = <optimized out>
#12 0x00007f9cb7a8c54d in ?? () from /usr/lib/libc.so.6
No symbol table info available.
#13 0x00007f9cb7b11874 in clone () from /usr/lib/libc.so.6
No symbol table info available.

Thread 8 (Thread 0x7f9c821f7640 (LWP 3940) "data_recording"):
#0  0x00007f9cb7b0b30d in syscall () from /usr/lib/libc.so.6
No symbol table info available.
#1  0x00007f9cbde42c4c in ?? () from /usr/local/lib/libglib-2.0.so.0
No symbol table info available.
#2  0x00007f9cbe0dc386 in arv_get_n_devices () from /usr/local/lib/libaravis-0.8.so.0
No symbol table info available.
#3  0x00007f9cbf27c5df in module::input::configure_camera(extension::Configuration const&, module::input::Camera&)::{lambda(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)#1}::operator()(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const [clone .isra.0] () from lib/libinputCamera.so
No symbol table info available.
#4  0x00007f9cbf27cdc9 in module::input::configure_camera(extension::Configuration const&, module::input::Camera&) () from lib/libinputCamera.so
No symbol table info available.
#5  0x00007f9cbf28006f in module::input::reset_camera(module::input::CameraContext&) () from lib/libinputCamera.so
No symbol table info available.
#6  0x00007f9cbf28053a in std::_Function_handler<void (NUClear::threading::Task<NUClear::threading::Reaction>&), NUClear::util::CallbackGenerator<NUClear::dsl::Parse<NUClear::dsl::word::Watchdog<module::input::Camera, 1, std::chrono::duration<long, std::ratio<1l, 1l> > >, NUClear::dsl::word::Single>, module::input::Camera::Parse(std::unique_ptr<NUClear::Environment, std::default_delete<NUClear::Environment> >)::{lambda(extension::Configuration const&)#2}::operator()(extension::Configuration const&) const::{lambda()#1}>::operator()(NUClear::threading::Reaction&)::{lambda(NUClear::threading::Task<NUClear::threading::Reaction>&)#1}>::_M_invoke(std::_Any_data const&, NUClear::threading::Task<NUClear::threading::Reaction>&) () from lib/libinputCamera.so
No symbol table info available.
#7  0x00007f9cbfb4758f in std::_Function_handler<void (), NUClear::PowerPlant::submit(std::unique_ptr<NUClear::threading::Task<NUClear::threading::Reaction>, std::default_delete<NUClear::threading::Task<NUClear::threading::Reaction> > >&&, bool const&)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from lib/libsupportSignalCatcher.so
No symbol table info available.
#8  0x00007f9cbfb4c69c in NUClear::threading::TaskScheduler::run_task(NUClear::threading::TaskScheduler::Task&&) () from lib/libsupportSignalCatcher.so
No symbol table info available.
#9  0x00007f9cbfb4eefb in NUClear::threading::TaskScheduler::pool_func(std::shared_ptr<NUClear::threading::TaskScheduler::PoolQueue>) () from lib/libsupportSignalCatcher.so
No symbol table info available.
#10 0x00007f9cbfb4f4f5 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (NUClear::threading::TaskScheduler::*)(std::shared_ptr<NUClear::threading::TaskScheduler::PoolQueue>), NUClear::threading::TaskScheduler*, std::shared_ptr<NUClear::threading::TaskScheduler::PoolQueue> > > >::_M_run() () from lib/libsupportSignalCatcher.so
No symbol table info available.
#11 0x00007f9cb7ed6183 in std::execute_native_thread_routine (__p=0x55e1fce906f0) at /usr/src/debug/gcc/libstdc++-v3/src/c++11/thread.cc:82
        __t = <optimized out>
#12 0x00007f9cb7a8c54d in ?? () from /usr/lib/libc.so.6
No symbol table info available.
#13 0x00007f9cb7b11874 in clone () from /usr/lib/libc.so.6
No symbol table info available.

Thread 7 (Thread 0x7f9c841fb640 (LWP 3936) "data_recording"):
#0  0x00007f9cb7b0b30d in syscall () from /usr/lib/libc.so.6
No symbol table info available.
#1  0x00007f9cbde42c4c in ?? () from /usr/local/lib/libglib-2.0.so.0
No symbol table info available.
#2  0x00007f9cbe0dc386 in arv_get_n_devices () from /usr/local/lib/libaravis-0.8.so.0
No symbol table info available.
#3  0x00007f9cbf27c5df in module::input::configure_camera(extension::Configuration const&, module::input::Camera&)::{lambda(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)#1}::operator()(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const [clone .isra.0] () from lib/libinputCamera.so
No symbol table info available.
#4  0x00007f9cbf27cdc9 in module::input::configure_camera(extension::Configuration const&, module::input::Camera&) () from lib/libinputCamera.so
No symbol table info available.
#5  0x00007f9cbf28006f in module::input::reset_camera(module::input::CameraContext&) () from lib/libinputCamera.so
No symbol table info available.
#6  0x00007f9cbf28053a in std::_Function_handler<void (NUClear::threading::Task<NUClear::threading::Reaction>&), NUClear::util::CallbackGenerator<NUClear::dsl::Parse<NUClear::dsl::word::Watchdog<module::input::Camera, 1, std::chrono::duration<long, std::ratio<1l, 1l> > >, NUClear::dsl::word::Single>, module::input::Camera::Parse(std::unique_ptr<NUClear::Environment, std::default_delete<NUClear::Environment> >)::{lambda(extension::Configuration const&)#2}::operator()(extension::Configuration const&) const::{lambda()#1}>::operator()(NUClear::threading::Reaction&)::{lambda(NUClear::threading::Task<NUClear::threading::Reaction>&)#1}>::_M_invoke(std::_Any_data const&, NUClear::threading::Task<NUClear::threading::Reaction>&) () from lib/libinputCamera.so
No symbol table info available.
#7  0x00007f9cbfb4758f in std::_Function_handler<void (), NUClear::PowerPlant::submit(std::unique_ptr<NUClear::threading::Task<NUClear::threading::Reaction>, std::default_delete<NUClear::threading::Task<NUClear::threading::Reaction> > >&&, bool const&)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from lib/libsupportSignalCatcher.so
No symbol table info available.
#8  0x00007f9cbfb4c69c in NUClear::threading::TaskScheduler::run_task(NUClear::threading::TaskScheduler::Task&&) () from lib/libsupportSignalCatcher.so
No symbol table info available.
#9  0x00007f9cbfb4eefb in NUClear::threading::TaskScheduler::pool_func(std::shared_ptr<NUClear::threading::TaskScheduler::PoolQueue>) () from lib/libsupportSignalCatcher.so
No symbol table info available.
#10 0x00007f9cbfb4f4f5 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (NUClear::threading::TaskScheduler::*)(std::shared_ptr<NUClear::threading::TaskScheduler::PoolQueue>), NUClear::threading::TaskScheduler*, std::shared_ptr<NUClear::threading::TaskScheduler::PoolQueue> > > >::_M_run() () from lib/libsupportSignalCatcher.so
No symbol table info available.
#11 0x00007f9cb7ed6183 in std::execute_native_thread_routine (__p=0x55e1fce3da00) at /usr/src/debug/gcc/libstdc++-v3/src/c++11/thread.cc:82
        __t = <optimized out>
#12 0x00007f9cb7a8c54d in ?? () from /usr/lib/libc.so.6
No symbol table info available.
#13 0x00007f9cb7b11874 in clone () from /usr/lib/libc.so.6
No symbol table info available.

Thread 6 (Thread 0x7f9c849fc640 (LWP 3935) "data_recording"):
#0  0x00007f9cb7b0b30d in syscall () from /usr/lib/libc.so.6
No symbol table info available.
#1  0x00007f9cbde42c4c in ?? () from /usr/local/lib/libglib-2.0.so.0
No symbol table info available.
#2  0x00007f9cbe0dc386 in arv_get_n_devices () from /usr/local/lib/libaravis-0.8.so.0
No symbol table info available.
#3  0x00007f9cbf27c5df in module::input::configure_camera(extension::Configuration const&, module::input::Camera&)::{lambda(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)#1}::operator()(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const [clone .isra.0] () from lib/libinputCamera.so
No symbol table info available.
#4  0x00007f9cbf27cdc9 in module::input::configure_camera(extension::Configuration const&, module::input::Camera&) () from lib/libinputCamera.so
No symbol table info available.
#5  0x00007f9cbf28006f in module::input::reset_camera(module::input::CameraContext&) () from lib/libinputCamera.so
No symbol table info available.
#6  0x00007f9cbf28053a in std::_Function_handler<void (NUClear::threading::Task<NUClear::threading::Reaction>&), NUClear::util::CallbackGenerator<NUClear::dsl::Parse<NUClear::dsl::word::Watchdog<module::input::Camera, 1, std::chrono::duration<long, std::ratio<1l, 1l> > >, NUClear::dsl::word::Single>, module::input::Camera::Parse(std::unique_ptr<NUClear::Environment, std::default_delete<NUClear::Environment> >)::{lambda(extension::Configuration const&)#2}::operator()(extension::Configuration const&) const::{lambda()#1}>::operator()(NUClear::threading::Reaction&)::{lambda(NUClear::threading::Task<NUClear::threading::Reaction>&)#1}>::_M_invoke(std::_Any_data const&, NUClear::threading::Task<NUClear::threading::Reaction>&) () from lib/libinputCamera.so
No symbol table info available.
#7  0x00007f9cbfb4758f in std::_Function_handler<void (), NUClear::PowerPlant::submit(std::unique_ptr<NUClear::threading::Task<NUClear::threading::Reaction>, std::default_delete<NUClear::threading::Task<NUClear::threading::Reaction> > >&&, bool const&)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from lib/libsupportSignalCatcher.so
No symbol table info available.
#8  0x00007f9cbfb4c69c in NUClear::threading::TaskScheduler::run_task(NUClear::threading::TaskScheduler::Task&&) () from lib/libsupportSignalCatcher.so
No symbol table info available.
#9  0x00007f9cbfb4eefb in NUClear::threading::TaskScheduler::pool_func(std::shared_ptr<NUClear::threading::TaskScheduler::PoolQueue>) () from lib/libsupportSignalCatcher.so
No symbol table info available.
#10 0x00007f9cbfb4f4f5 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (NUClear::threading::TaskScheduler::*)(std::shared_ptr<NUClear::threading::TaskScheduler::PoolQueue>), NUClear::threading::TaskScheduler*, std::shared_ptr<NUClear::threading::TaskScheduler::PoolQueue> > > >::_M_run() () from lib/libsupportSignalCatcher.so
No symbol table info available.
#11 0x00007f9cb7ed6183 in std::execute_native_thread_routine (__p=0x55e1fce3d7c0) at /usr/src/debug/gcc/libstdc++-v3/src/c++11/thread.cc:82
        __t = <optimized out>
#12 0x00007f9cb7a8c54d in ?? () from /usr/lib/libc.so.6
No symbol table info available.
#13 0x00007f9cb7b11874 in clone () from /usr/lib/libc.so.6
No symbol table info available.

Thread 4 (Thread 0x7f9c859fe640 (LWP 3933) "data_recording"):
#0  0x00007f9cb7b0b30d in syscall () from /usr/lib/libc.so.6
No symbol table info available.
#1  0x00007f9cbde42c4c in ?? () from /usr/local/lib/libglib-2.0.so.0
No symbol table info available.
#2  0x00007f9cbe0dca30 in arv_shutdown () from /usr/local/lib/libaravis-0.8.so.0
No symbol table info available.
#3  0x00007f9cbf27c46e in std::_Function_handler<void (NUClear::threading::Task<NUClear::threading::Reaction>&), NUClear::util::CallbackGenerator<NUClear::dsl::Parse<NUClear::dsl::word::Shutdown>, module::input::Camera::Camera(std::unique_ptr<NUClear::Environment, std::default_delete<NUClear::Environment> >)::{lambda()#4}>::operator()(NUClear::threading::Reaction&)::{lambda(NUClear::threading::Task<NUClear::threading::Reaction>&)#1}>::_M_invoke(std::_Any_data const&, NUClear::threading::Task<NUClear::threading::Reaction>&) () from lib/libinputCamera.so
No symbol table info available.
#4  0x00007f9cbfb4758f in std::_Function_handler<void (), NUClear::PowerPlant::submit(std::unique_ptr<NUClear::threading::Task<NUClear::threading::Reaction>, std::default_delete<NUClear::threading::Task<NUClear::threading::Reaction> > >&&, bool const&)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from lib/libsupportSignalCatcher.so
No symbol table info available.
#5  0x00007f9cbfb4c69c in NUClear::threading::TaskScheduler::run_task(NUClear::threading::TaskScheduler::Task&&) () from lib/libsupportSignalCatcher.so
No symbol table info available.
#6  0x00007f9cbfb4eefb in NUClear::threading::TaskScheduler::pool_func(std::shared_ptr<NUClear::threading::TaskScheduler::PoolQueue>) () from lib/libsupportSignalCatcher.so
No symbol table info available.
#7  0x00007f9cbfb4f4f5 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (NUClear::threading::TaskScheduler::*)(std::shared_ptr<NUClear::threading::TaskScheduler::PoolQueue>), NUClear::threading::TaskScheduler*, std::shared_ptr<NUClear::threading::TaskScheduler::PoolQueue> > > >::_M_run() () from lib/libsupportSignalCatcher.so
No symbol table info available.
#8  0x00007f9cb7ed6183 in std::execute_native_thread_routine (__p=0x55e1fc980e10) at /usr/src/debug/gcc/libstdc++-v3/src/c++11/thread.cc:82
        __t = <optimized out>
#9  0x00007f9cb7a8c54d in ?? () from /usr/lib/libc.so.6
No symbol table info available.
#10 0x00007f9cb7b11874 in clone () from /usr/lib/libc.so.6
No symbol table info available.

EmmanuelP commented 4 months ago

It looks like all the threads are waiting to lock arv_system_mutex, but I'm not sure. Could you try to capture the backtrace with debug symbols enabled? The backtrace should then report the source line of each call.

Bidski commented 4 months ago

I have tried compiling aravis, glib2, and my own binary as RelWithDebInfo (or debugoptimized). Unfortunately, I now get a segmentation fault in ARV_IS_DEVICE, which I call to check that the camera device (ArvDevice) object is still valid before calling g_object_unref on it.

The pointer itself is non-null (not 0), so the memory address must now be invalid for the current process to access.

--- SNIPPET FROM GDB OBTAINED FROM coredumpctl ---
#24928 0x00007efcdb20309a in ARV_IS_DEVICE (ptr=0x5614fc882e70) at /usr/local/include/aravis-0.8/arvdevice.h:74
        __inst = 0x5614fc882e70
        __t = 94648116844896
        __r = <optimized out>

What is happening here is that my code has detected that it can no longer access one of the cameras (usually because we have not received a valid frame from the camera in some time), so it is tearing down whatever remains of its current connection in preparation for setting up a new one. As part of this it calls ARV_IS_XXX on each of the objects (ArvCamera, ArvStream, ArvDevice) and then calls g_object_unref on each of them. My binary then waits for a period of time (multiple seconds) before trying to reestablish a connection to the camera.
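
A type-check macro such as ARV_IS_DEVICE reads from the instance it is given, so it can only be trusted on a pointer that still refers to a live object. It cannot detect a pointer whose object has already been freed. One common way to make a teardown path safe to run more than once is to clear each pointer at the moment its reference is dropped, which is what GLib's `g_clear_object` does in C. The sketch below shows the same idea in C++ with `std::unique_ptr` and a custom deleter. `FakeDevice`, `fake_unref`, and `Connection` are hypothetical stand-ins so that the example runs without Aravis. In real code the deleter would call `g_object_unref`.

```cpp
#include <memory>

// Hypothetical stand-in for a reference-counted GObject such as ArvDevice.
struct FakeDevice {
    int ref_count = 1;
};

// Counts calls to the stand-in unref so that double releases are visible.
inline int& unref_calls() {
    static int count = 0;
    return count;
}

// Hypothetical stand-in for g_object_unref.
inline void fake_unref(FakeDevice* device) {
    ++unref_calls();
    if (--device->ref_count == 0) {
        delete device;
    }
}

// Deleter that drops one reference; unique_ptr never calls it on nullptr.
struct Unref {
    void operator()(FakeDevice* device) const { fake_unref(device); }
};
using DevicePtr = std::unique_ptr<FakeDevice, Unref>;

// Hypothetical owner of one camera connection.
struct Connection {
    DevicePtr device;

    // Drops the reference and nulls the pointer in one step, so a second
    // teardown is a no-op rather than an unref of freed memory.
    void teardown() { device.reset(); }
};
```

With this ownership model the validity check reduces to `if (connection.device)`, and no macro ever has to dereference a possibly stale pointer.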

Bidski commented 3 months ago

I have reverted to the release build of both Aravis and GLib since the segmentation faults were just uninformative. I am noticing that when the network drops I typically see a g_object_unref: assertion 'old_ref > 0' failed message. I am thinking this may be related to the segmentation faults.

I also saw this today: file ../glib-2.72.3/glib/gthread-posix.c: line 1369 (g_system_thread_wait): error 'Resource deadlock avoided' during 'pthread_join (pt->system_thread, NULL)', which was then followed by a SIGTRAP.

EmmanuelP commented 2 months ago

I will not be able to debug your issue by just guessing what you are doing. Please reduce your code to the simplest form exhibiting your issue and post it here.

I'm closing the issue for now. Don't hesitate to reopen it once you have the requested information.

Thanks.