centreon / centreon-collect

Centreon collect software collection

[23.10] Segmentation Fault in cbd Due to Null Pointer Dereference and Invalid Mutex Operation #1190

Open tphakala opened 4 months ago

tphakala commented 4 months ago

BUG REPORT INFORMATION

Prerequisites

Versions

centreon-broker-core-23.10.2-1.el9.x86_64
centreon-broker-cbd-23.10.2-1.el9.x86_64
centreon-broker-core-23.10.5-1.el9.x86_64
centreon-broker-cbd-23.10.5-1.el9.x86_64

Operating System

Red Hat Enterprise Linux release 9.3 (Plow)

How the component has been installed and versions

Version: --

Additional environment details (AWS, VirtualBox, physical, etc.):

VMware vSphere virtual machine

Description

CBD crashes on reload

# coredumpctl list
TIME                            PID UID GID SIG     COREFILE EXE             SIZE
Mon 2024-03-04 16:31:24 EET   12546 692 692 SIGSEGV present  /usr/sbin/cbd 234.0M
Tue 2024-03-05 10:10:08 EET 3596855 692 692 SIGSEGV present  /usr/sbin/cbd 126.1M
Tue 2024-03-05 12:12:04 EET 3968487 692 692 SIGSEGV present  /usr/sbin/cbd  28.4M
Tue 2024-03-05 15:01:05 EET 4012819 692 692 SIGSEGV present  /usr/sbin/cbd  29.3M
Tue 2024-03-05 15:59:35 EET 4076381 692 692 SIGSEGV present  /usr/sbin/cbd  45.4M
Tue 2024-03-05 16:16:21 EET 4104075 692 692 SIGSEGV present  /usr/sbin/cbd  35.6M
Wed 2024-03-06 11:28:38 EET 4111977 692 692 SIGSEGV present  /usr/sbin/cbd 131.5M

We have encountered a consistent segmentation fault across multiple instances of the cbd process, as evidenced by analysis of various core dumps generated by the application. The fault appears to be triggered during mutex lock operations within the threading and synchronization logic of the application. Below are the key observations from the gdb backtrace analysis of the core dumps:

Crash Context: The segmentation fault occurs at the point of attempting to lock a mutex (pthread_mutex_lock@@GLIBC_2.2.5) within the GNU C Library (libc.so.6).

Invalid Mutex Reference: The lock is attempted on a mutex at address 0x1b8 (frames #1 and #2, this=0x1b8). An address this small is not a valid heap or stack location. It is consistent with a member located 0x1b8 bytes into an object whose base pointer is null, which matches the null this pointer in frame #4. The mutex itself is therefore probably not corrupted; the object that owns it is missing.

Null Pointer Dereference: The backtrace shows com::centreon::broker::multiplexing::muxer::publish being called with a null this pointer (this=this@entry=0x0) at muxer.cc:309, from the lambda at engine.cc:404. Calling a member function through a null pointer is undefined behavior; here it ends in a segmentation fault as soon as publish tries to lock the object's mutex.
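
To illustrate why a null this pointer would produce a mutex address such as 0x1b8, here is a minimal sketch. The struct fake_muxer, its padding size and the helper names are hypothetical and chosen only to mirror the address in the backtrace; they say nothing about the real layout of multiplexing::muxer. The offset is measured on a live object, so no null pointer is dereferenced.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for a class whose mutex sits 0x1b8 bytes into the object.
// The padding size is an assumption made to match the backtrace, not the real
// layout of multiplexing::muxer.
struct fake_muxer {
  alignas(8) unsigned char other_members[0x1b8];
  std::mutex mtx;
};

// Distance in bytes from the start of the object to its mutex member,
// measured on a real object that lives on the stack.
std::size_t mutex_offset() {
  fake_muxer m;
  auto base = reinterpret_cast<std::uintptr_t>(&m);
  auto member = reinterpret_cast<std::uintptr_t>(&m.mtx);
  assert(member >= base);  // a member never precedes its enclosing object
  return member - base;
}

// Address that a member access computes when the object starts at `base`.
// With base == 0 (a null this pointer) the result is just the member offset.
std::uintptr_t member_address(std::uintptr_t base) {
  return base + mutex_offset();
}
```

On a typical 64-bit Linux build, member_address(0) should evaluate to 0x1b8, which is the value gdb reports for the std::mutex in frame #2.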

Recurring Pattern: The same pattern of crash is observed across different core dumps, suggesting a systematic issue with the application's handling of threading and synchronization, particularly regarding the lifecycle and integrity of muxer objects and associated mutexes.

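Thread Safety Concerns: The crash occurs on a worker thread of the broker's thread pool (pool.cc:101) while it runs a handler posted by multiplexing::engine::_send_to_subscribers (engine.cc:404). This suggests that a muxer may be removed or reset, for example during a reload, while a handler that still refers to it is queued or running.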
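
As a sketch of the kind of lifetime guard that avoids this class of crash, the example below captures a std::weak_ptr in a deferred handler and checks it before publishing. The subscriber class and make_handler function are hypothetical and are not Centreon's code or its actual fix; a plain std::function stands in for a handler posted to an executor.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical subscriber; not the real multiplexing::muxer.
class subscriber {
 public:
  // Counts the events it receives, under its own mutex.
  void publish(const std::deque<std::string>& events) {
    std::lock_guard<std::mutex> lock(_mtx);
    _received += events.size();
  }
  std::size_t received() const {
    std::lock_guard<std::mutex> lock(_mtx);
    return _received;
  }

 private:
  mutable std::mutex _mtx;
  std::size_t _received = 0;
};

// Builds a deferred handler, similar in spirit to a lambda posted to an executor.
// It captures a weak_ptr, so it can detect that the subscriber was destroyed
// between the time the handler was created and the time it runs.
std::function<bool()> make_handler(const std::shared_ptr<subscriber>& sub,
                                   std::deque<std::string> events) {
  assert(sub && "handler must be created from a live subscriber");
  std::weak_ptr<subscriber> weak = sub;
  return [weak, events = std::move(events)]() {
    if (auto alive = weak.lock()) {
      alive->publish(events);  // the object is kept alive for the whole call
      return true;
    }
    return false;  // subscriber is gone: skip instead of dereferencing null
  };
}
```

If the last shared_ptr to the subscriber is released before the handler runs, the handler returns false instead of calling publish on a dead or null object.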

Steps to Reproduce

Export poller configuration from Central

Describe the received result

Sometimes the cbd process crashes with a segmentation fault

Describe the expected result

The cbd process should not crash

Logs

gdb backtrace

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/cbd /etc/centreon-broker/central-broker.json'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f9ec1ca2b14 in pthread_mutex_lock@@GLIBC_2.2.5 () from /lib64/libc.so.6
[Current thread is 1 (Thread 0x7f9ec1bff640 (LWP 4111978))]
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.34-83.el9_3.7.x86_64 gnutls-3.7.6-23.el9_3.3.x86_64 libffi-3.4.2-8.el9.x86_64 libgcc-11.4.1-2.1.el9.x86_64 libidn2-2.3.0-7.el9.x86_64 libstdc++-11.4.1-2.1.el9.x86_64 libtasn1-4.16.0-8.el9_1.x86_64 libunistring-0.9.10-15.el9.x86_64 nettle-3.8-3.el9_0.x86_64 p11-kit-0.24.1-2.el9.x86_64 sssd-client-2.9.1-4.el9_3.5.x86_64
(gdb) bt
#0  0x00007f9ec1ca2b14 in pthread_mutex_lock@@GLIBC_2.2.5 () from /lib64/libc.so.6
#1  0x00000000009a3c54 in __gthread_mutex_lock (__mutex=0x1b8) at /usr/include/c++/11/x86_64-redhat-linux/bits/gthr-default.h:749
#2  std::mutex::lock (this=0x1b8) at /usr/include/c++/11/bits/std_mutex.h:100
#3  std::lock_guard<std::mutex>::lock_guard (__m=..., this=<synthetic pointer>) at /usr/include/c++/11/bits/std_mutex.h:229
#4  com::centreon::broker::multiplexing::muxer::publish (this=this@entry=0x0, event_queue=std::deque with 1 element = {...}) at broker/core/multiplexing/src/muxer.cc:309
#5  0x000000000099bae6 in operator() (__closure=0x7f9ec1bfe9c0) at broker/core/multiplexing/src/engine.cc:404
#6  boost::asio::asio_handler_invoke<com::centreon::broker::multiplexing::engine::_send_to_subscribers(com::centreon::broker::multiplexing::engine::send_to_mux_callback_type&&)::<lambda()> > (function=...)
    at /root/.conan/data/boost/1.82.0/_/_/package/20331109885a8c762c1205ed07e58ba4848e8024/include/boost/asio/handler_invoke_hook.hpp:88
#7  boost_asio_handler_invoke_helpers::invoke<com::centreon::broker::multiplexing::engine::_send_to_subscribers(com::centreon::broker::multiplexing::engine::send_to_mux_callback_type&&)::<lambda()>, com::centreon::broker::multiplexing::engine::_send_to_subscribers(com::centreon::broker::multiplexing::engine::send_to_mux_callback_type&&)::<lambda()> > (context=..., function=...) at /root/.conan/data/boost/1.82.0/_/_/package/20331109885a8c762c1205ed07e58ba4848e8024/include/boost/asio/detail/handler_invoke_helpers.hpp:54
#8  boost::asio::detail::handler_work<com::centreon::broker::multiplexing::engine::_send_to_subscribers(com::centreon::broker::multiplexing::engine::send_to_mux_callback_type&&)::<lambda()>, boost::asio::io_context::basic_executor_type<std::allocator<void>, 0>, void>::complete<com::centreon::broker::multiplexing::engine::_send_to_subscribers(com::centreon::broker::multiplexing::engine::send_to_mux_callback_type&&)::<lambda()> > (handler=..., function=..., this=<synthetic pointer>)
    at /root/.conan/data/boost/1.82.0/_/_/package/20331109885a8c762c1205ed07e58ba4848e8024/include/boost/asio/detail/handler_work.hpp:524
#9  boost::asio::detail::completion_handler<com::centreon::broker::multiplexing::engine::_send_to_subscribers(com::centreon::broker::multiplexing::engine::send_to_mux_callback_type&&)::<lambda()>, boost::asio::io_context::basic_executor_type<std::allocator<void>, 0> >::do_complete(void *, boost::asio::detail::operation *, const boost::system::error_code &, std::size_t) (owner=0x30b5750, base=0x7f9eb1184fe0) at /root/.conan/data/boost/1.82.0/_/_/package/20331109885a8c762c1205ed07e58ba4848e8024/include/boost/asio/detail/completion_handler.hpp:74
#10 0x0000000000815827 in boost::asio::detail::scheduler_operation::complete (bytes_transferred=0, ec=..., owner=0x30b5750, this=0x7f9eb1184fe0)
    at /root/.conan/data/boost/1.82.0/_/_/package/20331109885a8c762c1205ed07e58ba4848e8024/include/boost/asio/detail/scheduler_operation.hpp:40
#11 boost::asio::detail::scheduler::do_run_one (this=0x30b5750, lock=..., this_thread=..., ec=...) at /root/.conan/data/boost/1.82.0/_/_/package/20331109885a8c762c1205ed07e58ba4848e8024/include/boost/asio/detail/impl/scheduler.ipp:493
#12 0x00000000009623a1 in boost::asio::detail::scheduler::run (this=0x30b5750, ec=...) at /root/.conan/data/boost/1.82.0/_/_/package/20331109885a8c762c1205ed07e58ba4848e8024/include/boost/asio/detail/impl/scheduler.ipp:210
#13 0x000000000096646f in boost::asio::io_context::run (this=<optimized out>, this=<optimized out>) at /root/.conan/data/boost/1.82.0/_/_/package/20331109885a8c762c1205ed07e58ba4848e8024/include/boost/asio/impl/io_context.ipp:64
#14 operator() (__closure=0x30b31a8) at broker/core/src/pool.cc:101
#15 std::__invoke_impl<void, com::centreon::broker::pool::pool(const std::shared_ptr<boost::asio::io_context>&, size_t)::<lambda()> > (__f=...) at /usr/include/c++/11/bits/invoke.h:61
#16 std::__invoke<com::centreon::broker::pool::pool(const std::shared_ptr<boost::asio::io_context>&, size_t)::<lambda()> > (__fn=...) at /usr/include/c++/11/bits/invoke.h:96
#17 std::thread::_Invoker<std::tuple<com::centreon::broker::pool::pool(const std::shared_ptr<boost::asio::io_context>&, size_t)::<lambda()> > >::_M_invoke<0> (this=0x30b31a8) at /usr/include/c++/11/bits/std_thread.h:259
#18 std::thread::_Invoker<std::tuple<com::centreon::broker::pool::pool(const std::shared_ptr<boost::asio::io_context>&, size_t)::<lambda()> > >::operator() (this=0x30b31a8) at /usr/include/c++/11/bits/std_thread.h:266
#19 std::thread::_State_impl<std::thread::_Invoker<std::tuple<com::centreon::broker::pool::pool(const std::shared_ptr<boost::asio::io_context>&, size_t)::<lambda()> > > >::_M_run(void) (this=0x30b31a0) at /usr/include/c++/11/bits/std_thread.h:211
#20 0x00007f9ec20db924 in execute_native_thread_routine () from /lib64/libstdc++.so.6
#21 0x00007f9ec1c9f802 in start_thread () from /lib64/libc.so.6
#22 0x00007f9ec1c3f450 in clone3 () from /lib64/libc.so.6
(gdb)

tphakala commented 3 months ago

Still happening with centreon-broker-cbd-23.10.5-1.el9.x86_64