streamnative / pulsar-archived

Apache Pulsar - distributed pub-sub messaging system
https://pulsar.apache.org
Apache License 2.0

ISSUE-12887: C++ client producer crashes on asynchronous send #3294

Open sijie opened 2 years ago

sijie commented 2 years ago

Original Issue: apache/pulsar#12887


Describe the bug A clear and concise description of what the bug is.

To Reproduce Steps to reproduce the behavior:

Expected behavior The client sends data to Pulsar asynchronously. When one of the server nodes fails, the client crashes.

Screenshots If applicable, add screenshots to help explain your problem.

Desktop (please complete the following information):

Additional context Add any other context about the problem here.

out
2021-11-19 14:51:09.430 INFO [140736666261248] ClientConnection:1508 | [10.100.12.101:39852 -> 10.100.12.10:6651] Connection closed
2021-11-19 14:51:09.457 WARN [140735399585536] ClientConnection:1432 | [10.100.12.101:39862 -> 10.100.12.10:6651] Forcing connection to close after keep-alive timeout
2021-11-19 14:51:09.457 INFO [140735399585536] ClientConnection:1508 | [10.100.12.101:39862 -> 10.100.12.10:6651] Connection closed
2021-11-19 14:51:09.457 INFO [140735399585536] HandlerBase:132 | [persistent://public/default/String300AF, pulsar-16-0] Schedule reconnection in 0.1 s

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff837fe700 (LWP 29071)]
0x00007ffff770e56d in ?? () from /lib64/libstdc++.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-324.el7_9.x86_64 libgcc-4.8.5-44.el7.x86_64 libstdc++-4.8.5-44.el7.x86_64

(gdb) bt

#0 0x00007ffff770e56d in ?? () from /lib64/libstdc++.so.6

#1 0x00000000006baad4 in pulsar::ClientConnection::handleSendPair(boost::system::error_code const&) ()

#2 0x00000000006ce106 in void boost::asio::detail::strand_service::dispatch<boost::asio::detail::binder2<AllocHandler<std::_Bind<std::_Mem_fn<void (pulsar::ClientConnection::*)(boost::system::error_code const&)> (std::shared_ptr, std::_Placeholder<1>)> >, boost::system::error_code, unsigned long> >(boost::asio::detail::strand_service::strand_impl&, boost::asio::detail::binder2<AllocHandler<std::_Bind<std::_Mem_fn<void (pulsar::ClientConnection::*)(boost::system::error_code const&)> (std::shared_ptr, std::_Placeholder<1>)> >, boost::system::error_code, unsigned long>&) ()

#3 0x00000000006ce360 in void boost::asio::detail::wrapped_handler<boost::asio::io_service::strand, AllocHandler<std::_Bind<std::_Mem_fn<void (pulsar::ClientConnection::*)(boost::system::error_code const&)> (std::shared_ptr, std::_Placeholder<1>)> >, boost::asio::detail::is_continuation_if_running>::operator()<boost::system::error_code, unsigned long>(boost::system::error_code const&, unsigned long const&) ()

#4 0x00000000006d8714 in boost::asio::detail::write_op<boost::asio::ssl::stream<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service >&>, pulsar::CompositeSharedBuffer<2>, boost::asio::detail::transfer_all_t, boost::asio::detail::wrapped_handler<boost::asio::io_service::strand, AllocHandler<std::_Bind<std::_Mem_fn<void (pulsar::ClientConnection::*)(boost::system::error_code const&)> (std::shared_ptr, std::_Placeholder<1>)> >, boost::asio::detail::is_continuation_if_running> >::operator()(boost::system::error_code const&, unsigned long, int) ()

#5 0x00000000006d7b31 in boost::asio::ssl::detail::io_op<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service >, boost::asio::ssl::detail::write_op<boost::asio::detail::consuming_buffers<boost::asio::const_buffer, pulsar::CompositeSharedBuffer<2> > >, boost::asio::detail::write_op<boost::asio::ssl::stream<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service >&>, pulsar::CompositeSharedBuffer<2>, boost::asio::detail::transfer_all_t, boost::asio::detail::wrapped_handler<boost::asio::io_service::strand, AllocHandler<std::_Bind<std::_Mem_fn<void (pulsar::ClientConnection::*)(boost::system::error_code const&)> (std::shared_ptr, std::_Placeholder<1>)> >, boost::asio::detail::is_continuation_if_running> > >::operator()(boost::system::error_code, unsigned long, int) ()

#6 0x00000000006e1f6b in boost::asio::detail::completion_handler<boost::asio::detail::rewrapped_handler<boost::asio::detail::binder2<boost::asio::detail::write_op<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service >, boost::asio::mutable_buffers_1, boost::asio::detail::transfer_all_t, boost::asio::ssl::detail::io_op<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service >, boost::asio::ssl::detail::write_op<boost::asio::detail::consuming_buffers<boost::asio::const_buffer, pulsar::CompositeSharedBuffer<2> > >, boost::asio::detail::write_op<boost::asio::ssl::stream<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service >&>, pulsar::CompositeSharedBuffer<2>, boost::asio::detail::transfer_all_t, boost::asio::detail::wrapped_handler<boost::asio::io_service::strand, AllocHandler<std::_Bind<std::_Mem_fn<void (pulsar::ClientConnection::*)(boost::system::error_code const&)> (std::shared_ptr, std::_Placeholder<1>)> >, boost::asio::detail::is_continuation_if_running> > > >, boost::system::error_code, unsigned long>, AllocHandler<std::_Bind<std::_Mem_fn<void (pulsar::ClientConnection::*)(boost::system::error_code const&)> (std::shared_ptr, std::_Placeholder<1>)> > > >::do_complete(boost::asio::detail::task_io_service*, boost::asio::detail::task_io_service_operation*, boost::system::error_code const&, unsigned long) ()

#7 0x00000000006e2585 in void boost::asio::detail::strand_service::dispatch<boost::asio::detail::rewrapped_handler<boost::asio::detail::binder2<boost::asio::detail::write_op<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service >, boost::asio::mutable_buffers_1, boost::asio::detail::transfer_all_t, boost::asio::ssl::detail::io_op<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service >, boost::asio::ssl::detail::write_op<boost::asio::detail::consuming_buffers<boost::asio::const_buffer, pulsar::CompositeSharedBuffer<2> > >, boost::asio::detail::write_op<boost::asio::ssl::stream<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service >&>, pulsar::CompositeSharedBuffer<2>, boost::asio::detail::transfer_all_t, boost::asio::detail::wrapped_handler<boost::asio::io_service::strand, AllocHandler<std::_Bind<std::_Mem_fn<void (pulsar::ClientConnection::*)(boost::system::error_code const&)> (std::shared_ptr, std::_Placeholder<1>)> >, boost::asio::detail::is_continuation_if_running> > > >, boost::system::error_code, unsigned long>, AllocHandler<std::_Bind<std::_Mem_fn<void (pulsar::ClientConnection::*)(boost::system::error_code const&)> (std::shared_ptr, std::_Placeholder<1>)> > > >(boost::asio::detail::strand_service::strand_impl&, boost::asio::detail::rewrapped_handler<boost::asio::detail::binder2<boost::asio::detail::write_op<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service >, boost::asio::mutable_buffers_1, boost::asio::detail::transfer_all_t, boost::asio::ssl::detail::io_op<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service >, boost::asio::ssl::detail::write_op<boost::asio::detail::consuming_buffers<boost::asio::const_buffer, pulsar::CompositeSharedBuffer<2> > >, boost::asio::detail::write_op<boost::asio::ssl::stream<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service >&>, pulsar::CompositeSharedBuffer<2>, boost::asio::detail::transfer_all_t, boost::asio::detail::wrapped_handler<boost::asio::io_service::strand, AllocHandler<std::_Bind<std::_Mem_fn<void (pulsar::ClientConnection::*)(boost::system::error_code const&)> (std::shared_ptr, std::_Placeholder<1>)> >, boost::asio::detail::is_continuation_if_running> > > >, boost::system::error_code, unsigned long>, AllocHandler<std::_Bind<std::_Mem_fn<void (pulsar::ClientConnection::*)(boost::system::error_code const&)> (std::shared_ptr, std::_Placeholder<1>)> > >&) ()

#8 0x00000000006e2944 in boost::asio::detail::reactive_socket_send_op<boost::asio::mutable_buffers_1, boost::asio::detail::write_op<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service >, boost::asio::mutable_buffers_1, boost::asio::detail::transfer_all_t, boost::asio::ssl::detail::io_op<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service >, boost::asio::ssl::detail::write_op<boost::asio::detail::consuming_buffers<boost::asio::const_buffer, pulsar::CompositeSharedBuffer<2> > >, boost::asio::detail::write_op<boost::asio::ssl::stream<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service >&>, pulsar::CompositeSharedBuffer<2>, boost::asio::detail::transfer_all_t, boost::asio::detail::wrapped_handler<boost::asio::io_service::strand, AllocHandler<std::_Bind<std::_Mem_fn<void (pulsar::ClientConnection::*)(boost::system::error_code const&)> (std::shared_ptr, std::_Placeholder<1>)> >, boost::asio::detail::is_continuation_if_running> > > > >::do_complete(boost::asio::detail::task_io_service*, boost::asio::detail::task_io_service_operation*, boost::system::error_code const&, unsigned long) ()

#9 0x00000000005c6851 in boost::asio::detail::task_io_service::run(boost::system::error_code&) ()

#10 0x00000000005c34e6 in pulsar::ExecutorService::startWorker(std::shared_ptr) ()

#11 0x00000000005c7722 in std::thread::_Impl<std::_Bind_simple<std::_Bind<std::_Mem_fn<void (pulsar::ExecutorService::*)(std::shared_ptr)> (pulsar::ExecutorService*, std::shared_ptr)> ()> >::_M_run() ()

#12 0x00007ffff7768330 in ?? () from /lib64/libstdc++.so.6

#13 0x00007ffff7bc6ea5 in start_thread () from /lib64/libpthread.so.0

#14 0x00007ffff6ecb9fd in clone () from /lib64/libc.so.6

iTigeroar commented 2 years ago

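Since the same program has always run fine on CentOS 8, the likely cause is as follows. On CentOS 7.4 the client was built against the static library libpulsarwithdeps.a. It runs normally, but it crashes as soon as a bookie is shut down on the server side. After switching to fully dynamic linking against libpulsarnossl.so.2.8.1, one test run did not reproduce the error. This information is provided for reference.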

github-actions[bot] commented 2 years ago

The issue had no activity for 30 days, mark with Stale label.