AntelopeIO / spring

C++ implementation of the Antelope protocol with Savanna consensus

P2P: segfault on connection creation #835

Closed heifner closed 1 month ago

heifner commented 1 month ago

A number of recent CI/CD runs have failed in p2p_sync_throttle_test, for example https://github.com/AntelopeIO/spring/actions/runs/11084150668/job/30799334499. The core file shows:

* thread #1, name = 'nodeos', stop reason = signal SIGABRT
  * frame #0: 0x0000726c48331a7c libc.so.6`pthread_kill + 300
    frame #1: 0x0000726c482dd476 libc.so.6`raise + 22
    frame #2: 0x0000726c482c37f3 libc.so.6`abort + 211
    frame #3: 0x0000726c4832445c libc.so.6`___lldb_unnamed_symbol3389 + 348
    frame #4: 0x0000726c48324770 libc.so.6`__libc_fatal + 32
    frame #5: 0x0000726c483df71f libc.so.6`__netlink_assert_response + 255
    frame #6: 0x0000726c483dee8e libc.so.6`___lldb_unnamed_symbol3939 + 1694
    frame #7: 0x0000726c483a7ed1 libc.so.6`getaddrinfo + 2593
    frame #8: 0x00005d2241792b0f nodeos`boost::asio::detail::resolve_query_op<boost::asio::ip::tcp, eosio::connections_manager::resolve_and_connect(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::shared_ptr<eosio::connection> const&)::'lambda'(boost::system::error_code const&, boost::asio::ip::basic_resolver_results<boost::asio::ip::tcp> const&), boost::asio::any_io_executor>::do_complete(void*, boost::asio::detail::scheduler_operation*, boost::system::error_code const&, unsigned long) + 2351
    frame #9: 0x00005d2241474c3c nodeos`boost::asio::detail::scheduler::run(boost::system::error_code&) + 1308
    frame #10: 0x00005d22417c392d nodeos`boost::asio::detail::posix_thread::func<boost::asio::detail::resolver_service_base::work_scheduler_runner>::run() + 61
    frame #11: 0x00005d224146e0f4 nodeos`boost_asio_detail_posix_thread_function + 20
    frame #12: 0x0000726c4832fb43 libc.so.6`___lldb_unnamed_symbol3481 + 755
    frame #13: 0x0000726c483c1a00 libc.so.6`___lldb_unnamed_symbol3865 + 11

Introduced by #825, which moved the connect onto the connection strand instead of running it on the same thread as the async_resolve callback.

heifner commented 1 month ago

The fix also needs to be backported to Leap 5.0. Going to fix it in Spring first to verify the fix before backporting.