heifner closed this issue 1 year ago
Version: main, 3.2.x. Also reported in 3.1.x, 2.0.x, and 2.1.x, although the generic message may point to different issues over time.
info 2022-08-09T14:00:57.388 net-1 net_plugin.cpp:1016 _close ] ["xxx:9876 - e1a715e" - 2 1.1.1.1:9876] closing
info 2022-08-09T14:00:57.389 nodeos net_plugin.cpp:3809 plugin_shutdown ] exit shutdown
CHAINBASE: Writing "state" database file, this could take a moment...
1% complete... 5% complete... 8% complete... 12% complete... 15% complete... 18% complete... 22% complete... 26% complete... 29% complete... 32% complete... 35% complete... 39% complete... 42% complete... 46% complete... 49% complete... 53% complete... 56% complete... 59% complete... 62% complete... 65% complete... 69% complete... 72% complete... 76% complete... 80% complete... 85% complete... 89% complete... 93% complete... 97% complete...
Syncing buffers... Complete
corrupted size vs. prev_size
[1] 545250 abort (core dumped) /usr/bin/nodeos --config-dir /etc/nodeos -d /var/lib/nodeos
Thread dump:
Reading symbols from /usr/bin/nodeos...
(No debugging symbols found in /usr/bin/nodeos)
[New LWP 545250]
[New LWP 545251]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/bin/nodeos --config-dir /etc/nodeos -d /var/lib/nodeos'.
Program terminated with signal SIGABRT, Aborted.
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
[Current thread is 1 (Thread 0x7f7fd1401840 (LWP 545250))]
(gdb) thread apply all where

Thread 2 (Thread 0x7f7fd1400700 (LWP 545251)):
#0 futex_wait_cancelable (private=0, expected=0, futex_word=0x5555b3238b28) at ../sysdeps/nptl/futex-internal.h:186
#1 __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x5555b3238ac8, cond=0x5555b3238b00) at pthread_cond_wait.c:508
#2 __pthread_cond_wait (cond=0x5555b3238b00, mutex=0x5555b3238ac8) at pthread_cond_wait.c:638
#3 0x00005555af6441d3 in boost::asio::detail::scheduler::do_run_one(boost::asio::detail::conditionally_enabled_mutex::scoped_lock&, boost::asio::detail::scheduler_thread_info&, boost::system::error_code const&) ()
#4 0x00005555af643e11 in boost::asio::detail::scheduler::run(boost::system::error_code&) ()
#5 0x00005555af643bde in void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, appbase::application_impl::application_impl()::{lambda()#1}> >(void*) ()
#6 0x00007f7fd173eea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
#7 0x00007f7fd1503def in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 1 (Thread 0x7f7fd1401840 (LWP 545250)):
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007f7fd142b537 in __GI_abort () at abort.c:79
#2 0x00007f7fd1484768 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f7fd1592e2d "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#3 0x00007f7fd148ba5a in malloc_printerr (str=str@entry=0x7f7fd1591020 "corrupted size vs. prev_size") at malloc.c:5347
#4 0x00007f7fd148c7a6 in unlink_chunk (p=p@entry=0x7f7d44001170, av=0x7f7d44000020) at malloc.c:1454
#5 0x00007f7fd148c8f7 in malloc_consolidate (av=av@entry=0x7f7d44000020) at malloc.c:4502
#6 0x00007f7fd148d0c0 in _int_free (av=0x7f7d44000020, p=0x7f7d4400b5f0, have_lock=<optimized out>) at malloc.c:4400
#7 0x00005555afcb159c in eosio::http_plugin_impl::make_app_thread_url_handler(int, std::__1::function<void (std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::function<void (int, std::__1::optional<fc::variant>)>)>, std::__1::shared_ptr<eosio::http_plugin_impl>)::{lambda(std::__1::shared_ptr<eosio::detail::abstract_conn>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::function<void (int, std::__1::optional<fc::variant>)>)#1}::operator()(std::__1::shared_ptr<eosio::detail::abstract_conn>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::function<void (int, std::__1::optional<fc::variant>)>) const::{lambda()#1}::~shared_ptr() ()
#8 0x00005555afcb1d25 in boost::asio::detail::executor_op<boost::asio::detail::work_dispatcher<boost::asio::executor_binder<eosio::http_plugin_impl::make_app_thread_url_handler(int, std::__1::function<void (std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::function<void (int, std::__1::optional<fc::variant>)>)>, std::__1::shared_ptr<eosio::http_plugin_impl>)::{lambda(std::__1::shared_ptr<eosio::detail::abstract_conn>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::function<void (int, std::__1::optional<fc::variant>)>)#1}::operator()(std::__1::shared_ptr<eosio::detail::abstract_conn>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::function<void (int, std::__1::optional<fc::variant>)>) const::{lambda()#1}, appbase::execution_priority_queue::executor> >, std::__1::allocator<void>, boost::asio::detail::scheduler_operation>::do_complete(void*, boost::asio::detail::scheduler_operation*, boost::system::error_code const&, unsigned long) ()
#9 0x00005555af646975 in boost::asio::detail::scheduler::shutdown() ()
#10 0x00005555af6492b9 in std::__1::__shared_ptr_emplace<boost::asio::io_context, std::__1::allocator<boost::asio::io_context> >::__on_zero_shared() ()
#11 0x00005555af63f8a4 in appbase::application::exec() ()
#12 0x00005555af632f40 in main ()
(gdb) quit
Maybe related: https://github.com/EOSIO/eos/issues/8450
At a quick glance, it looks like an iterator into the http plugin's url_handlers is still in use after url_handlers.clear() is called in http_plugin::plugin_shutdown. See the use of an iterator into url_handlers in handle_http_request.
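To make the suspected hazard concrete, here is a minimal sketch of the general pattern and one possible way to avoid it. It is not the nodeos code: `mini_http`, `url_handler`, `pending`, `enqueue`, `run_one` and the handler signature are hypothetical stand-ins, and the queue only imitates work posted to the app thread. The idea is that a queued task should own a copy of the handler rather than hold an iterator into a map that shutdown may clear.

```cpp
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <utility>

// Hypothetical stand-in for a URL handler; the real nodeos signature differs.
using url_handler = std::function<std::string(const std::string& body)>;

struct mini_http {
    std::map<std::string, url_handler> url_handlers;
    // Stands in for work posted to the application thread.
    std::queue<std::function<std::string()>> pending;

    // Risky pattern (shown only as a comment): the queued task captures an
    // iterator into url_handlers. If url_handlers.clear() runs before the
    // task does, the iterator is invalid and using it is undefined behavior.
    //
    //   auto it = url_handlers.find(url);
    //   pending.push([it, body] { return it->second(body); });

    // Safer pattern: copy the handler into the task, so the task owns
    // everything it needs and no longer refers to the map.
    bool enqueue(const std::string& url, std::string body) {
        auto it = url_handlers.find(url);
        if (it == url_handlers.end()) return false;  // unknown URL, or already shut down
        pending.push([handler = it->second, body = std::move(body)] {
            return handler(body);
        });
        return true;
    }

    // Imitates plugin shutdown: invalidates every iterator into the map.
    void shutdown() { url_handlers.clear(); }

    // Runs the oldest queued task; the caller must ensure one is queued.
    std::string run_one() {
        auto task = std::move(pending.front());
        pending.pop();
        return task();
    }
};
```

With this shape, a request accepted just before shutdown can still complete (or be destroyed) safely after the map is cleared, and requests arriving afterwards are simply rejected. Whether copying the handler is acceptable in the real plugin, or whether shared ownership would fit better, is a design decision this sketch does not settle.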
Note: an http rewrite is currently in progress: https://github.com/eosnetworkfoundation/mandel/pull/675 We should verify that any fix is also applied there, if appropriate. I think it is also worth fixing in 3.1, which will not have #675.
We regularly run "get info" HTTP requests against nodeos every few seconds for monitoring. These requests keep running even while nodeos is shutting down.
Will track this in linked leap issue now