chaincodelabs / libmultiprocess

C++ library and code generator making it easy to call functions and reference objects in different processes
MIT License
29 stars 20 forks source link

MSAN Error during connection shutdown #115

Closed ryanofsky closed 1 month ago

ryanofsky commented 1 month ago

Originally posted by @Sjors in https://github.com/bitcoin/bitcoin/issues/30975#issuecomment-2385476424

MSAN says https://cirrus-ci.com/task/6248232848719872:

SUMMARY: MemorySanitizer: use-of-uninitialized-value /usr/src/mp/proxy.cpp:146:25 in mp::Connection::addAsyncCleanup(std::__1::function<void ()>)
[10:00:27.923] Exiting

Error happens in ipc_test.cpp:74:

        connection_server->onDisconnect([&] { connection_server.reset(); });

And stack trace shows mp::Connection being destroyed its RpcSystemBase member being destroyed, which garbage collects mp::ProxyServer objects which call mp::Connection::addAsyncCleanup which accesses invalid memory because mp::Connection object is in the process of being destroyed.

I think this problem points to a general issue with the shutdown sequence and is not just limited to tests because ServeStream function follows the same pattern of destroying the connection object an onDisconnect handler.

If this is true, it means already-destroyed fields of Connection objects are accessed as the objects are being destroyed, as detected by MSAN, but also that some cleanup functions may not run, resulting in small memory leaks.

MSAN Error

```c++ [unknown] [src/test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-15256/b-test-15260} IPC server send response #7 FooInterface.passBlockState$Results (result = (mode = 0, result = 0, reje==15256==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x557fb2ac5ded in std::__1::list, std::__1::allocator>>::__link_nodes(std::__1::__list_node_base, void*>*, std::__1::__list_node_base, void*>*, std::__1::__list_node_base, void*>*) /msan/cxx_build/include/c++/v1/list:957:25 #1 0x557fb2ac5ded in std::__1::__list_iterator, void*> std::__1::list, std::__1::allocator>>::emplace>(std::__1::__list_const_iterator, void*>, std::__1::function&&) /msan/cxx_build/include/c++/v1/list:1300:3 #2 0x557fb2ac5ded in mp::Connection::addAsyncCleanup(std::__1::function) /usr/src/mp/proxy.cpp:146:25 #3 0x557fb1cf3703 in mp::ProxyServerBase::~ProxyServerBase() ci/scratch/build-x86_64-pc-linux-gnu/src/./depends/x86_64-pc-linux-gnu/include/mp/proxy-io.h:480:31 #4 0x557fb1cf25f4 in mp::ProxyServerCustom::~ProxyServerCustom() ci/scratch/build-x86_64-pc-linux-gnu/src/./depends/x86_64-pc-linux-gnu/include/mp/proxy.h:160:8 #5 0x557fb1cf25f4 in mp::ProxyServer::~ProxyServer() ci/scratch/build-x86_64-pc-linux-gnu/src/./ci/scratch/build-x86_64-pc-linux-gnu/src/test/ipc_test.capnp.proxy-types.c++:8:72 #6 0x557fb1cf25f4 in mp::ProxyServer::~ProxyServer() ci/scratch/build-x86_64-pc-linux-gnu/src/./ci/scratch/build-x86_64-pc-linux-gnu/src/test/ipc_test.capnp.proxy-types.c++:8:48 #7 0x557fb1cf25f4 in mp::ProxyServer::~ProxyServer() ci/scratch/build-x86_64-pc-linux-gnu/src/./ci/scratch/build-x86_64-pc-linux-gnu/src/test/ipc_test.capnp.proxy-types.c++:8:48 #8 0x557fb1cacea9 in kj::_::HeapDisposer>::disposeImpl(void*) const ci/scratch/build-x86_64-pc-linux-gnu/src/./depends/x86_64-pc-linux-gnu/include/kj/memory.h:557:60 #9 0x557fb2b2255b in kj::Disposer::Dispose_::dispose(capnp::Capability::Server*, kj::Disposer const&) /usr/src/kj/memory.h:669:14 #10 0x557fb2b2255b in void kj::Disposer::dispose(capnp::Capability::Server*) const /usr/src/kj/memory.h:681:3 #11 0x557fb2b2255b in kj::Own::dispose() /usr/src/kj/memory.h:284:17 #12 0x557fb2b2255b in kj::Own::~Own() /usr/src/kj/memory.h:210:28 #13 0x557fb2b2255b in kj::Maybe>::~Maybe() /usr/src/kj/common.h:1376:7 #14 0x557fb2b2255b in capnp::LocalClient::~LocalClient() /usr/src/capnp/capability.c++:563:3 #15 0x557fb2b22815 in capnp::LocalClient::~LocalClient() /usr/src/capnp/capability.c++:559:34 #16 0x557fb2b22815 in non-virtual thunk to capnp::LocalClient::~LocalClient() /usr/src/capnp/capability.c++ #17 0x557fb2fbaab4 in kj::Refcounted::disposeImpl(void*) const /usr/src/kj/refcount.c++:43:5 #18 0x557fb2c28eff in kj::Disposer::Dispose_::dispose(capnp::ClientHook*, kj::Disposer const&) /usr/src/kj/memory.h:669:14 #19 0x557fb2c28eff in void kj::Disposer::dispose(capnp::ClientHook*) const /usr/src/kj/memory.h:681:3 #20 0x557fb2c28eff in kj::Own::dispose() /usr/src/kj/memory.h:284:17 #21 0x557fb2c28eff in kj::Own::~Own() /usr/src/kj/memory.h:210:28 #22 0x557fb2c28eff in capnp::Capability::Client::~Client() /usr/src/capnp/capability.h:190:19 #23 0x557fb2c28eff in void kj::dtor(capnp::Capability::Client&) /usr/src/kj/common.h:1060:13 #24 0x557fb2c28eff in kj::_::NullableValue::~NullableValue() /usr/src/kj/common.h:1125:7 #25 0x557fb2c28eff in kj::Maybe::~Maybe() /usr/src/kj/common.h:1376:7 #26 0x557fb2c28eff in capnp::_::RpcSystemBase::Impl::~Impl() /usr/src/capnp/rpc.c++:3567:3 #27 0x557fb2c28bd6 in kj::_::HeapDisposer::disposeImpl(void*) const /usr/src/kj/memory.h:557:60 #28 0x557fb2b44d97 in kj::Disposer::Dispose_::dispose(capnp::_::RpcSystemBase::Impl*, kj::Disposer const&) /usr/src/kj/memory.h:669:14 #29 0x557fb2b44d97 in void kj::Disposer::dispose(capnp::_::RpcSystemBase::Impl*) const /usr/src/kj/memory.h:681:3 #30 0x557fb2b44d97 in kj::Own::dispose() /usr/src/kj/memory.h:284:17 #31 0x557fb2b44d97 in kj::Own::~Own() /usr/src/kj/memory.h:210:28 #32 0x557fb2b44d97 in capnp::_::RpcSystemBase::~RpcSystemBase() /usr/src/capnp/rpc.c++:3678:50 #33 0x557fb2ac42dd in mp::Connection::~Connection() /usr/src/mp/proxy.cpp:105:1 #34 0x557fb1ca1c13 in std::__1::default_delete::operator()[abi:de190100](mp::Connection*) const /msan/cxx_build/include/c++/v1/__memory/unique_ptr.h:80:5 #35 0x557fb1ca1c13 in std::__1::unique_ptr>::reset[abi:de190100](mp::Connection*) /msan/cxx_build/include/c++/v1/__memory/unique_ptr.h:292:7 #36 0x557fb1ca1c13 in IpcPipeTest()::$_0::operator()() const::'lambda0'()::operator()() const ci/scratch/build-x86_64-pc-linux-gnu/src/./src/test/ipc_test.cpp:74:65 #37 0x557fb1ca1c13 in kj::_::Void kj::_::MaybeVoidCaller::apply(IpcPipeTest()::$_0::operator()() const::'lambda0'()&, kj::_::Void&&) ci/scratch/build-x86_64-pc-linux-gnu/src/./depends/x86_64-pc-linux-gnu/include/kj/async-prelude.h:195:5 #38 0x557fb1ca1c13 in kj::_::TransformPromiseNode::getImpl(kj::_::ExceptionOrValue&) ci/scratch/build-x86_64-pc-linux-gnu/src/./depends/x86_64-pc-linux-gnu/include/kj/async-inl.h:739:31 #39 0x557fb2dbc3c9 in kj::_::TransformPromiseNodeBase::get(kj::_::ExceptionOrValue&)::$_0::operator()() const /usr/src/kj/async.c++:2411:3 #40 0x557fb2dbc3c9 in kj::Maybe kj::runCatchingExceptions(kj::_::TransformPromiseNodeBase::get(kj::_::ExceptionOrValue&)::$_0&&) /usr/src/kj/exception.h:371:5 #41 0x557fb2dbc3c9 in kj::_::TransformPromiseNodeBase::get(kj::_::ExceptionOrValue&) /usr/src/kj/async.c++:2411:3 #42 0x557fb2de5fc7 in kj::TaskSet::Task::fire() /usr/src/kj/async.c++:321:11 #43 0x557fb2de7831 in non-virtual thunk to kj::TaskSet::Task::fire() /usr/src/kj/async.c++ #44 0x557fb2db2a28 in kj::EventLoop::turn() /usr/src/kj/async.c++:1806:31 #45 0x557fb2db5bf2 in kj::_::waitImpl(kj::Own&&, kj::_::ExceptionOrValue&, kj::WaitScope&, kj::SourceLocation)::$_2::operator()() const /usr/src/kj/async.c++:1970:21 #46 0x557fb2db5bf2 in void kj::WaitScope::runOnStackPool&&, kj::_::ExceptionOrValue&, kj::WaitScope&, kj::SourceLocation)::$_2>(kj::_::waitImpl(kj::Own&&, kj::_::ExceptionOrValue&, kj::WaitScope&, kj::SourceLocation)::$_2&&) /usr/src/kj/async.h:1368:7 #47 0x557fb2db5bf2 in kj::_::waitImpl(kj::Own&&, kj::_::ExceptionOrValue&, kj::WaitScope&, kj::SourceLocation) /usr/src/kj/async.c++:1967:17 #48 0x557fb2acfd3c in kj::Promise::wait(kj::WaitScope&, kj::SourceLocation) /ci_container_base/depends/x86_64-pc-linux-gnu/include/kj/async-inl.h:1357:3 #49 0x557fb2ac764a in mp::EventLoop::loop() /usr/src/mp/proxy.cpp:187:62 #50 0x557fb1c9df16 in IpcPipeTest()::$_0::operator()() const ci/scratch/build-x86_64-pc-linux-gnu/src/./src/test/ipc_test.cpp:75:14 #51 0x557fb1c9df16 in decltype(std::declval()()) std::__1::__invoke[abi:de190100](IpcPipeTest()::$_0&&) /msan/cxx_build/include/c++/v1/__type_traits/invoke.h:149:25 #52 0x557fb1c9df16 in void std::__1::__thread_execute[abi:de190100]>, IpcPipeTest()::$_0>(std::__1::tuple>, IpcPipeTest()::$_0>&, std::__1::__tuple_indices<...>) /msan/cxx_build/include/c++/v1/__thread/thread.h:192:3 #53 0x557fb1c9df16 in void* std::__1::__thread_proxy[abi:de190100]>, IpcPipeTest()::$_0>>(void*) /msan/cxx_build/include/c++/v1/__thread/thread.h:201:3 #54 0x7f38033c0a93 (/lib/x86_64-linux-gnu/libc.so.6+0x9ca93) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6) #55 0x7f380344dc3b (/lib/x86_64-linux-gnu/libc.so.6+0x129c3b) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6) Member fields were destroyed #0 0x557faee70e5d in __sanitizer_dtor_callback_fields /msan/llvm-project/compiler-rt/lib/msan/msan_interceptors.cpp:1048:5 #1 0x557fb2ac4097 in std::__1::__list_imp, std::__1::allocator>>::~__list_imp() /msan/cxx_build/include/c++/v1/list:499:15 #2 0x557fb2ac4097 in std::__1::__list_imp, std::__1::allocator>>::~__list_imp() /msan/cxx_build/include/c++/v1/list:619:1 #3 0x557fb2ac4097 in mp::Connection::~Connection() /usr/src/mp/proxy.cpp:105:1 #4 0x557fb1ca1c13 in std::__1::default_delete::operator()[abi:de190100](mp::Connection*) const /msan/cxx_build/include/c++/v1/__memory/unique_ptr.h:80:5 #5 0x557fb1ca1c13 in std::__1::unique_ptr>::reset[abi:de190100](mp::Connection*) /msan/cxx_build/include/c++/v1/__memory/unique_ptr.h:292:7 #6 0x557fb1ca1c13 in IpcPipeTest()::$_0::operator()() const::'lambda0'()::operator()() const ci/scratch/build-x86_64-pc-linux-gnu/src/./src/test/ipc_test.cpp:74:65 #7 0x557fb1ca1c13 in kj::_::Void kj::_::MaybeVoidCaller::apply(IpcPipeTest()::$_0::operator()() const::'lambda0'()&, kj::_::Void&&) ci/scratch/build-x86_64-pc-linux-gnu/src/./depends/x86_64-pc-linux-gnu/include/kj/async-prelude.h:195:5 #8 0x557fb1ca1c13 in kj::_::TransformPromiseNode::getImpl(kj::_::ExceptionOrValue&) ci/scratch/build-x86_64-pc-linux-gnu/src/./depends/x86_64-pc-linux-gnu/include/kj/async-inl.h:739:31 #9 0x557fb2dbc3c9 in kj::_::TransformPromiseNodeBase::get(kj::_::ExceptionOrValue&)::$_0::operator()() const /usr/src/kj/async.c++:2411:3 #10 0x557fb2dbc3c9 in kj::Maybe kj::runCatchingExceptions(kj::_::TransformPromiseNodeBase::get(kj::_::ExceptionOrValue&)::$_0&&) /usr/src/kj/exception.h:371:5 #11 0x557fb2dbc3c9 in kj::_::TransformPromiseNodeBase::get(kj::_::ExceptionOrValue&) /usr/src/kj/async.c++:2411:3 #12 0x557fb2de5fc7 in kj::TaskSet::Task::fire() /usr/src/kj/async.c++:321:11 #13 0x557fb2de7831 in non-virtual thunk to kj::TaskSet::Task::fire() /usr/src/kj/async.c++ #14 0x557fb2db2a28 in kj::EventLoop::turn() /usr/src/kj/async.c++:1806:31 #15 0x557fb2db5bf2 in kj::_::waitImpl(kj::Own&&, kj::_::ExceptionOrValue&, kj::WaitScope&, kj::SourceLocation)::$_2::operator()() const /usr/src/kj/async.c++:1970:21 #16 0x557fb2db5bf2 in void kj::WaitScope::runOnStackPool&&, kj::_::ExceptionOrValue&, kj::WaitScope&, kj::SourceLocation)::$_2>(kj::_::waitImpl(kj::Own&&, kj::_::ExceptionOrValue&, kj::WaitScope&, kj::SourceLocation)::$_2&&) /usr/src/kj/async.h:1368:7 #17 0x557fb2db5bf2 in kj::_::waitImpl(kj::Own&&, kj::_::ExceptionOrValue&, kj::WaitScope&, kj::SourceLocation) /usr/src/kj/async.c++:1967:17 #18 0x557fb2acfd3c in kj::Promise::wait(kj::WaitScope&, kj::SourceLocation) /ci_container_base/depends/x86_64-pc-linux-gnu/include/kj/async-inl.h:1357:3 #19 0x557fb2ac764a in mp::EventLoop::loop() /usr/src/mp/proxy.cpp:187:62 #20 0x557fb1c9df16 in IpcPipeTest()::$_0::operator()() const ci/scratch/build-x86_64-pc-linux-gnu/src/./src/test/ipc_test.cpp:75:14 #21 0x557fb1c9df16 in decltype(std::declval()()) std::__1::__invoke[abi:de190100](IpcPipeTest()::$_0&&) /msan/cxx_build/include/c++/v1/__type_traits/invoke.h:149:25 #22 0x557fb1c9df16 in void std::__1::__thread_execute[abi:de190100]>, IpcPipeTest()::$_0>(std::__1::tuple>, IpcPipeTest()::$_0>&, std::__1::__tuple_indices<...>) /msan/cxx_build/include/c++/v1/__thread/thread.h:192:3 #23 0x557fb1c9df16 in void* std::__1::__thread_proxy[abi:de190100]>, IpcPipeTest()::$_0>>(void*) /msan/cxx_build/include/c++/v1/__thread/thread.h:201:3 #24 0x7f38033c0a93 (/lib/x86_64-linux-gnu/libc.so.6+0x9ca93) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6) SUMMARY: MemorySanitizer: use-of-uninitialized-value /usr/src/mp/proxy.cpp:146:25 in mp::Connection::addAsyncCleanup(std::__1::function) ```

ryanofsky commented 1 month ago

I'm experimenting with this fix:

--- a/include/mp/proxy-io.h
+++ b/include/mp/proxy-io.h
@@ -333,7 +333,7 @@ public:
     LoggingErrorHandler m_error_handler{m_loop};
     kj::TaskSet m_on_disconnect{m_error_handler};
     ::capnp::TwoPartyVatNetwork m_network;
-    ::capnp::RpcSystem<::capnp::rpc::twoparty::VatId> m_rpc_system;
+    std::optional<::capnp::RpcSystem<::capnp::rpc::twoparty::VatId>> m_rpc_system;

     // ThreadMap interface client, used to create a remote server thread when an
     // client IPC call is being made for the first time from a new thread.
@@ -567,7 +567,7 @@ std::unique_ptr<ProxyClient<InitInterface>> ConnectStream(EventLoop& loop, int f
         auto stream =
             loop.m_io_context.lowLevelProvider->wrapSocketFd(fd, kj::LowLevelAsyncIoProvider::TAKE_OWNERSHIP);
         connection = std::make_unique<Connection>(loop, kj::mv(stream));
-        init_client = connection->m_rpc_system.bootstrap(ServerVatId().vat_id).castAs<InitInterface>();
+        init_client = connection->m_rpc_system->bootstrap(ServerVatId().vat_id).castAs<InitInterface>();
         Connection* connection_ptr = connection.get();
         connection->onDisconnect([&loop, connection_ptr] {
             loop.log() << "IPC client: unexpected network disconnect.";
--- a/src/mp/proxy.cpp
+++ b/src/mp/proxy.cpp
@@ -48,6 +48,12 @@ void LoggingErrorHandler::taskFailed(kj::Exception&& exception)

 Connection::~Connection()
 {
+    // Shut down RPC system first, since this will garbage collect Server
+    // objects that were not freed before the connection was closed, some of
+    // which may call addAsyncCleanup and add more cleanup callbacks which can
+    // run below.
+    m_rpc_system.reset();
+
     // ProxyClient cleanup handlers are in sync list, and ProxyServer cleanup
     // handlers are in the async list.
     //

Also requires a change to the bitcoin ipc test:

--- a/src/test/ipc_test.cpp
+++ b/src/test/ipc_test.cpp
@@ -62,7 +62,7 @@ void IpcPipeTest()

         auto connection_client = std::make_unique<mp::Connection>(loop, kj::mv(pipe.ends[0]));
         auto foo_client = std::make_unique<mp::ProxyClient<gen::FooInterface>>(
-            connection_client->m_rpc_system.bootstrap(mp::ServerVatId().vat_id).castAs<gen::FooInterface>(),
+            connection_client->m_rpc_system->bootstrap(mp::ServerVatId().vat_id).castAs<gen::FooInterface>(),
             connection_client.get(), /* destroy_connection= */ false);
         foo_promise.set_value(std::move(foo_client));
         disconnect_client = [&] { loop.sync([&] { connection_client.reset(); }); };