rust-lang / futures-rs

Zero-cost asynchronous programming in Rust
https://rust-lang.github.io/futures-rs/
Apache License 2.0
5.29k stars 607 forks source link

"malloc_consolidate(): unaligned fastbin chunk detected" when attempting to block single-threaded tokio runtime #2863

Open nazar-pc opened 3 weeks ago

nazar-pc commented 3 weeks ago

Originally reported as https://github.com/tokio-rs/tokio/issues/6452, but I think this might be a futures problem actually.

I created a (non-minimal unfortunately) reproduction branch where last commit is showing an issue: https://github.com/subspace/subspace/tree/futures-malloc_consolidate-reproduction

On Linux it looks like this:

$ cd shared/subspace-cluster-networking
$ cargo t
   Compiling subspace-cluster-networking v0.1.0 (/web/subspace/subspace/shared/subspace-cluster-networking)
    Finished `test` profile [unoptimized + debuginfo] target(s) in 1.67s
     Running unittests src/lib.rs (/home/nazar-pc/.cache/cargo/target/debug/deps/subspace_cluster_networking-c3975a08282e0a03)

running 2 tests
test request_response::tests::test_codec ... ok
malloc_consolidate(): unaligned fastbin chunk detected
error: test failed, to rerun pass `--lib`

Caused by:
  process didn't exit successfully: `/home/nazar-pc/.cache/cargo/target/debug/deps/subspace_cluster_networking-c3975a08282e0a03` (signal: 6, SIGABRT: process abort signal)

Here is a backtrace I collected originally:

``` running 2 tests [New Thread 0x7ffff7a00640 (LWP 368334)] [New Thread 0x7ffff7600640 (LWP 368335)] [Thread 0x7ffff7a00640 (LWP 368334) exited] test request_response::tests::test_codec ... ok [New Thread 0x7ffff7a00640 (LWP 368336)] malloc_consolidate(): unaligned fastbin chunk detected Thread 3 "tests::basic" received signal SIGABRT, Aborted. [Switching to Thread 0x7ffff7600640 (LWP 368335)] __pthread_kill_implementation (no_tid=0, signo=6, threadid=140737343653440) at ./nptl/pthread_kill.c:44 44 ./nptl/pthread_kill.c: Немає такого файла або каталогу. (gdb) bt #0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=140737343653440) at ./nptl/pthread_kill.c:44 #1 __pthread_kill_internal (signo=6, threadid=140737343653440) at ./nptl/pthread_kill.c:78 #2 __GI___pthread_kill (threadid=140737343653440, signo=signo@entry=6) at ./nptl/pthread_kill.c:89 #3 0x00007ffff7c42476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26 #4 0x00007ffff7c287f3 in __GI_abort () at ./stdlib/abort.c:79 #5 0x00007ffff7c89676 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7ffff7ddbb77 "%s\n") at ../sysdeps/posix/libc_fatal.c:155 #6 0x00007ffff7ca0cfc in malloc_printerr (str=str@entry=0x7ffff7dde5b8 "malloc_consolidate(): unaligned fastbin chunk detected") at ./malloc/malloc.c:5664 #7 0x00007ffff7ca198c in malloc_consolidate (av=av@entry=0x7ffff0000030) at ./malloc/malloc.c:4750 #8 0x00007ffff7ca2ea0 in _int_free (av=0x7ffff0000030, p=0x7ffff00558c0, have_lock=) at ./malloc/malloc.c:4674 #9 0x00007ffff7ca5453 in __GI___libc_free (mem=) at ./malloc/malloc.c:3391 #10 0x000055555560aa90 in alloc::alloc::dealloc () at /rustc/1684a753dbca5d23b2e03568e6fbbb48eb72d0e6/library/alloc/src/alloc.rs:119 #11 alloc::alloc::{impl#1}::deallocate (self=0x7ffff0055640, ptr=..., layout=...) at /rustc/1684a753dbca5d23b2e03568e6fbbb48eb72d0e6/library/alloc/src/alloc.rs:256 #12 0x0000555555794819 in alloc::boxed::{impl#8}::drop (self=0x7ffff0055638) at /rustc/1684a753dbca5d23b2e03568e6fbbb48eb72d0e6/library/alloc/src/boxed.rs:1243 #13 0x0000555555779d9e in core::ptr::drop_in_place> () at /rustc/1684a753dbca5d23b2e03568e6fbbb48eb72d0e6/library/core/src/ptr/mod.rs:514 #14 0x000055555577b4aa in core::ptr::drop_in_place>> () at /rustc/1684a753dbca5d23b2e03568e6fbbb48eb72d0e6/library/core/src/ptr/mod.rs:514 #15 0x000055555577de1f in core::ptr::drop_in_place>>> () at /rustc/1684a753dbca5d23b2e03568e6fbbb48eb72d0e6/library/core/src/ptr/mod.rs:514 #16 0x00005555557bf43b in tokio::runtime::task::core::{impl#6}::set_stage::{closure#0}>, alloc::sync::Arc> (ptr=0x7ffff0055630) at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/core.rs:382 #17 0x00005555557bc5f1 in tokio::loom::std::unsafe_cell::UnsafeCell>>>::with_mut>>, (), tokio::runtime::task::core::{impl#6}::set_stage::{closure_env#0}>, alloc::sync::Arc>> (self=0x7ffff0055630, f=...) at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/loom/std/unsafe_cell.rs:16 #18 tokio::runtime::task::core::Core>, alloc::sync::Arc>::set_stage>, alloc::sync::Arc> (self=0x7ffff0055620, stage=...) at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/core.rs:382 #19 0x00005555557b4d59 in tokio::runtime::task::core::Core>, alloc::sync::Arc>::drop_future_or_output>, alloc::sync::Arc> (self=0x7ffff0055620) at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/core.rs:347 #20 0x00005555557703ad in tokio::runtime::task::harness::poll_future::{closure#0}::{impl#0}::drop>, alloc::sync::Arc> (self=0x7ffff75fdde0) at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/harness.rs:481 #21 0x00005555557871ba in core::ptr::drop_in_place>, alloc::sync::Arc>> () at /rustc/1684a753dbca5d23b2e03568e6fbbb48eb72d0e6/library/core/src/ptr/mod.rs:514 #22 0x000055555583143e in tokio::runtime::task::harness::poll_future::{closure#0}>, alloc::sync::Arc> () at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/harness.rs:488 #23 0x00005555555ffcd8 in core::panic::unwind_safe::{impl#23}::call_once, tokio::runtime::task::harness::poll_future::{closure_env#0}>, alloc::sync::Arc>> (self=...) at /rustc/1684a753dbca5d23b2e03568e6fbbb48eb72d0e6/library/core/src/panic/unwind_safe.rs:272 #24 0x0000555555737651 in std::panicking::try::do_call>, alloc::sync::Arc>>, core::task::poll::Poll<()>> (data=0x7ffff75fded8) at /rustc/1684a753dbca5d23b2e03568e6fbbb48eb72d0e6/library/std/src/panicking.rs:552 #25 0x000055555573d37b in __rust_try () #26 0x0000555555733d48 in std::panicking::try, core::panic::unwind_safe::AssertUnwindSafe>, alloc::sync::Arc>>> (f=...) at /rustc/1684a753dbca5d23b2e03568e6fbbb48eb72d0e6/library/std/src/panicking.rs:516 #27 0x0000555555650c8d in std::panic::catch_unwind>, alloc::sync::Arc>>, core::task::poll::Poll<()>> ( f=) at /rustc/1684a753dbca5d23b2e03568e6fbbb48eb72d0e6/library/std/src/panic.rs:149 #28 0x00005555558302b0 in tokio::runtime::task::harness::poll_future>, alloc::sync::Arc> (core=0x7ffff0055620, cx=...) at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/harness.rs:473 #29 0x000055555583440b in tokio::runtime::task::harness::Harness>, alloc::sync::Arc>::poll_inner>, alloc::sync::Arc> (self=0x7ffff75fe100) at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/harness.rs:208 #30 0x0000555555839683 in tokio::runtime::task::harness::Harness>, alloc::sync::Arc>::poll>, alloc::sync::Arc> (self=...) at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/harness.rs:153 #31 0x00005555557cf07b in tokio::runtime::task::raw::poll>, alloc::sync::Arc> (ptr=...) at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/raw.rs:271 #32 0x0000555555b99ee7 in tokio::runtime::task::raw::RawTask::poll (self=...) at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/raw.rs:201 #33 0x0000555555a11a72 in tokio::runtime::task::LocalNotified>::run> (self=...) at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/mod.rs:416 #34 0x00005555558507db in tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure#0}::{closure#1}>>> () at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/current_thread/mod.rs:706 #35 0x000055555584f973 in tokio::runtime::coop::with_budget<(), tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure#0}::{closure_env#1}>>>> (budget=..., f=...) at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/coop.rs:107 #36 tokio::runtime::coop::budget<(), tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure#0}::{closure_env#1}>>>> (f=...) at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/coop.rs:73 #37 tokio::runtime::scheduler::current_thread::{impl#3}::run_task::{closure#0}<(), tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure#0}::{closure_env#1}>>>> () at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/current_thread/mod.rs:343 #38 0x000055555584f606 in tokio::runtime::scheduler::current_thread::Context::enter<(), tokio::runtime::scheduler::current_thread::{impl#3}::run_task::{closure_env#0}<(), tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure#0}::{closure_env#1}>>>>> (self=0x7ffff75fe938, core=0x7ffff0006550, f=...) at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/current_thread/mod.rs:410 #39 0x000055555584f808 in tokio::runtime::scheduler::current_thread::Context::run_task<(), tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure#0}::{closure_env#1}>>>> (self=0x7ffff75fe938, core=0x7ffff0006550, f=...) at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/current_thread/mod.rs:343 #40 0x0000555555850746 in tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure#0}>>> (core=0x7ffff0006550, context=0x7ffff75fe938) at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/current_thread/mod.rs:705 #41 0x000055555584fe4b in tokio::runtime::scheduler::current_thread::{impl#8}::enter::{closure#0}>>>, core::option::Option<()>> () at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/current_thread/mod.rs:743 #42 0x00005555556cd69d in tokio::runtime::context::scoped::Scoped::set>>>, core::option::Option<()>>, (alloc::boxed::Box, core::option::Option<()>)> (self=0x7ffff76004b8, t=0x7ffff75fe930, f=...) at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/context/scoped.rs:40 #43 0x00005555557c05ab in tokio::runtime::context::set_scheduler::{closure#0}<(alloc::boxed::Box, core::option::Option<()>), tokio::runtime::scheduler::current_thread::{impl#8}::enter::{closure_env#0}>>>, core::option::Option<()>>> (c=0x7ffff7600480) at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/context.rs:176 #44 0x000055555573d982 in std::thread::local::LocalKey::try_with, core::option::Option<()>), tokio::runtime::scheduler::current_thread::{impl#8}::enter::{closure_env#0}>>>, core::option::Option<()>>>, (alloc::boxed::Box, core::option::Option<()>)> (self=0x555556007d28, f=...) at /rustc/1684a753dbca5d23b2e03568e6fbbb48eb72d0e6/library/std/src/thread/local.rs:284 #45 0x000055555573d3aa in std::thread::local::LocalKey::with, core::option::Option<()>), tokio::runtime::scheduler::current_thread::{impl#8}::enter::{closure_env#0}>>>, core::option::Option<()>>>, (alloc::boxed::Box, core::option::Option<()>)> (self=0x555556007d28, f=) at /rustc/1684a753dbca5d23b2e03568e6fbbb48eb72d0e6/library/std/src/thread/local.rs:260 #46 0x00005555557c055f in tokio::runtime::context::set_scheduler<(alloc::boxed::Box, core::option::Option<()>), tokio::runtime::scheduler::current_thread::{impl#8}::enter::{closure_env#0}>>>, core::option::Option<()>>> ( v=0x7ffff75fe930, f=...) at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/context.rs:176 #47 0x000055555584fbe0 in tokio::runtime::scheduler::current_thread::CoreGuard::enter>>>, core::option::Option<()>> (self=..., f=...) at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/current_thread/mod.rs:743 #48 0x000055555584fe6d in tokio::runtime::scheduler::current_thread::CoreGuard::block_on>>> (self=, future=...) at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/current_thread/mod.rs:652 #49 0x000055555584c9ba in tokio::runtime::scheduler::current_thread::{impl#0}::block_on::{closure#0}>> (blocking=0x7ffff75fead0) at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/current_thread/mod.rs:175 #50 0x000055555564e0e7 in tokio::runtime::context::runtime::enter_runtime>>, ()> (handle=0x7ffff75ff340, allow_block_in_place=false, f=...) at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/context/runtime.rs:65 #51 0x000055555584c8de in tokio::runtime::scheduler::current_thread::CurrentThread::block_on>> (self=0x7ffff75ff318, handle=0x7ffff75ff340, future=...) at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/current_thread/mod.rs:167 #52 0x0000555555657acd in tokio::runtime::runtime::Runtime::block_on>> (self=0x7ffff75ff310, future=...) at /home/nazar-pc/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/runtime.rs:348 #53 0x000055555575001c in subspace_cluster_networking::tests::basic () at shared/subspace-cluster-networking/src/tests.rs:155 #54 0x00005555556404a7 in subspace_cluster_networking::tests::basic::{closure#0} () at shared/subspace-cluster-networking/src/tests.rs:59 #55 0x0000555555772db6 in core::ops::function::FnOnce::call_once () at /rustc/1684a753dbca5d23b2e03568e6fbbb48eb72d0e6/library/core/src/ops/function.rs:250 #56 0x00005555558a79b2 in core::ops::function::FnOnce::call_once core::result::Result<(), alloc::string::String>, ()> () at library/core/src/ops/function.rs:250 #57 test::__rust_begin_short_backtrace, fn() -> core::result::Result<(), alloc::string::String>> () at library/test/src/lib.rs:621 #58 0x00005555558a7131 in test::run_test_in_process::{closure#0} () at library/test/src/lib.rs:644 #59 core::panic::unwind_safe::{impl#23}::call_once, test::run_test_in_process::{closure_env#0}> () at library/core/src/panic/unwind_safe.rs:272 #60 std::panicking::try::do_call, core::result::Result<(), alloc::string::String>> () at library/std/src/panicking.rs:552 #61 std::panicking::try, core::panic::unwind_safe::AssertUnwindSafe> () at library/std/src/panicking.rs:516 #62 std::panic::catch_unwind, core::result::Result<(), alloc::string::String>> () at library/std/src/panic.rs:149 #63 test::run_test_in_process () at library/test/src/lib.rs:644 #64 test::run_test::{closure#0} () at library/test/src/lib.rs:567 --Type for more, q to quit, c to continue without paging--c #65 0x000055555586ef84 in test::run_test::{closure#1} () at library/test/src/lib.rs:595 #66 std::sys_common::backtrace::__rust_begin_short_backtrace () at library/std/src/sys_common/backtrace.rs:155 #67 0x0000555555873a3b in std::thread::{impl#0}::spawn_unchecked_::{closure#1}::{closure#0} () at library/std/src/thread/mod.rs:523 #68 core::panic::unwind_safe::{impl#23}::call_once<(), std::thread::{impl#0}::spawn_unchecked_::{closure#1}::{closure_env#0}> () at library/core/src/panic/unwind_safe.rs:272 #69 std::panicking::try::do_call>, ()> () at library/std/src/panicking.rs:552 #70 std::panicking::try<(), core::panic::unwind_safe::AssertUnwindSafe>> () at library/std/src/panicking.rs:516 #71 std::panic::catch_unwind>, ()> () at library/std/src/panic.rs:149 #72 std::thread::{impl#0}::spawn_unchecked_::{closure#1} () at library/std/src/thread/mod.rs:522 #73 core::ops::function::FnOnce::call_once, ()> () at library/core/src/ops/function.rs:250 #74 0x0000555555da841b in alloc::boxed::{impl#48}::call_once<(), dyn core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global> () at library/alloc/src/boxed.rs:2018 #75 alloc::boxed::{impl#48}::call_once<(), alloc::boxed::Box, alloc::alloc::Global>, alloc::alloc::Global> () at library/alloc/src/boxed.rs:2018 #76 std::sys::pal::unix::thread::{impl#2}::new::thread_start () at library/std/src/sys/pal/unix/thread.rs:108 #77 0x00007ffff7c94ac3 in start_thread (arg=) at ./nptl/pthread_create.c:442 #78 0x00007ffff7d26850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 ```

The change in last commit is actually incorrect and causes panic, specifically this code is not allowed to block in single-threaded tokio runtime:

/// Joins async join handle on drop
pub(crate) struct AsyncJoinOnDrop<T>(Option<Fuse<task::JoinHandle<T>>>);

impl<T> Drop for AsyncJoinOnDrop<T> {
    fn drop(&mut self) {
        let handle = self.0.take().expect("Always called exactly once; qed");
        if !handle.is_terminated() {
            task::block_in_place(move || {
                let _ = Handle::current().block_on(handle);
            });
        }
    }
}

I expected to see a panic, but getting memory issues instead.

As shown in https://github.com/tokio-rs/tokio/issues/6452#issuecomment-2031837214 the likely cause is futures crate, FuturesUnordered to be specific.

taiki-e commented 3 weeks ago

Maybe related: https://github.com/rust-lang/futures-rs/issues/2781

nazar-pc commented 3 weeks ago

Pushed another commit into above branch with a lot of code and dependencies removed, while still reproducing an issue.

There are 2 FuturesUnordered usages there in redials and inbound_requests, having just one of them no longer triggers an issue for some reason.

Also if replacing result_receiver.await.map_err(|_| ()) with this:

result_receiver.await.unwrap();
Ok(())

Then I'm getting following interesting error:

thread 'tests::basic' panicked at library/core/src/panicking.rs:223:5:
panic in a destructor during cleanup
thread caused non-unwinding panic. aborting.

Which might be of interest as well.