tokio-rs / bytes

Utilities for working with bytes
MIT License

Core dump sometimes occurs when dropping Shared memory. #564

Open Chengwch opened 2 years ago

Chengwch commented 2 years ago

When I use bytes to parse HTTP/2 request headers, I sometimes get a core dump, and I don't know what the cause might be. I'm not sure whether the latest version still has this problem.

bytes version: 0.5.4
h2 version: 0.2.4

The GDB trace is as follows:

#0  0x000055f2e19380d4 in bytes::bytes_mut::shared_v_drop () at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/panicking.rs:154
#1  0x000055f2e18e351d in h2::hpack::decoder::Decoder::decode () at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/panicking.rs:154
#2  0x000055f2e18f5d28 in h2::frame::headers::HeaderBlock::load () at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/panicking.rs:154
#3  0x000055f2e13de55c in <h2::codec::framed_read::FramedRead<T> as futures_core::stream::Stream>::poll_next () at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/panicking.rs:154
#4  0x000055f2e0f65de3 in h2::proto::connection::Connection<T,P,B>::poll () at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/panicking.rs:154
#5  0x000055f2e100cbda in <futures_util::future::poll_fn::PollFn<F> as core::future::future::Future>::poll () at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/panicking.rs:154
#6  0x000055f2e14c4910 in <futures_util::future::select::Select<A,B> as core::future::future::Future>::poll () at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/panicking.rs:154
#7  0x000055f2e108f923 in <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll () at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/panicking.rs:154
#8  0x000055f2e0f4865d in tokio::task::core::Core<T>::poll () at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/panicking.rs:154
#9  0x000055f2e0d3fb22 in tokio::task::harness::Harness<T,S>::poll () at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/panicking.rs:154
#10 0x000055f2e18fe187 in tokio::runtime::thread_pool::worker::GenerationGuard::run_task () at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/panicking.rs:154
#11 0x000055f2e18fd96b in tokio::runtime::thread_pool::worker::GenerationGuard::run () at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/panicking.rs:154
#12 0x000055f2e1905809 in std::thread::local::LocalKey<T>::with () at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/panicking.rs:154
#13 0x000055f2e18fd807 in tokio::runtime::thread_pool::worker::Worker::run () at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/panicking.rs:154
#14 0x000055f2e1906f67 in tokio::task::core::Core<T>::poll () at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/panicking.rs:154
#15 0x000055f2e191b2c3 in tokio::task::harness::Harness<T,S>::poll () at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/panicking.rs:154
#16 0x000055f2e19009eb in tokio::runtime::context::enter () at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/panicking.rs:154
#17 0x000055f2e191ac07 in std::sys_common::backtrace::__rust_begin_short_backtrace () at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/panicking.rs:154
#18 0x000055f2e18ffa58 in core::ops::function::FnOnce::call_once{{vtable-shim}} () at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/panicking.rs:154
#19 0x000055f2e1978033 in <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once () at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/alloc/src/boxed.rs:1854
#20 <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once () at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/alloc/src/boxed.rs:1854
#21 std::sys::unix::thread::Thread::new::thread_start () at library/std/src/sys/unix/thread.rs:108
(gdb) disassemble
Dump of assembler code for function _ZN5bytes9bytes_mut13shared_v_drop17hde8cf96771c42cfaE.llvm.13582749843687747861:
   0x000055f2e19380d0 <+0>: push   %rbx
   0x000055f2e19380d1 <+1>: mov    (%rdi),%rbx
=> 0x000055f2e19380d4 <+4>: lock subq $0x1,0x20(%rbx)
   0x000055f2e19380da <+10>: jne    0x55f2e1938107 <_ZN5bytes9bytes_mut13shared_v_drop17hde8cf96771c42cfaE.llvm.13582749843687747861+55>
   0x000055f2e19380dc <+12>: mov    0x8(%rbx),%rsi
   0x000055f2e19380e0 <+16>: test   %rsi,%rsi
   0x000055f2e19380e3 <+19>: je     0x55f2e19380f3 <_ZN5bytes9bytes_mut13shared_v_drop17hde8cf96771c42cfaE.llvm.13582749843687747861+35>
   0x000055f2e19380e5 <+21>: mov    (%rbx),%rdi
   0x000055f2e19380e8 <+24>: mov    $0x1,%edx
   0x000055f2e19380ed <+29>: callq  *0x8d867d(%rip)        # 0x55f2e2210770
   0x000055f2e19380f3 <+35>: mov    $0x28,%esi
   0x000055f2e19380f8 <+40>: mov    $0x8,%edx
   0x000055f2e19380fd <+45>: mov    %rbx,%rdi
   0x000055f2e1938100 <+48>: pop    %rbx
   0x000055f2e1938101 <+49>: jmpq   *0x8d8669(%rip)        # 0x55f2e2210770
   0x000055f2e1938107 <+55>: pop    %rbx
   0x000055f2e1938108 <+56>: retq
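
Reading the dump: the faulting instruction, lock subq $0x1,0x20(%rbx), is an atomic decrement of a counter at offset 0x20 of the pointer loaded from (%rdi), and the deallocation further down uses size 0x28 and alignment 8. That is consistent with a reference-count release on a 40-byte shared header, so a fault at this point suggests the header pointer itself was already invalid (for example freed or overwritten) when the drop ran. Below is a minimal, standard-library-only sketch of that release pattern. It is not the bytes crate's actual source: the Shared field layout, release_shared, and demo are illustrative assumptions.

```rust
use std::sync::atomic::{fence, AtomicUsize, Ordering};

// Hypothetical stand-in for the shared header; the field names and order are
// assumptions chosen to match the offsets seen in the disassembly on a
// 64-bit target (counter at 0x20, total size 0x28).
#[allow(dead_code)]
struct Shared {
    vec: Vec<u8>,
    original_capacity_repr: usize,
    ref_count: AtomicUsize,
}

/// Drops one reference. Returns true if this call freed the allocation.
///
/// Safety: `ptr` must come from `Box::into_raw` and must still hold at least
/// one live reference. If it is dangling, the `fetch_sub` below touches freed
/// memory, which is the kind of access that faults at `lock subq`.
unsafe fn release_shared(ptr: *mut Shared) -> bool {
    // Atomic decrement; `fetch_sub` returns the value before the subtraction.
    if unsafe { (*ptr).ref_count.fetch_sub(1, Ordering::Release) } != 1 {
        return false; // other handles still exist
    }
    // Synchronize with all earlier releases before freeing.
    fence(Ordering::Acquire);
    // Last reference: rebuild the Box so the Vec and the header are freed.
    drop(unsafe { Box::from_raw(ptr) });
    true
}

/// Creates a header with two references and releases both.
fn demo() -> (bool, bool) {
    let shared = Box::into_raw(Box::new(Shared {
        vec: Vec::with_capacity(64),
        original_capacity_repr: 0,
        ref_count: AtomicUsize::new(2),
    }));
    unsafe {
        let first = release_shared(shared); // count 2 -> 1, nothing freed
        let second = release_shared(shared); // count 1 -> 0, header freed
        (first, second)
    }
}
```

Calling release_shared a third time on the same pointer in demo would be a use-after-free. In a real program that kind of extra release, or any write that corrupts the stored pointer, is one possible way to end up faulting on the decrement.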

Darksonn commented 2 years ago

It would be very useful to know if this is a bug that has already been fixed in a newer version of bytes or not. You may be able to use cargo patch to use the new bytes without upgrading the other crates. There are not very many breaking changes between the versions (you can see the list in our changelog), so I don't think it would be that difficult to adjust bytes or your other dependencies to work with bytes v1.

Chengwch commented 2 years ago

Thanks for your reply!

I read the change history of bytes from v0.5.4 to 1.1 and found no updates that might be relevant to this problem.

I tried to upgrade bytes, but it is such a basic crate that many other crates depend on it. During the upgrade I ran into compilation errors in dependencies such as http2, hyper, tokio, and tokio-util, mainly caused by the renaming of the bytes() function and the change to the UninitSlice structure. Even after fixing those errors, other dependencies still failed to compile. The problem occurs on the earlier versions, so the upgrade approach is unworkable for me.

My project uses many crates, and this problem only occurs in a single production environment, which makes it difficult to debug. That is why I am asking the community for possible directions to investigate.

lidong14 commented 2 years ago

@Darksonn, do you have any suggestions for @Chengwch's problem?

Darksonn commented 2 years ago

I'm not able to determine what the issue is from what's written here. It would require someone to sit down and spend a bunch of time debugging the issue. I have already spent a long time looking at the implementation for other purposes, and unless there's a PR fixing the issue in a newer version, then I didn't spot the issue when doing so.

If you're able to create a small example that triggers the problem, then that would be useful, but I realize that this is not easy either.