databendlabs / databend

๐——๐—ฎ๐˜๐—ฎ, ๐—”๐—ป๐—ฎ๐—น๐˜†๐˜๐—ถ๐—ฐ๐˜€ & ๐—”๐—œ. Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics. https://databend.com
https://docs.databend.com

refactor(base): add stacktrace to replace backtrace #16643

Closed zhang2014 closed 4 days ago

zhang2014 commented 3 weeks ago

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

refactor(base): add stacktrace to replace backtrace

// Rewrite backtrace capture for Linux ELF binaries using gimli-rs.
//
// Differences from backtrace-rs [https://github.com/rust-lang/backtrace-rs]:
// - Almost lock-free (backtrace-rs requires coarse-grained locks or frequent lock operations)
// - Symbol resolution is lazy: symbols are resolved only when the trace is output
// - Caches all frames of a stack, not just a single frame
// - Outputs the physical addresses of the stack instead of virtual addresses, even when symbols are absent (this lets us use backtraces to find the cause when the symbol table has been split out)
// - Outputs inlined functions and marks them
//
// Differences from gimli-addr2line [https://github.com/gimli-rs/addr2line] (why not use gimli-addr2line):
// - Uses aranges to optimize the lookup of DWARF units (if present)
// - gimli-addr2line caches and sorts the symbol tables to speed up symbol lookup, which would introduce locks and caching (in practice, symbol lookup is a low-frequency operation in databend, and fast reconstruction based on mmap is sufficient).
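The lazy-resolution idea in the list above can be illustrated with a minimal sketch. It is not the PR's implementation: `SymbolTable`, `StackTrace::capture`, `render`, and the load-base arithmetic are hypothetical names and assumptions made for illustration, and a plain map stands in for the ELF/DWARF data that the real code reads with gimli. Capture stores only addresses for every frame; names are looked up when the trace is rendered, and the address is printed even when no symbol is found.

```rust
use std::collections::BTreeMap;

/// Hypothetical symbol table: maps a function's start address to its name.
/// A real implementation would read this from ELF/DWARF data instead.
struct SymbolTable {
    by_start: BTreeMap<u64, &'static str>,
}

impl SymbolTable {
    /// Return the symbol with the greatest start address <= `addr`.
    /// (Simplification: symbol sizes are ignored.)
    fn lookup(&self, addr: u64) -> Option<&'static str> {
        self.by_start.range(..=addr).next_back().map(|(_, name)| *name)
    }
}

/// Holds every frame of one stack as a raw address; no symbol work is done here.
struct StackTrace {
    /// Addresses relative to the load base of the binary (an assumption about
    /// what "physical address" means in the description above).
    frames: Vec<u64>,
}

impl StackTrace {
    /// `ips` are runtime instruction pointers; `load_base` is where the binary
    /// was mapped. Assumes every ip is >= load_base.
    fn capture(ips: &[u64], load_base: u64) -> Self {
        StackTrace {
            frames: ips.iter().map(|ip| ip - load_base).collect(),
        }
    }

    /// Symbols are resolved only now, when the trace is rendered.
    /// Without a symbol table the addresses are still printed.
    fn render(&self, symbols: Option<&SymbolTable>) -> String {
        let mut out = String::new();
        for (i, addr) in self.frames.iter().enumerate() {
            let name = symbols
                .and_then(|s| s.lookup(*addr))
                .unwrap_or("<unknown>");
            out.push_str(&format!("{:>4}: {}@{:x}\n", i, name, addr));
        }
        out
    }
}

fn main() {
    let mut by_start = BTreeMap::new();
    by_start.insert(0x9194000u64, "databend_query::main");
    let symbols = SymbolTable { by_start };

    let base = 0x1000_0000u64;
    let trace = StackTrace::capture(&[base + 0x9194388, base + 0x284c4], base);

    print!("{}", trace.render(None)); // addresses only
    print!("{}", trace.render(Some(&symbols))); // names where available
}
```

Because the captured trace is just a vector of integers, capturing needs no shared state; all symbol-related cost is paid only if the trace is actually printed.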


zhang2014 commented 3 weeks ago

binary size:

ls -lsh ./target/release/databend-query*
332M -rwxr-xr-x 2 ubuntu ubuntu 332M Oct 19 11:07 ./target/release/databend-query
1.2G -rw-r--r-- 1 ubuntu ubuntu 1.2G Oct 19 11:16 ./target/release/databend-query.debug

default:

./target/release/databend-query

   0: backtrace::backtrace::libunwind::trace[inlined]
             at /opt/rust/cargo/git/checkouts/backtrace-rs-fb1f822361417489/72265be/src/backtrace/libunwind.rs:116:5
   1: backtrace::backtrace::trace_unsynchronized[inlined]
             at /opt/rust/cargo/git/checkouts/backtrace-rs-fb1f822361417489/72265be/src/backtrace/mod.rs:66:5
   2: databend_common_exception::exception_backtrace::StackTrace::capture_frames[inlined]
             at /workspace/src/common/exception/src/exception_backtrace.rs:150:13
   3: databend_common_exception::exception_backtrace::StackTrace::capture@50cee64
             at /workspace/src/common/exception/src/exception_backtrace.rs:143:9
   4: databend_query::main@9194388
             at /workspace/src/binaries/query/ee_main.rs:42:23
   5: core::ops::function::FnOnce::call_once[inlined]
             at /rustc/cf2df68d1f5e56803c97d91e2b1a9f1c9923c533/library/core/src/ops/function.rs:250:5
   6: std::sys::backtrace::__rust_begin_short_backtrace@9194c24

Remove the debug file:

rm ./target/release/databend-query.debug 
./target/release/databend-query

   0: <unknown>@50cee64
   1: databend_query::main::hb7b11b0ec1f24acb@9194388
   2: <unknown>@9194c24
   3: <unknown>@919d644
   4: <unknown>@a677100
   5: <unknown>@9194844
   6: <unknown>@284c4
   7: __libc_start_main@28598
   8: <unknown>@414d034

Use addr2line to resolve the addresses:

addr2line -e  ./target/databend-query.debug -a 50cee64 -a 9194388 -a 9194c24 -f -i -C
0x00000000050cee64
databend_common_exception::exception_backtrace::StackTrace::capture
/workspace/src/common/exception/src/exception_backtrace.rs:144
0x0000000009194388
databend_query::main
/workspace/src/binaries/query/ee_main.rs:44
0x0000000009194c24
std::sys::backtrace::__rust_begin_short_backtrace
/rustc/cf2df68d1f5e56803c97d91e2b1a9f1c9923c533/library/std/src/sys/backtrace.rs:161
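The manual step above (copying addresses out of a stripped backtrace and passing them to addr2line) could be scripted. The following is a rough sketch under stated assumptions: the helper names `frame_addresses` and `addr2line_args` are hypothetical, and the parsing assumes each frame line ends in `@<hex address>` as in the output shown earlier. Since `-a` is a boolean flag in GNU addr2line and addresses are positional, the sketch passes `-a` once and lists the addresses at the end.

```rust
/// Extract the hex address after the last '@' on each backtrace line.
/// Lines without a trailing "@<hex>" (for example "at <file>:<line>") are skipped.
fn frame_addresses(backtrace: &str) -> Vec<String> {
    backtrace
        .lines()
        .filter_map(|line| line.rsplit_once('@').map(|(_, addr)| addr.trim()))
        .filter(|addr| !addr.is_empty() && addr.chars().all(|c| c.is_ascii_hexdigit()))
        .map(str::to_owned)
        .collect()
}

/// Build an addr2line argument list: -e <file> -a -f -i -C <addr>...
fn addr2line_args(debug_file: &str, addrs: &[String]) -> Vec<String> {
    let mut args = vec!["-e".to_string(), debug_file.to_string()];
    args.extend(["-a", "-f", "-i", "-C"].iter().map(|s| s.to_string()));
    args.extend(addrs.iter().cloned());
    args
}

fn main() {
    let stripped = "   0: <unknown>@50cee64\n   \
                    1: databend_query::main::hb7b11b0ec1f24acb@9194388\n   \
                    2: <unknown>@9194c24\n";
    let addrs = frame_addresses(stripped);
    let args = addr2line_args("./target/databend-query.debug", &addrs);
    println!("addr2line {}", args.join(" "));
}
```

The printed command can be pasted into a shell, or the argument list can be handed to `std::process::Command` to run addr2line directly.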

Restore the debug file:

cp target/databend-query.debug target/release/databend-query.debug
./target/release/databend-query

   0: backtrace::backtrace::libunwind::trace[inlined]
             at /opt/rust/cargo/git/checkouts/backtrace-rs-fb1f822361417489/72265be/src/backtrace/libunwind.rs:116:5
   1: backtrace::backtrace::trace_unsynchronized[inlined]
             at /opt/rust/cargo/git/checkouts/backtrace-rs-fb1f822361417489/72265be/src/backtrace/mod.rs:66:5
   2: databend_common_exception::exception_backtrace::StackTrace::capture_frames[inlined]
             at /workspace/src/common/exception/src/exception_backtrace.rs:150:13
   3: databend_common_exception::exception_backtrace::StackTrace::capture@50cee64
             at /workspace/src/common/exception/src/exception_backtrace.rs:143:9
   4: databend_query::main@9194388
             at /workspace/src/binaries/query/ee_main.rs:42:23
   5: core::ops::function::FnOnce::call_once[inlined]
             at /rustc/cf2df68d1f5e56803c97d91e2b1a9f1c9923c533/library/core/src/ops/function.rs:250:5
   6: std::sys::backtrace::__rust_begin_short_backtrace@9194c24

zhang2014 commented 1 week ago

Tests passed on aarch64-unknown-linux-gnu and x86_64-unknown-linux-musl.

andylokandy commented 1 week ago

@zhang2014 Good job! Would you consider publishing it as a crate?

zhang2014 commented 1 week ago

> @zhang2014 Good job! Would you consider publishing it as a crate?

It's only been rewritten for ELF, so I think it may not be able to handle all scenarios.

Xuanwo commented 1 week ago

> be able to handle all scenarios.

A crate doesn't need to handle all scenarios. It's quite useful even if it's only rewritten for ELF. I encourage publishing it as a crate, which may attract other contributors and make it much easier to test and reuse.