GreptimeTeam / greptimedb

An Open-Source, Cloud-Native, Unified Time Series Database for Metrics, Events, and Logs with SQL/PromQL supported. Available on GreptimeCloud.
https://greptime.com/
Apache License 2.0

Program panics when profiling #4154

Open cjwcommuny opened 4 weeks ago

cjwcommuny commented 4 weeks ago

What type of bug is this?

Crash

What subsystems are affected?

Standalone mode

Minimal reproduce step

Run the database with the pprof feature enabled:

cargo run --features=pprof -- standalone start

and execute the profile command:

curl 'localhost:4000/v1/prof/cpu?seconds=60&output=text'

What did you expect to see?

No crash.

What did you see instead?

The program crashed.

What operating system did you use?

TencentOS 3.1

What version of GreptimeDB did you use?

main-2faa6d6

Relevant log output and stack trace

2024-06-16T11:54:03.092852Z  INFO log: starting cpu profiler    
2024-06-16T11:55:03.207489Z ERROR common_telemetry::panic_hook: panicked at library/core/src/panicking.rs:215:5:
unsafe precondition(s) violated: slice::from_raw_parts requires the pointer to be aligned and non-null, and the total size of the slice not to exceed `isize::MAX` backtrace=   0: common_telemetry::panic_hook::set_panic_hook::{{closure}}
             at /home/admin/dev/greptimedb/src/common/telemetry/src/panic_hook.rs:37:25
   1: <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call
             at /rustc/f9b16149208c8a8a349c32813312716f6603eb6f/library/alloc/src/boxed.rs:2036:9
      std::panicking::rust_panic_with_hook
             at /rustc/f9b16149208c8a8a349c32813312716f6603eb6f/library/std/src/panicking.rs:799:13
   2: std::panicking::begin_panic_handler::{{closure}}
             at /rustc/f9b16149208c8a8a349c32813312716f6603eb6f/library/std/src/panicking.rs:656:13
   3: std::sys_common::backtrace::__rust_end_short_backtrace
             at /rustc/f9b16149208c8a8a349c32813312716f6603eb6f/library/std/src/sys_common/backtrace.rs:171:18
   4: rust_begin_unwind
             at /rustc/f9b16149208c8a8a349c32813312716f6603eb6f/library/std/src/panicking.rs:652:5
   5: core::panicking::panic_nounwind_fmt::runtime
             at /rustc/f9b16149208c8a8a349c32813312716f6603eb6f/library/core/src/panicking.rs:110:18
      core::panicking::panic_nounwind_fmt
             at /rustc/f9b16149208c8a8a349c32813312716f6603eb6f/library/core/src/panicking.rs:120:5
   6: core::panicking::panic_nounwind
             at /rustc/f9b16149208c8a8a349c32813312716f6603eb6f/library/core/src/panicking.rs:215:5
   7: core::slice::raw::from_raw_parts::precondition_check
             at /rustc/f9b16149208c8a8a349c32813312716f6603eb6f/library/core/src/ub_checks.rs:66:21
   8: core::slice::raw::from_raw_parts
             at /rustc/f9b16149208c8a8a349c32813312716f6603eb6f/library/core/src/slice/raw.rs:96:9
   9: <pprof::collector::TempFdArrayIterator<T> as core::iter::traits::iterator::Iterator>::next
             at /home/admin/.cargo/registry/src/mirrors.tuna.tsinghua.edu.cn-df7c3c540f42cdbd/pprof-0.13.0/src/collector.rs:225:26
  10: core::iter::traits::iterator::Iterator::fold
             at /rustc/f9b16149208c8a8a349c32813312716f6603eb6f/library/core/src/iter/traits/iterator.rs:2586:29
  11: <core::iter::adapters::chain::Chain<A,B> as core::iter::traits::iterator::Iterator>::fold
             at /rustc/f9b16149208c8a8a349c32813312716f6603eb6f/library/core/src/iter/adapters/chain.rs:93:19
  12: core::iter::traits::iterator::Iterator::for_each
             at /rustc/f9b16149208c8a8a349c32813312716f6603eb6f/library/core/src/iter/traits/iterator.rs:817:9
  13: pprof::report::ReportBuilder::build
             at /home/admin/.cargo/registry/src/mirrors.tuna.tsinghua.edu.cn-df7c3c540f42cdbd/pprof-0.13.0/src/report.rs:110:17
  14: servers::http::pprof::nix::Profiling::report::{{closure}}
             at /home/admin/dev/greptimedb/src/servers/src/http/pprof/nix.rs:99:9
  15: servers::http::pprof::handler::pprof_handler::{{closure}}
             at /home/admin/dev/greptimedb/src/servers/src/http/pprof.rs:73:49
  16: <F as axum::handler::Handler<(M,T1),S,B>>::call::{{closure}}
             at /home/admin/.cargo/registry/src/mirrors.tuna.tsinghua.edu.cn-df7c3c540f42cdbd/axum-0.6.20/src/handler/mod.rs:248:53
  17: <core::pin::Pin<P> as core::future::future::Future>::poll
             at /rustc/f9b16149208c8a8a349c32813312716f6603eb6f/library/core/src/future/future.rs:123:9
  18: <futures_util::future::future::map::Map<Fut,F> as core::future::future::Future>::poll

evenyag commented 4 weeks ago

Thanks for your report! I guess this is related to https://github.com/tikv/pprof-rs/issues/232
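
For context, the panic text in the log names the precondition that failed: `slice::from_raw_parts` needs a pointer that is non-null and aligned for the element type. As far as I understand, the standard library only checks this when debug assertions are on, which would match a plain `cargo run` build. Below is a minimal sketch of decoding fixed-size records from a byte buffer without forming a slice over a possibly misaligned pointer. This is not the pprof-rs code; `Sample`, `is_valid_slice_ptr`, and `read_samples` are hypothetical names made up for illustration.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical fixed-size record (an assumption, not a pprof-rs type).
/// `repr(C)` with two `u64` fields gives a 16-byte layout with no padding.
#[derive(Debug, Clone, Copy, PartialEq)]
#[repr(C)]
struct Sample {
    id: u64,
    count: u64,
}

/// Returns true if `ptr` is non-null and aligned for `T`, i.e. it meets the
/// pointer part of the `slice::from_raw_parts` precondition.
fn is_valid_slice_ptr<T>(ptr: *const u8) -> bool {
    !ptr.is_null() && (ptr as usize) % align_of::<T>() == 0
}

/// Decodes `Sample`s from raw bytes one record at a time, so a `&[Sample]`
/// is never built on top of a pointer that might be misaligned.
/// Trailing bytes that do not fill a whole record are ignored.
fn read_samples(bytes: &[u8]) -> Vec<Sample> {
    bytes
        .chunks_exact(size_of::<Sample>())
        .map(|chunk| {
            // SAFETY: `chunk` is exactly `size_of::<Sample>()` bytes long,
            // every bit pattern is a valid `Sample`, and `read_unaligned`
            // has no alignment requirement.
            unsafe { std::ptr::read_unaligned(chunk.as_ptr() as *const Sample) }
        })
        .collect()
}

fn main() {
    let samples = [Sample { id: 1, count: 10 }, Sample { id: 2, count: 20 }];

    // One leading padding byte, so the records start at offset 1 in the buffer.
    let mut buf = vec![0u8];
    for s in &samples {
        buf.extend_from_slice(&s.id.to_ne_bytes());
        buf.extend_from_slice(&s.count.to_ne_bytes());
    }
    let payload = &buf[1..];

    println!(
        "payload pointer valid for &[Sample]: {}",
        is_valid_slice_ptr::<Sample>(payload.as_ptr())
    );

    // Decoding works regardless of how the payload happens to be aligned.
    let decoded = read_samples(payload);
    assert_eq!(decoded, samples);
    println!("{decoded:?}");
}
```

Copying each record out costs a little more than borrowing a slice in place, but it avoids the undefined behavior that the debug check is reporting.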

v0y4g3r commented 4 weeks ago

Maybe you can use cargo flamegraph instead of pprof to collect CPU profile data. @evenyag, can we remove the pprof-related code?