tikv / rust-prometheus

Prometheus instrumentation library for Rust applications
Apache License 2.0

panic in processcollector #443

Closed wexgjduv closed 1 year ago

wexgjduv commented 2 years ago

Hi, I'm using this library to export metrics in Pulsar Elasticsearch Sync. However, ProcessCollector occasionally causes a tokio runtime worker to panic. Below is the backtrace:

thread 'tokio-runtime-worker' panicked at 'Once instance has previously been poisoned', library/std/src/sync/once.rs:393:21
stack backtrace:
   0:     0x55b11f16f11f - std::backtrace_rs::backtrace::libunwind::trace::h093d4af0eabdfc15
                               at /rustc/02072b482a8b5357f7fb5e5637444ae30e423c40/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
   1:     0x55b11f16f11f - std::backtrace_rs::backtrace::trace_unsynchronized::h2b90813d74c759ca
                               at /rustc/02072b482a8b5357f7fb5e5637444ae30e423c40/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:     0x55b11f16f11f - std::sys_common::backtrace::_print_fmt::hfaa8856bf3eca13f
                               at /rustc/02072b482a8b5357f7fb5e5637444ae30e423c40/library/std/src/sys_common/backtrace.rs:67:5
   3:     0x55b11f16f11f - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h0cbaef3adcb5a454
                               at /rustc/02072b482a8b5357f7fb5e5637444ae30e423c40/library/std/src/sys_common/backtrace.rs:46:22
   4:     0x55b11ed503dc - core::fmt::write::h35a8eb836b847360
                               at /rustc/02072b482a8b5357f7fb5e5637444ae30e423c40/library/core/src/fmt/mod.rs:1149:17
   5:     0x55b11f16d964 - std::io::Write::write_fmt::h45f2b8390f189782
                               at /rustc/02072b482a8b5357f7fb5e5637444ae30e423c40/library/std/src/io/mod.rs:1697:15
   6:     0x55b11f16e0b3 - std::sys_common::backtrace::_print::h56f62073b0e62985
                               at /rustc/02072b482a8b5357f7fb5e5637444ae30e423c40/library/std/src/sys_common/backtrace.rs:49:5
   7:     0x55b11f16e0b3 - std::sys_common::backtrace::print::h152fba05ec38941b
                               at /rustc/02072b482a8b5357f7fb5e5637444ae30e423c40/library/std/src/sys_common/backtrace.rs:36:9
   8:     0x55b11f16e0b3 - std::panicking::default_hook::{{closure}}::ha3121a0b8482251f
                               at /rustc/02072b482a8b5357f7fb5e5637444ae30e423c40/library/std/src/panicking.rs:211:50
   9:     0x55b11f16d42d - std::panicking::default_hook::hde5d78c11ae3b8f6
                               at /rustc/02072b482a8b5357f7fb5e5637444ae30e423c40/library/std/src/panicking.rs:228:9
  10:     0x55b11f16d42d - std::panicking::rust_panic_with_hook::he6f55c3e7ed1777c
                               at /rustc/02072b482a8b5357f7fb5e5637444ae30e423c40/library/std/src/panicking.rs:606:17
  11:     0x55b11f16cf54 - std::panicking::begin_panic::{{closure}}::h0dff6a8ffa2ba94e
                               at /rustc/02072b482a8b5357f7fb5e5637444ae30e423c40/library/std/src/panicking.rs:526:9
  12:     0x55b11f16cf26 - std::sys_common::backtrace::__rust_end_short_backtrace::h3325e53bf3c034be
                               at /rustc/02072b482a8b5357f7fb5e5637444ae30e423c40/library/std/src/sys_common/backtrace.rs:139:18
  13:     0x55b11ec354e6 - std::panicking::begin_panic::h12098f5c3c227e43
                               at /rustc/02072b482a8b5357f7fb5e5637444ae30e423c40/library/std/src/panicking.rs:525:12
  14:     0x55b11ec368cc - std::sync::once::Once::call_inner::h8624e8ff74f304d5
                               at /rustc/02072b482a8b5357f7fb5e5637444ae30e423c40/library/std/src/sync/once.rs:393:21
  15:     0x55b11ee0739b - procfs::process::stat::Stat::from_reader::h3ad29797ca5fd3ba
  16:     0x55b11ee3eb4b - <prometheus::process_collector::ProcessCollector as prometheus::metrics::Collector>::collect::h6ee1066b89dc408b
  17:     0x55b11ee34241 - prometheus::registry::Registry::gather::he4484511235faf31
  18:     0x55b11ef69daa - scoped_tls::ScopedKey<T>::set::ha882b1bcd053bb2e
  19:     0x55b11ef81fc0 - hyper::proto::h1::dispatch::Dispatcher<D,Bs,I,T>::poll_inner::h9c90445e9787497b
  20:     0x55b11ef23d53 - <hyper::server::conn::spawn_all::NewSvcTask<I,N,S,E,W> as core::future::future::Future>::poll::hb39dbc8abafa9b90
  21:     0x55b11ef79c55 - tokio::runtime::task::raw::poll::h2682bf19a251c2e9
  22:     0x55b11f1ae1d3 - tokio::runtime::thread_pool::worker::Context::run_task::h7bc4b40300593e4e
  23:     0x55b11f1a10f0 - tokio::runtime::task::raw::poll::hd19bf7c76a3c5485
  24:     0x55b11f1a246d - std::sys_common::backtrace::__rust_begin_short_backtrace::h6730a900226327c1
  25:     0x55b11f1b2dbd - core::ops::function::FnOnce::call_once{{vtable.shim}}::h5d3fa9f1c0f11836
  26:     0x55b11f195f05 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h3604301cdaaa9dbf
                               at /rustc/02072b482a8b5357f7fb5e5637444ae30e423c40/library/alloc/src/boxed.rs:1694:9
  27:     0x55b11f195f05 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h4cf736d2de892eff
                               at /rustc/02072b482a8b5357f7fb5e5637444ae30e423c40/library/alloc/src/boxed.rs:1694:9
  28:     0x55b11f195f05 - std::sys::unix::thread::Thread::new::thread_start::h71a82d4ee5b02c9b
                               at /rustc/02072b482a8b5357f7fb5e5637444ae30e423c40/library/std/src/sys/unix/thread.rs:106:17
  29:     0x7f65d0c58fa3 - start_thread
  30:     0x7f65d09e74cf - clone
  31:                0x0 - <unknown>

I'm using prometheus 0.12.

Thanks

breezewish commented 2 years ago

It seems this is not the first panic: once the initial closure passed to the Once has panicked, the Once is poisoned and every later call panics as well. Could you find any earlier panic messages that are not reported as "poisoned"?
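
To illustrate the poisoning behaviour, here is a minimal sketch that uses only the standard library. It is not the procfs or rust-prometheus code. The helper names `init_or_panic` and `panic_message` are hypothetical, and the failing initializer merely imitates an `unwrap()` on an `Err` value.

```rust
use std::panic::{self, AssertUnwindSafe};
use std::sync::Once;

/// Hypothetical one-time initializer whose closure always panics,
/// standing in for an initializer that unwraps an `Err` value.
fn init_or_panic(once: &Once) {
    once.call_once(|| {
        let parsed: Result<u32, String> = Err("simulated parse failure".to_string());
        parsed.unwrap();
    });
}

/// Calls `init_or_panic` and returns the panic message, if it panicked.
fn panic_message(once: &Once) -> Option<String> {
    match panic::catch_unwind(AssertUnwindSafe(|| init_or_panic(once))) {
        Ok(()) => None,
        Err(payload) => {
            // Panic payloads are usually either `&'static str` or `String`.
            if let Some(s) = payload.downcast_ref::<&str>() {
                Some(s.to_string())
            } else if let Some(s) = payload.downcast_ref::<String>() {
                Some(s.clone())
            } else {
                Some("<non-string panic payload>".to_string())
            }
        }
    }
}

fn main() {
    let once = Once::new();

    // First call: the closure itself panics, which poisons the Once.
    let first = panic_message(&once);
    // Second call: the closure never runs; call_once panics because of the poison.
    let second = panic_message(&once);

    println!("first:  {:?}", first);
    println!("second: {:?}", second);
    // A poisoned Once never counts as completed.
    println!("completed: {}", once.is_completed());
}
```

Only the first call reports the real cause. Every later call reports the poisoning instead, so the original message has to be found in earlier logs.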

wexgjduv commented 2 years ago

Thanks @breezewish for your reply. I could find some panic messages like:

thread 'tokio-runtime-worker' panicked at 'called `Result::unwrap()` on an `Err` value: Other("Failed to parse patch version")', /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/procfs-0.9.1/src/lib.rs:303:34

But I'm not sure these panics are related.

Most of the time, 'Once instance has previously been poisoned' is the ONLY panic.

lucab commented 2 years ago

I think this is a duplicate of https://github.com/tikv/rust-prometheus/issues/414, which should already be fixed in the latest release.

lucab commented 1 year ago

Closing this report, as it is supposedly already fixed.