Open liujingchen opened 4 years ago
Hi, I just found that our real code actually also exits with `std::process::exit(0)` when the segfault happens. Therefore, an unclean shutdown via `std::process::exit(0)` should be the cause of this problem. The reason we have this unclean shutdown is that we are using the argument parser clap, a popular library for command-line applications. Clap exits the process if you pass the arguments `--version` or `--help`.
Here is a code example to reproduce:
`Cargo.toml`:
[package]
name = "..."
version = "0.1.0"
authors = ["..."]
edition = "2018"
[dependencies]
sentry = "0.20.1"
clap = "2.33.3"
[[bin]]
name = "mytest"
path = "src/main.rs"
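`src/main.rs`:
fn main() {
    // The guard is meant to live until the end of main and be dropped there.
    let _guard = sentry::init("https://key@sentry.io/42");
    let millis = std::time::Duration::from_millis(1);
    std::thread::sleep(millis);
    // With --version or --help, clap exits the process inside get_matches(),
    // so _guard is never dropped.
    clap::App::new("mytest").version("1.0").get_matches();
}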
Build it with `cargo build --release`, and run:
$ bash -c "set -e; for n in {1..100};do ./target/release/mytest --version; done"
mytest 1.0
mytest 1.0
bash: line 1: 2603054 Segmentation fault (core dumped) ./target/release/mytest --version
clap has alternative methods that do not exit the program when parsing the arguments, so we can work around the problem. But I think it is still better to fix the issue in sentry if possible, since a randomly occurring segmentation fault is pretty bad anyway. Or at least a warning should be added to sentry-rust's documentation telling people to avoid exiting the process before `_guard` is cleanly dropped.
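The workaround described above can be sketched with the standard library alone. This is a minimal illustration, not sentry's or clap's actual code: `FlushGuard` is a hypothetical stand-in for the guard returned by `sentry::init`, and `parse` is a hand-rolled stand-in for a non-exiting parse (in clap 2.x, `App::get_matches_safe` returns a `Result` instead of exiting). The point is only that `std::process::exit` does not run destructors, while returning from `main` does.

```rust
use std::process::ExitCode;
use std::sync::atomic::{AtomicBool, Ordering};

// Records whether the guard's destructor ran (for illustration only).
static GUARD_DROPPED: AtomicBool = AtomicBool::new(false);

// Hypothetical stand-in for the guard returned by `sentry::init`.
struct FlushGuard;

impl Drop for FlushGuard {
    fn drop(&mut self) {
        GUARD_DROPPED.store(true, Ordering::SeqCst);
    }
}

// Hypothetical result of a parse that reports what to do instead of exiting.
enum Parsed {
    PrintVersion,
    Run,
}

// Hand-rolled stand-in for a non-exiting argument parser.
fn parse(args: &[&str]) -> Parsed {
    if args.iter().any(|a| *a == "--version") {
        Parsed::PrintVersion
    } else {
        Parsed::Run
    }
}

// All work happens here, so the guard is dropped when this function returns.
fn run(args: &[&str]) -> u8 {
    let _guard = FlushGuard;
    match parse(args) {
        Parsed::PrintVersion => println!("mytest 1.0"),
        Parsed::Run => { /* real work would go here */ }
    }
    0
}

fn main() -> ExitCode {
    let owned: Vec<String> = std::env::args().skip(1).collect();
    let args: Vec<&str> = owned.iter().map(String::as_str).collect();
    // Returning an ExitCode, rather than calling std::process::exit,
    // lets destructors run before the process terminates.
    ExitCode::from(run(&args))
}
```

Keeping the guard inside `run` rather than `main` means that even a later `std::process::exit` in `main` would happen only after the guard has been dropped.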
I’m able to reproduce, and just managed to have it segfault in gdb:
Thread 3 "reqwest-interna" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff76d8640 (LWP 82620)]
0x00007ffff7d89f40 in ?? () from /usr/lib/libcrypto.so.1.1
(gdb) bt
#0 0x00007ffff7d89f40 in ?? () from /usr/lib/libcrypto.so.1.1
#1 0x00007ffff7daeeb5 in ?? () from /usr/lib/libcrypto.so.1.1
#2 0x00007ffff7daf1b6 in OPENSSL_LH_insert () from /usr/lib/libcrypto.so.1.1
#3 0x00007ffff7d8a4ec in ERR_load_strings_const () from /usr/lib/libcrypto.so.1.1
#4 0x00007ffff7de15da in ERR_load_RSA_strings () from /usr/lib/libcrypto.so.1.1
#5 0x00007ffff7d8b808 in ?? () from /usr/lib/libcrypto.so.1.1
#6 0x00007ffff7dac86a in ?? () from /usr/lib/libcrypto.so.1.1
#7 0x00007ffff7c1918f in __pthread_once_slow () from /usr/lib/libpthread.so.0
#8 0x00007ffff7e194ea in CRYPTO_THREAD_run_once () from /usr/lib/libcrypto.so.1.1
#9 0x00007ffff7dacdb6 in OPENSSL_init_crypto () from /usr/lib/libcrypto.so.1.1
#10 0x00007ffff7d8ab6a in ERR_get_state () from /usr/lib/libcrypto.so.1.1
#11 0x00007ffff7d8acaa in ERR_clear_error () from /usr/lib/libcrypto.so.1.1
#12 0x00007ffff7dac77a in ?? () from /usr/lib/libcrypto.so.1.1
#13 0x00007ffff7c1918f in __pthread_once_slow () from /usr/lib/libpthread.so.0
#14 0x00007ffff7e194ea in CRYPTO_THREAD_run_once () from /usr/lib/libcrypto.so.1.1
#15 0x00007ffff7dacf94 in OPENSSL_init_crypto () from /usr/lib/libcrypto.so.1.1
#16 0x00007ffff7d0fecb in ?? () from /usr/lib/libcrypto.so.1.1
#17 0x00007ffff7dac7f1 in ?? () from /usr/lib/libcrypto.so.1.1
#18 0x00007ffff7c1918f in __pthread_once_slow () from /usr/lib/libpthread.so.0
#19 0x00007ffff7e194ea in CRYPTO_THREAD_run_once () from /usr/lib/libcrypto.so.1.1
#20 0x00007ffff7dacefb in OPENSSL_init_crypto () from /usr/lib/libcrypto.so.1.1
#21 0x00007ffff7f42202 in OPENSSL_init_ssl () from /usr/lib/libssl.so.1.1
#22 0x0000555555b8f490 in openssl_sys::init::{{closure}} () at /home/swatinem/.cargo/registry/src/github.com-1ecc6299db9ec823/openssl-sys-0.9.58/src/lib.rs:105
#23 0x0000555555b8f630 in std::sync::once::Once::call_once::{{closure}} () at /nosnap/dot-rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/sync/once.rs:264
#24 0x0000555555ecb81a in std::sync::once::Once::call_inner () at src/libstd/sync/once.rs:416
#25 0x0000555555b8f5a9 in std::sync::once::Once::call_once (self=0x555556249330 <openssl_sys::init::INIT>, f=...)
at /nosnap/dot-rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/sync/once.rs:264
#26 0x0000555555b8f471 in openssl_sys::init () at /home/swatinem/.cargo/registry/src/github.com-1ecc6299db9ec823/openssl-sys-0.9.58/src/lib.rs:104
#27 0x0000555555b7d01f in openssl::ssl::SslContextBuilder::new (method=...) at /home/swatinem/.cargo/registry/src/github.com-1ecc6299db9ec823/openssl-0.10.30/src/ssl/mod.rs:661
#28 0x0000555555b8785d in openssl::ssl::connector::ctx (method=...) at /home/swatinem/.cargo/registry/src/github.com-1ecc6299db9ec823/openssl-0.10.30/src/ssl/connector.rs:25
#29 0x0000555555b87ae9 in openssl::ssl::connector::SslConnector::builder (method=...) at /home/swatinem/.cargo/registry/src/github.com-1ecc6299db9ec823/openssl-0.10.30/src/ssl/connector.rs:67
#30 0x0000555555b72654 in native_tls::imp::TlsConnector::new (builder=0x7ffff76d2c60) at /home/swatinem/.cargo/registry/src/github.com-1ecc6299db9ec823/native-tls-0.2.4/src/imp/openssl.rs:257
#31 0x0000555555b73758 in native_tls::TlsConnectorBuilder::build (self=0x7ffff76d2c60) at /home/swatinem/.cargo/registry/src/github.com-1ecc6299db9ec823/native-tls-0.2.4/src/lib.rs:414
#32 0x0000555555ad6c98 in reqwest::connect::Connector::new_default_tls (http=..., tls=..., proxies=..., user_agent=..., local_addr=..., nodelay=true)
at /home/swatinem/.cargo/registry/src/github.com-1ecc6299db9ec823/reqwest-0.10.8/src/connect.rs:166
#33 0x0000555555af4e95 in reqwest::async_impl::client::ClientBuilder::build (self=...) at /home/swatinem/.cargo/registry/src/github.com-1ecc6299db9ec823/reqwest-0.10.8/src/async_impl/client.rs:215
#34 0x0000555555a0c9d5 in reqwest::blocking::client::ClientHandle::new::{{closure}}::{{closure}} ()
at /home/swatinem/.cargo/registry/src/github.com-1ecc6299db9ec823/reqwest-0.10.8/src/blocking/client.rs:735
#35 0x00005555559e86ea in <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll (self=..., cx=0x7ffff76d4550)
at /nosnap/dot-rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore/future/mod.rs:78
#36 0x0000555555a97b85 in tokio::runtime::basic_scheduler::BasicScheduler<P>::block_on::{{closure}}::{{closure}} ()
at /home/swatinem/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.21/src/runtime/basic_scheduler.rs:131
#37 0x00005555559ae7fb in tokio::coop::with_budget::{{closure}} (cell=0x7ffff76d82c2) at /home/swatinem/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.21/src/coop.rs:127
#38 0x0000555555a86397 in std::thread::local::LocalKey<T>::try_with (self=0x5555561cde40, f=...)
at /nosnap/dot-rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/thread/local.rs:263
#39 0x0000555555a85cce in std::thread::local::LocalKey<T>::with (self=0x5555561cde40, f=...)
at /nosnap/dot-rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/thread/local.rs:239
#40 0x0000555555a97459 in tokio::coop::with_budget (budget=..., f=...) at /home/swatinem/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.21/src/coop.rs:120
#41 tokio::coop::budget (f=...) at /home/swatinem/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.21/src/coop.rs:96
#42 tokio::runtime::basic_scheduler::BasicScheduler<P>::block_on::{{closure}} (scheduler=0x7ffff76d5d68, context=0x7ffff76d4cd0)
at /home/swatinem/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.21/src/runtime/basic_scheduler.rs:131
#43 0x0000555555a97f61 in tokio::runtime::basic_scheduler::enter::{{closure}} () at /home/swatinem/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.21/src/runtime/basic_scheduler.rs:213
#44 0x0000555555b04ff6 in tokio::macros::scoped_tls::ScopedKey<T>::set (self=0x5555561e3230 <tokio::runtime::basic_scheduler::CURRENT>, t=0x7ffff76d4cd0, f=...)
at /home/swatinem/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.21/src/macros/scoped_tls.rs:63
#45 0x0000555555a97e36 in tokio::runtime::basic_scheduler::enter (scheduler=0x7ffff76d5d68, f=...)
at /home/swatinem/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.21/src/runtime/basic_scheduler.rs:213
#46 0x0000555555a971ce in tokio::runtime::basic_scheduler::BasicScheduler<P>::block_on (self=0x7ffff76d5d68, future=...)
at /home/swatinem/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.21/src/runtime/basic_scheduler.rs:123
#47 0x00005555559c3f4e in tokio::runtime::Runtime::block_on::{{closure}} () at /home/swatinem/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.21/src/runtime/mod.rs:444
#48 0x00005555559c3feb in tokio::runtime::context::enter (new=..., f=...) at /home/swatinem/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.21/src/runtime/context.rs:72
#49 0x00005555559f537e in tokio::runtime::handle::Handle::enter (self=0x7ffff76d5e08, f=...) at /home/swatinem/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.21/src/runtime/handle.rs:76
#50 0x00005555559c3dfe in tokio::runtime::Runtime::block_on (self=0x7ffff76d5d60, future=...) at /home/swatinem/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.21/src/runtime/mod.rs:441
#51 0x0000555555a0e1aa in reqwest::blocking::client::ClientHandle::new::{{closure}} () at /home/swatinem/.cargo/registry/src/github.com-1ecc6299db9ec823/reqwest-0.10.8/src/blocking/client.rs:760
#52 0x0000555555afd745 in std::sys_common::backtrace::__rust_begin_short_backtrace (f=...)
So… looks like the crash comes from openssl, somewhere inside reqwest. I will play around with this a bit and see if I can reproduce this in reqwest directly.
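The backtrace shows the fault on a background thread ("reqwest-interna") that is still initializing while the process exits. One plausible reading, not confirmed in this thread, is that the worker is simply still running when the process is torn down. The sketch below shows the general remedy of joining a worker before exit. Everything in it is hypothetical: `Worker` is an illustrative stand-in for a transport thread and is not how sentry or reqwest are implemented.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

// Hypothetical background worker, loosely analogous to a transport thread.
struct Worker {
    tx: Option<Sender<String>>,
    handle: Option<JoinHandle<usize>>,
}

impl Worker {
    fn start() -> Worker {
        let (tx, rx) = mpsc::channel::<String>();
        // The thread drains the channel until every Sender is dropped,
        // then returns the number of messages it received.
        let handle = thread::spawn(move || rx.iter().count());
        Worker { tx: Some(tx), handle: Some(handle) }
    }

    fn send(&self, msg: &str) {
        if let Some(tx) = &self.tx {
            // Ignore the error if the worker has already stopped.
            let _ = tx.send(msg.to_string());
        }
    }

    // Close the channel and wait for the thread to finish.
    // Returns the number of messages processed, or 0 if already shut down.
    fn shutdown(&mut self) -> usize {
        self.tx.take(); // dropping the sender ends the worker's loop
        match self.handle.take() {
            Some(h) => h.join().unwrap_or(0),
            None => 0,
        }
    }
}

impl Drop for Worker {
    fn drop(&mut self) {
        // This is skipped entirely if the process ends via std::process::exit.
        self.shutdown();
    }
}

fn main() {
    let mut worker = Worker::start();
    worker.send("event");
    let processed = worker.shutdown();
    println!("worker processed {} message(s)", processed);
}
```

With this shape, no worker thread is left running once `shutdown` returns, so nothing can be mid-initialization while the process terminates.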
Sentry-rust occasionally causes a segmentation fault when our program exits. Here is a minimal code example to reproduce it.
To reproduce
Create a bin project with `cargo new`.
`Cargo.toml`:
`src/main.rs`:
Build the project with `cargo build --release` (debug is also fine), and run the binary repeatedly:
My environment is:
Some findings:
- A DSN such as "https://key@sentry.io/42" must be passed to `sentry::init()`; otherwise sentry-rust is disabled and nothing happens. I haven't tried using a real Sentry server yet. Probably one condition for triggering this is a fake server URL, or a server that is down.
- `std::process::exit(0)` is needed in the example code to see the segfault. But in our real code, where we first noticed this issue, the segfault can still happen without the `std::process::exit(0)`. Our real code does some real work before exiting, so the time before exit differs from the example code. I guess the `std::process::exit(0)` is not a cause of the segfault, but just a condition that increases the chance of it happening.
- `std::process::exit(0)` is needed to trigger the segfault, see comment below.