neondatabase / neon

Neon: Serverless Postgres. We separated storage and compute to offer autoscaling, code-like database branching, and scale to zero.
https://neon.tech
Apache License 2.0

storcon: `spawn_heartbeat_driver` panic on shutdown (flakiness in test `test_timeline_detach_ancestor_interrupted_by_deletion`) #8766

Closed: problame closed this issue 3 weeks ago

problame commented 3 weeks ago

https://neon-github-public-dev.s3.amazonaws.com/reports/pr-8462/10455319121/index.html#suites/a1c2be32556270764423c495fad75d47/4c1334ccbff664e5/

2024-08-19T14:57:32.101715Z  INFO Terminating on signal
2024-08-19T14:57:32.101950Z  INFO Joined HTTP server task
2024-08-19T14:57:32.101954Z  INFO Shutting down: cancelling and waiting for in-flight reconciles
2024-08-19T14:57:32.101959Z  INFO Shutting down: processing results from previously in-flight reconciles
2024-08-19T14:57:32.101975Z  INFO Shutting down: cancelling and waiting for background tasks to exit
2024-08-19T14:57:32.102041Z ERROR spawn_heartbeat_driver:panic{thread=main location=/__w/neon/neon/storage_controller/src/heartbeater.rs:92:24}: called `Result::unwrap()` on an `Err` value: RecvError(())

Stack backtrace:
   0: utils::logging::tracing_panic_hook
             at /__w/neon/neon/libs/utils/src/logging.rs:206:21
   1: <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/alloc/src/boxed.rs:2077:9
   2: storage_controller::main::{{closure}}
             at /__w/neon/neon/storage_controller/src/main.rs:214:9
   3: <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/alloc/src/boxed.rs:2077:9
   4: std::panicking::rust_panic_with_hook
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/std/src/panicking.rs:799:13
   5: std::panicking::begin_panic_handler::{{closure}}
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/std/src/panicking.rs:664:13
   6: std::sys_common::backtrace::__rust_end_short_backtrace
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/std/src/sys_common/backtrace.rs:171:18
   7: rust_begin_unwind
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/std/src/panicking.rs:652:5
   8: core::panicking::panic_fmt
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/core/src/panicking.rs:72:14
   9: core::result::unwrap_failed
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/core/src/result.rs:1679:5
  10: core::result::Result<T,E>::unwrap
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/core/src/result.rs:1102:23
  11: storage_controller::heartbeater::Heartbeater::heartbeat::{{closure}}
             at /__w/neon/neon/storage_controller/src/heartbeater.rs:92:9
  12: storage_controller::service::Service::spawn_heartbeat_driver::{{closure}}::{{closure}}
             at /__w/neon/neon/storage_controller/src/service.rs:927:57
  13: <tracing::instrument::Instrumented<T> as core::future::future::Future>::poll
             at /__w/neon/neon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tracing-0.1.37/src/instrument.rs:272:9
  14: storage_controller::service::Service::spawn_heartbeat_driver::{{closure}}
             at /__w/neon/neon/storage_controller/src/service.rs:909:5
  15: storage_controller::service::Service::spawn::{{closure}}::{{closure}}
             at /__w/neon/neon/storage_controller/src/service.rs:1426:47
  16: tokio::runtime::task::core::Core<T,S>::poll::{{closure}}
             at /__w/neon/neon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.37.0/src/runtime/task/core.rs:328:17
  17: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
             at /__w/neon/neon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.37.0/src/loom/std/unsafe_cell.rs:16:9
  18: tokio::runtime::task::core::Core<T,S>::poll
             at /__w/neon/neon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.37.0/src/runtime/task/core.rs:317:30
  19: tokio::runtime::task::harness::poll_future::{{closure}}
             at /__w/neon/neon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.37.0/src/runtime/task/harness.rs:485:19
  20: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/core/src/panic/unwind_safe.rs:272:9
  21: std::panicking::try::do_call
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/std/src/panicking.rs:559:40
  22: std::panicking::try
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/std/src/panicking.rs:523:19
  23: std::panic::catch_unwind
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/std/src/panic.rs:149:14
  24: tokio::runtime::task::harness::poll_future
             at /__w/neon/neon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.37.0/src/runtime/task/harness.rs:473:18
  25: tokio::runtime::task::harness::Harness<T,S>::poll_inner
             at /__w/neon/neon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.37.0/src/runtime/task/harness.rs:208:27
  26: tokio::runtime::task::harness::Harness<T,S>::poll
             at /__w/neon/neon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.37.0/src/runtime/task/harness.rs:153:15
  27: tokio::runtime::task::raw::poll
             at /__w/neon/neon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.37.0/src/runtime/task/raw.rs:271:5
  28: tokio::runtime::task::LocalNotified<S>::run
             at /__w/neon/neon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.37.0/src/runtime/task/mod.rs:427:9
  29: tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}}::{{closure}}
             at /__w/neon/neon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.37.0/src/runtime/scheduler/current_thread/mod.rs:700:30
  30: tokio::runtime::coop::with_budget
             at /__w/neon/neon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.37.0/src/runtime/coop.rs:107:5
  31: tokio::runtime::coop::budget
             at /__w/neon/neon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.37.0/src/runtime/coop.rs:73:5
  32: tokio::runtime::scheduler::current_thread::Context::run_task::{{closure}}
             at /__w/neon/neon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.37.0/src/runtime/scheduler/current_thread/mod.rs:343:43
  33: tokio::runtime::scheduler::current_thread::Context::enter
             at /__w/neon/neon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.37.0/src/runtime/scheduler/current_thread/mod.rs:404:19
  34: tokio::runtime::scheduler::current_thread::Context::run_task
             at /__w/neon/neon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.37.0/src/runtime/scheduler/current_thread/mod.rs:343:23
  35: tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}}
             at /__w/neon/neon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.37.0/src/runtime/scheduler/current_thread/mod.rs:699:43
  36: tokio::runtime::scheduler::current_thread::CoreGuard::enter::{{closure}}
             at /__w/neon/neon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.37.0/src/runtime/scheduler/current_thread/mod.rs:737:68
  37: tokio::runtime::context::scoped::Scoped<T>::set
             at /__w/neon/neon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.37.0/src/runtime/context/scoped.rs:40:9
  38: tokio::runtime::context::set_scheduler::{{closure}}
             at /__w/neon/neon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.37.0/src/runtime/context.rs:176:26
  39: std::thread::local::LocalKey<T>::try_with
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/std/src/thread/local.rs:283:12
  40: std::thread::local::LocalKey<T>::with
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/std/src/thread/local.rs:260:9
  41: tokio::runtime::context::set_scheduler
             at /__w/neon/neon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.37.0/src/runtime/context.rs:176:17
  42: tokio::runtime::scheduler::current_thread::CoreGuard::enter
             at /__w/neon/neon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.37.0/src/runtime/scheduler/current_thread/mod.rs:737:27
  43: tokio::runtime::scheduler::current_thread::CoreGuard::block_on
             at /__w/neon/neon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.37.0/src/runtime/scheduler/current_thread/mod.rs:646:19
  44: tokio::runtime::scheduler::current_thread::CurrentThread::block_on::{{closure}}
             at /__w/neon/neon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.37.0/src/runtime/scheduler/current_thread/mod.rs:175:28
  45: tokio::runtime::context::runtime::enter_runtime
             at /__w/neon/neon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.37.0/src/runtime/context/runtime.rs:65:16
  46: tokio::runtime::scheduler::current_thread::CurrentThread::block_on
             at /__w/neon/neon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.37.0/src/runtime/scheduler/current_thread/mod.rs:167:9
  47: tokio::runtime::runtime::Runtime::block_on
             at /__w/neon/neon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.37.0/src/runtime/runtime.rs:349:47
  48: storage_controller::main
             at /__w/neon/neon/storage_controller/src/main.rs:219:5
  49: core::ops::function::FnOnce::call_once
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/core/src/ops/function.rs:250:5
  50: std::sys_common::backtrace::__rust_begin_short_backtrace
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/std/src/sys_common/backtrace.rs:155:18
  51: std::rt::lang_start::{{closure}}
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/std/src/rt.rs:159:18
  52: core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/core/src/ops/function.rs:284:13
  53: std::panicking::try::do_call
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/std/src/panicking.rs:559:40
  54: std::panicking::try
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/std/src/panicking.rs:523:19
  55: std::panic::catch_unwind
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/std/src/panic.rs:149:14
  56: std::rt::lang_start_internal::{{closure}}
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/std/src/rt.rs:141:48
  57: std::panicking::try::do_call
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/std/src/panicking.rs:559:40
  58: std::panicking::try
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/std/src/panicking.rs:523:19
  59: std::panic::catch_unwind
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/std/src/panic.rs:149:14
  60: std::rt::lang_start_internal
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/std/src/rt.rs:141:20
  61: main
  62: __libc_start_main
  63: _start

koivunej commented 3 weeks ago

This is not related to the test; it's a matter of handling shutdown within the heartbeats.
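
To illustrate the kind of shutdown handling meant here, below is a minimal sketch, not the actual storage controller code. The backtrace shows an `unwrap()` on a `RecvError` inside `Heartbeater::heartbeat`, which is what a receiver yields once the sending side has gone away. The sketch assumes a request/reply pattern and swaps in `std::sync::mpsc` channels and threads for the async channels used in the real code, so that it has no dependencies. The names `HeartbeatError`, `HeartbeatRequest`, and `spawn_worker` are hypothetical. The idea is to turn a closed channel into an explicit "cancelled" error that the caller can treat as a normal shutdown, instead of panicking.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical error type: lets callers tell shutdown apart from a real failure.
#[derive(Debug, PartialEq)]
pub enum HeartbeatError {
    Cancelled,
}

/// Hypothetical request: carries the channel on which the worker replies.
pub struct HeartbeatRequest {
    pub reply: mpsc::Sender<u32>,
}

/// Send one request to the worker and wait for its reply.
/// A closed channel on either leg is reported as `Cancelled` rather than unwrapped.
pub fn heartbeat(requests: &mpsc::Sender<HeartbeatRequest>) -> Result<u32, HeartbeatError> {
    let (reply_tx, reply_rx) = mpsc::channel();
    // The worker has already exited: its receiver is gone, so `send` fails.
    requests
        .send(HeartbeatRequest { reply: reply_tx })
        .map_err(|_| HeartbeatError::Cancelled)?;
    // The worker dropped our request without answering: `recv` fails.
    reply_rx.recv().map_err(|_| HeartbeatError::Cancelled)
}

/// Stand-in worker: answers `max_replies` requests, then simulates shutdown by
/// dropping the next request unanswered and exiting.
pub fn spawn_worker(
    max_replies: usize,
) -> (mpsc::Sender<HeartbeatRequest>, thread::JoinHandle<()>) {
    let (tx, rx) = mpsc::channel::<HeartbeatRequest>();
    let handle = thread::spawn(move || {
        for (served, req) in rx.iter().enumerate() {
            if served >= max_replies {
                // `req` (and its reply sender) is dropped here without a reply.
                break;
            }
            let _ = req.reply.send(served as u32);
        }
    });
    (tx, handle)
}
```

With this shape, a driver loop can match on `Err(HeartbeatError::Cancelled)` and return quietly during shutdown, while any other error variants added later would still surface as genuine failures.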