xline-kv / Xline

A geo-distributed KV store for metadata management
https://xline.cloud
Apache License 2.0
596 stars 74 forks source link

[Bug]: Xline server frozen after panic from put invalid lease #640

Closed leowmjw closed 8 months ago

leowmjw commented 8 months ago

Description about the bug

  1. Xline server does not handle gracefully kv put with invalid lease id (id not existing)
  2. It should surface the error that lease does not exist when put the value and server should NOT be frozen
  3. To reproduce:
    • etcdctl put foo1 bar60 --lease=09238d2c7bdf2300 # leaseID not exist OK # Should surface error
    • etcdctl get foo1 --prefix # hangs {"level":"warn","ts":"2024-01-29T23:04:55.729015+0800","logger":"etcd-client","caller":"v3/retry_interceptor.go:65","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x140001fe960/127.0.0.1:2379","method":"/etcdserverpb.KV/Range","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"} Error: context deadline exceeded

Version

0.6.1 (Default)

Relevant log output

2024-01-29T15:04:48.208527Z DEBUG xline::storage::kv_store: Execute PutRequest(PutRequest { key: [102, 111, 111, 49], value: [98, 97, 114, 54, 48], lease: 658525192722522880, prev_kv: false, ignore_value: false, ignore_lease: false })

2024-01-29T15:04:48.208588Z DEBUG curp::server::cmd_worker: 14801266300423768355 cmd(17312231596942094340#11198891629512189226) is speculatively executed, exe status: true

2024-01-29T15:04:48.208671Z DEBUG xline::storage::kv_store: After Sync PutRequest(PutRequest { key: [102, 111, 111, 49], value: [98, 97, 114, 54, 48], lease: 658525192722522880, prev_kv: false, ignore_value: false, ignore_lease: false }) with revision 2

thread 'tokio-runtime-worker' panicked at 'unexpected error from lease Attach: lease 658525192722522880 not found', /Users/leow/RUST/Xline/xline/src/storage/kv_store.rs:8052024-01-29T15:04:48.208699Z DEBUG put: curp::client::unary: fast round for cmd(17312231596942094340#11198891629512189226) succeed
:37
stack backtrace:
2024-01-29T15:04:48.208914Z DEBUG Connection{peer=Server}: h2::codec::framed_write: send frame=Headers { stream_id: StreamId(1), flags: (0x4: END_HEADERS) }
2024-01-29T15:04:48.208981Z DEBUG Connection{peer=Server}: h2::codec::framed_write: send frame=Data { stream_id: StreamId(1) }
2024-01-29T15:04:48.209026Z DEBUG Connection{peer=Server}: h2::codec::framed_write: send frame=Headers { stream_id: StreamId(1), flags: (0x5: END_HEADERS | END_STREAM) }
2024-01-29T15:04:48.209317Z DEBUG Connection{peer=Server}: h2::codec::framed_read: received frame=WindowUpdate { stream_id: StreamId(0), size_increment: 31 }
2024-01-29T15:04:48.209378Z DEBUG Connection{peer=Server}: h2::codec::framed_read: received frame=Ping { ack: false, payload: [2, 4, 16, 16, 9, 14, 7, 7] }
2024-01-29T15:04:48.209421Z DEBUG Connection{peer=Server}: h2::codec::framed_write: send frame=Ping { ack: true, payload: [2, 4, 16, 16, 9, 14, 7, 7] }
2024-01-29T15:04:48.209862Z DEBUG Connection{peer=Server}: h2::proto::connection: Connection::poll; IO error error=ConnectionReset
2024-01-29T15:04:48.209917Z DEBUG hyper::server::server::new_svc: connection error: connection error: connection reset
   0: rust_begin_unwind
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/std/src/panicking.rs:593:5
   1: core::panicking::panic_fmt
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/core/src/panicking.rs:67:14
   2: xline::storage::kv_store::KvStore<DB>::sync_put_request::{{closure}}
             at /Users/leow/RUST/Xline/xline/src/storage/kv_store.rs:805:37
   3: core::result::Result<T,E>::unwrap_or_else
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/core/src/result.rs:1428:23
   4: xline::storage::kv_store::KvStore<DB>::sync_put_request
             at /Users/leow/RUST/Xline/xline/src/storage/kv_store.rs:804:13
   5: xline::storage::kv_store::KvStore<DB>::sync_request::{{closure}}
             at /Users/leow/RUST/Xline/xline/src/storage/kv_store.rs:683:52
   6: xline::storage::kv_store::KvStore<DB>::after_sync::{{closure}}
             at /Users/leow/RUST/Xline/xline/src/storage/kv_store.rs:207:14
   7: <xline::server::command::CommandExecutor<S> as curp_external_api::cmd::CommandExecutor<xlineapi::command::Command>>::after_sync::{{closure}}
             at /Users/leow/RUST/Xline/xline/src/server/command.rs:354:81
   8: <core::pin::Pin<P> as core::future::future::Future>::poll
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/core/src/future/future.rs:125:9
   9: curp::server::cmd_worker::worker_as::{{closure}}
             at /Users/leow/RUST/Xline/curp/src/server/cmd_worker/mod.rs:144:73
  10: curp::server::cmd_worker::cmd_worker::{{closure}}
             at /Users/leow/RUST/Xline/curp/src/server/cmd_worker/mod.rs:73:71
  11: tokio::runtime::task::core::Core<T,S>::poll::{{closure}}
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/core.rs:328:17
  12: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/loom/std/unsafe_cell.rs:16:9
  13: tokio::runtime::task::core::Core<T,S>::poll
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/core.rs:317:13
  14: tokio::runtime::task::harness::poll_future::{{closure}}
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/harness.rs:485:19
  15: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/core/src/panic/unwind_safe.rs:271:9
  16: std::panicking::try::do_call
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/std/src/panicking.rs:500:40
  17: ___rust_try
  18: std::panicking::try
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/std/src/panicking.rs:464:19
  19: std::panic::catch_unwind
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/std/src/panic.rs:142:14
  20: tokio::runtime::task::harness::poll_future
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/harness.rs:473:18
  21: tokio::runtime::task::harness::Harness<T,S>::poll_inner
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/harness.rs:208:27
  22: tokio::runtime::task::harness::Harness<T,S>::poll
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/harness.rs:153:15
  23: tokio::runtime::task::raw::poll
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/raw.rs:271:5
  24: tokio::runtime::task::raw::RawTask::poll
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/raw.rs:201:18
  25: tokio::runtime::task::LocalNotified<S>::run
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/mod.rs:416:9
  26: tokio::runtime::scheduler::multi_thread::worker::Context::run_task::{{closure}}
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/multi_thread/worker.rs:639:17
  27: tokio::runtime::coop::with_budget
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/coop.rs:107:5
  28: tokio::runtime::coop::budget
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/coop.rs:73:5
  29: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/multi_thread/worker.rs:575:9
  30: tokio::runtime::scheduler::multi_thread::worker::Context::run
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/multi_thread/worker.rs:538:24
  31: tokio::runtime::scheduler::multi_thread::worker::run::{{closure}}::{{closure}}
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/multi_thread/worker.rs:491:21
  32: tokio::runtime::context::scoped::Scoped<T>::set
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/context/scoped.rs:40:9
  33: tokio::runtime::context::set_scheduler::{{closure}}
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/context.rs:176:26
  34: std::thread::local::LocalKey<T>::try_with
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/std/src/thread/local.rs:270:16
  35: std::thread::local::LocalKey<T>::with
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/std/src/thread/local.rs:246:9
  36: tokio::runtime::context::set_scheduler
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/context.rs:176:9
  37: tokio::runtime::scheduler::multi_thread::worker::run::{{closure}}
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/multi_thread/worker.rs:486:9
  38: tokio::runtime::context::runtime::enter_runtime
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/context/runtime.rs:65:16
  39: tokio::runtime::scheduler::multi_thread::worker::run
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/multi_thread/worker.rs:478:5
  40: tokio::runtime::scheduler::multi_thread::worker::Launch::launch::{{closure}}
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/multi_thread/worker.rs:447:45
  41: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/blocking/task.rs:42:21
  42: tokio::runtime::task::core::Core<T,S>::poll::{{closure}}
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/core.rs:328:17
  43: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/loom/std/unsafe_cell.rs:16:9
  44: tokio::runtime::task::core::Core<T,S>::poll
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/core.rs:317:13
  45: tokio::runtime::task::harness::poll_future::{{closure}}
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/harness.rs:485:19
  46: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/core/src/panic/unwind_safe.rs:271:9
  47: std::panicking::try::do_call
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/std/src/panicking.rs:500:40
  48: ___rust_try
  49: std::panicking::try
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/std/src/panicking.rs:464:19
  50: std::panic::catch_unwind
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/std/src/panic.rs:142:14
  51: tokio::runtime::task::harness::poll_future
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/harness.rs:473:18
  52: tokio::runtime::task::harness::Harness<T,S>::poll_inner
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/harness.rs:208:27
  53: tokio::runtime::task::harness::Harness<T,S>::poll
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/harness.rs:153:15
  54: tokio::runtime::task::raw::poll
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/raw.rs:271:5
  55: tokio::runtime::task::raw::RawTask::poll
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/raw.rs:201:18
  56: tokio::runtime::task::UnownedTask<S>::run
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/task/mod.rs:453:9
  57: tokio::runtime::blocking::pool::Task::run
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/blocking/pool.rs:159:9
  58: tokio::runtime::blocking::pool::Inner::run
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/blocking/pool.rs:513:17
  59: tokio::runtime::blocking::pool::Spawner::spawn_thread::{{closure}}
             at /Users/leow/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/blocking/pool.rs:471:13
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
2024-01-29T15:04:50.730155Z DEBUG h2::codec::framed_write: send frame=Settings { flags: (0x0), initial_window_size: 1048576, max_frame_size: 16384, max_header_list_size: 16777216 }
2024-01-29T15:04:50.730356Z DEBUG Connection{peer=Server}: h2::codec::framed_read: received frame=Settings { flags: (0x0) }
2024-01-29T15:04:50.730410Z DEBUG Connection{peer=Server}: h2::codec::framed_write: send frame=Settings { flags: (0x1: ACK) }
2024-01-29T15:04:50.730457Z DEBUG Connection{peer=Server}: h2::codec::framed_read: received frame=Settings { flags: (0x1: ACK) }
2024-01-29T15:04:50.730490Z DEBUG Connection{peer=Server}: h2::proto::settings: received settings ACK; applying Settings { flags: (0x0), initial_window_size: 1048576, max_frame_size: 16384, max_header_list_size: 16777216 }
2024-01-29T15:04:50.730878Z DEBUG Connection{peer=Server}: h2::codec::framed_read: received frame=Headers { stream_id: StreamId(1), flags: (0x4: END_HEADERS) }
2024-01-29T15:04:50.730971Z DEBUG Connection{peer=Server}: h2::codec::framed_read: received frame=Data { stream_id: StreamId(1), flags: (0x1: END_STREAM) }
2024-01-29T15:04:50.731022Z DEBUG Connection{peer=Server}: h2::codec::framed_write: send frame=WindowUpdate { stream_id: StreamId(0), size_increment: 983041 }
2024-01-29T15:04:50.731267Z DEBUG range: xline::server::kv_server: Receive grpc request: RangeRequest { key: [102, 111, 111, 49], range_end: [102, 111, 111, 50], limit: 0, revision: 0, sort_order: None, sort_target: Key, serializable: false, keys_only: false, count_only: false, min_mod_revision: 0, max_mod_revision: 0, min_create_revision: 0, max_create_revision: 0 }
2024-01-29T15:04:50.733971Z DEBUG range: curp::client::unary: fetch local cluster FetchClusterResponse { leader_id: Some(14801266300423768355), term: 1, cluster_id: 11958423767574201339, members: [Member { id: 14801266300423768355, name: "node1", addrs: ["127.0.0.1:2379"], is_learner: false }], cluster_version: 0 }
2024-01-29T15:04:50.734131Z DEBUG range: xline::server::kv_server: Range wait for command ids id_set=IdSet { inflight_ids: [184526593445901294] }
2024-01-29T15:04:52.736840Z DEBUG range: curp::client::unary: fetch local cluster FetchClusterResponse { leader_id: Some(14801266300423768355), term: 1, cluster_id: 11958423767574201339, members: [Member { id: 14801266300423768355, name: "node1", addrs: ["127.0.0.1:2379"], is_learner: false }], cluster_version: 0 }
2024-01-29T15:04:52.737478Z DEBUG range: xline::server::kv_server: Range wait for command ids id_set=IdSet { inflight_ids: [184526593445901294] }
...

Code of Conduct

Phoenix500526 commented 8 months ago

Thanks for your feedback, we'll handle it ASAP.

themanforfree commented 8 months ago

Fixed in #646