tikv / tikv

Distributed transactional key-value database, originally created to complement TiDB
https://tikv.org
Apache License 2.0

v6.5.4 tikv panic for "meta corrupted: no region for 51902 7A7480000000000000FFA45F728000000000FF00192A0000000000FA " #15370

Closed seiya-annie closed 1 year ago

seiya-annie commented 1 year ago

Bug Report

What version of TiKV are you using?

v6.5.4

What operating system and CPU are you using?

Steps to reproduce

Async commit scenario with network-error (neterr) fault injection; workload: long-fork

      - name: TIKV_CONFIG
        value: |-
          [storage]
          reserve-space = 0
          enable-async-apply-prewrite = true
          [coprocessor]
          region-max-keys = 5
          region-split-keys = 3
          [pessimistic-txn]
          pipelined = true
          in-memory = true
          [raftstore]
          pd-heartbeat-tick-interval='5s'
          raft-store-max-leader-lease='50ms'
          raft-base-tick-interval='100ms'
          raft-heartbeat-ticks=3
          raft-election-timeout-ticks=10
      - name: TIDB_CONFIG
        value: ""

      - name: TEST_ARGS
        value: --concurrency=3n
          --tarball-url=http://minio.pingcap.net:9000/tp-team/tests/jepsen/tidb-v6.5.4-pre.tar.gz
          --txn-mode=optimistic --force-reinstall=true
          --nemesis=schedules,partition-pd-leader,partition-half,partition-ring
          --os=image --time-limit=300 --version=master --workload=long-fork
          --init-txn-sql='set @@tidb_enable_async_commit = 0, @@tidb_enable_1pc
          = 0; set @@tidb_enable_async_commit = 1, @@tidb_enable_1pc = 0; set
          @@tidb_enable_async_commit = 1, @@tidb_enable_1pc = 1' --init-sql='set
          @@tidb_enable_mutation_checker=1, @@tidb_txn_assertion_level=strict,
          @@tidb_constraint_check_in_place_pessimistic=off'
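For readers reproducing this, the `[raftstore]` values above can be turned into wall-clock timings. The sketch below assumes the tick-count settings are multiples of `raft-base-tick-interval`, as in a tick-driven Raft implementation; the helper name `to_ms` is hypothetical, and any randomization TiKV applies to the election timeout is not modeled.

```python
import re

def to_ms(duration: str) -> int:
    """Convert a duration string such as '100ms' or '5s' to milliseconds."""
    match = re.fullmatch(r"(\d+)(ms|s)", duration)
    if match is None:
        raise ValueError(f"unsupported duration: {duration!r}")
    value, unit = int(match.group(1)), match.group(2)
    return value * (1000 if unit == "s" else 1)

# Values copied from the TIKV_CONFIG block in this report.
base_tick_ms = to_ms("100ms")       # raft-base-tick-interval
heartbeat_ticks = 3                 # raft-heartbeat-ticks
election_timeout_ticks = 10         # raft-election-timeout-ticks
leader_lease_ms = to_ms("50ms")     # raft-store-max-leader-lease
pd_heartbeat_ms = to_ms("5s")       # pd-heartbeat-tick-interval

heartbeat_interval_ms = heartbeat_ticks * base_tick_ms
election_timeout_ms = election_timeout_ticks * base_tick_ms

print("heartbeat interval (ms):", heartbeat_interval_ms)
print("base election timeout (ms):", election_timeout_ms)
print("max leader lease (ms):", leader_lease_ms)
```

Under that assumption the heartbeat interval is 300 ms and the base election timeout is 1 s, so the network partitions injected by the nemesis can trigger leader changes quickly.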

What did you expect?

What happened?

[2023/08/20 01:43:14.538 +00:00] [FATAL] [lib.rs:509] ["meta corrupted: no region for 51902 7A7480000000000000FFA45F728000000000FF00192A0000000000FA when creating 66139 region_id: 66139 from_peer { id: 66142 store_id: 5 } to_peer { id: 66562 store_id: 1 role: Learner } message { msg_type: MsgHeartbeat to: 66562 from: 66142 term: 6 } region_epoch { conf_ver: 1640 version: 6684 } start_key: 7480000000000000FF945F728000000000FF0108CA0000000000FA end_key: 7480000000000000FF945F728000000000FF0108D20000000000FA"] [backtrace="   0: tikv_util::set_panic_hook::{{closure}}\n             at /home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tikv/components/tikv_util/src/lib.rs:508:18\n   1: <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call\n             at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/boxed.rs:2032:9\n      std::panicking::rust_panic_with_hook\n             at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:692:13\n   2: std::panicking::begin_panic_handler::{{closure}}\n             at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:579:13\n   3: std::sys_common::backtrace::__rust_end_short_backtrace\n             at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys_common/backtrace.rs:137:18\n   4: rust_begin_unwind\n             at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:575:5\n   5: core::panicking::panic_fmt\n             at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/panicking.rs:65:14\n   6: raftstore::store::fsm::store::StoreFsmDelegate<EK,ER,T>::maybe_create_peer_internal\n             at 
/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tikv/components/raftstore/src/store/fsm/store.rs:2192:25\n      raftstore::store::fsm::store::StoreFsmDelegate<EK,ER,T>::maybe_create_peer\n             at /home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tikv/components/raftstore/src/store/fsm/store.rs:2117:19\n   7: raftstore::store::fsm::store::StoreFsmDelegate<EK,ER,T>::on_raft_message\n             at /home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tikv/components/raftstore/src/store/fsm/store.rs:2047:20\n      raftstore::store::fsm::store::StoreFsmDelegate<EK,ER,T>::handle_msgs\n             at /home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tikv/components/raftstore/src/store/fsm/store.rs:746:37\n   8: <raftstore::store::fsm::store::RaftPoller<EK,ER,T> as batch_system::batch::PollHandler<raftstore::store::fsm::peer::PeerFsm<EK,ER>,raftstore::store::fsm::store::StoreFsm<EK>>>::handle_control\n             at /home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tikv/components/raftstore/src/store/fsm/store.rs:919:9\n      batch_system::batch::Poller<N,C,Handler>::poll\n             at /home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tikv/components/batch-system/src/batch.rs:419:27\n   9: raftstore::store::worker::refresh_config::PoolController<N,C,H>::increase_by::{{closure}}\n             at /home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tikv/components/raftstore/src/store/worker/refresh_config.rs:78:21\n      <std::thread::Builder as tikv_util::sys::thread::StdThreadBuildWrapper>::spawn_wrapper::{{closure}}\n             at /home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tikv/components/tikv_util/src/sys/thread.rs:415:23\n      std::sys_common::backtrace::__rust_begin_short_backtrace\n             at 
/rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys_common/backtrace.rs:121:18\n  10: std::thread::Builder::spawn_unchecked_::{{closure}}::{{closure}}\n             at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/mod.rs:551:17\n      <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once\n             at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/panic/unwind_safe.rs:271:9\n      std::panicking::try::do_call\n             at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:483:40\n      std::panicking::try\n             at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:447:19\n      std::panic::catch_unwind\n             at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:137:14\n      std::thread::Builder::spawn_unchecked_::{{closure}}\n             at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/mod.rs:550:30\n      core::ops::function::FnOnce::call_once{{vtable.shim}}\n             at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:513:5\n  11: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once\n             at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/boxed.rs:2000:9\n      <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once\n             at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/boxed.rs:2000:9\n      std::sys::unix::thread::Thread::new::thread_start\n             at 
/rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys/unix/thread.rs:108:17\n  12: start_thread\n  13: clone\n"] [location=/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tikv/components/raftstore/src/store/fsm/store.rs:2192] [thread_name=raftstore-1-1]
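The hex keys in the panic message can be decoded to see which table and row they refer to. The following is a minimal sketch, not TiKV code: it assumes the keys are TiDB row keys (`t` + table ID + `_r` + row handle) in memcomparable encoding, where each 8-byte group is followed by a marker byte equal to `0xFF` minus the number of padding bytes, and that a leading `z` (`0x7A`) is TiKV's data-key prefix. The function names are hypothetical.

```python
SIGN_MASK = 1 << 63

def decode_memcomparable(data: bytes) -> bytes:
    """Strip the padding from 9-byte groups (8 data bytes + 1 marker byte)."""
    if len(data) % 9 != 0:
        raise ValueError("encoded length must be a multiple of 9")
    out = bytearray()
    for i in range(0, len(data), 9):
        group, marker = data[i:i + 8], data[i + 8]
        pad = 0xFF - marker
        if pad > 8:
            raise ValueError(f"invalid marker byte: {marker:#x}")
        out += group[:8 - pad]
        if pad:  # a padded group is always the last one
            break
    return bytes(out)

def decode_i64(raw: bytes) -> int:
    """Decode a big-endian int64 whose sign bit was flipped for ordering."""
    value = int.from_bytes(raw, "big") ^ SIGN_MASK
    return value - (1 << 64) if value >= SIGN_MASK else value

def decode_row_key(hex_key: str) -> tuple[int, int]:
    """Return (table_id, row_handle) for an encoded TiDB row key."""
    data = bytes.fromhex(hex_key)
    if data[:1] == b"z" and len(data) % 9 == 1:
        data = data[1:]  # assumed TiKV data-key prefix
    raw = decode_memcomparable(data)
    if raw[:1] != b"t" or raw[9:11] != b"_r" or len(raw) != 19:
        raise ValueError("not a TiDB row key")
    return decode_i64(raw[1:9]), decode_i64(raw[11:19])

# Keys copied from the FATAL log line above.
print(decode_row_key("7A7480000000000000FFA45F728000000000FF00192A0000000000FA"))
print(decode_row_key("7480000000000000FF945F728000000000FF0108CA0000000000FA"))
print(decode_row_key("7480000000000000FF945F728000000000FF0108D20000000000FA"))
```

Under these assumptions, the key with no matching region decodes to table 164, row 6442, while the `start_key` and `end_key` of region 66139 decode to table 148, rows 67786 and 67794.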
seiya-annie commented 1 year ago

kv.log.gz

cfzjywxk commented 1 year ago

Duplicate of https://github.com/tikv/tikv/issues/13311