yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com
Other
8.91k stars 1.06k forks source link

New operation's hybrid time too low #6709

Closed amitanandaiyer closed 1 year ago

amitanandaiyer commented 3 years ago

Jira Link: DB-3973

 1 F20201218 22:49:05 ../../src/yb/tablet/mvcc.cc:398] T 6f048e1ca45f4dc4a90f5b8d037b42b2 P 45bedb8e80604419a75516a72fa8353a: T 6f048e1ca45f4dc4a90f5b8d037b42b2 P 45bedb8e80604419a75516a72fa8353a: Recent 32 MVCC operations:
  2 1. SetPropagatedSafeTimeOnFollower { ht: { physical: 1608331735773824 } }
  3 2. SetPropagatedSafeTimeOnFollower { ht: { physical: 1608331736220058 } }
  4 3. AddPending { provided_ht: { physical: 1608331736421545 } final_ht: { physical: 1608331736421545 } }
  5 4. SetPropagatedSafeTimeOnFollower { ht: { physical: 1608331736503719 } }
  6 5. Replicated { ht: { physical: 1608331736421545 } }
  7 6. SafeTimeForFollower { min_allowed: <min> deadline: <uninitialized> safe_time_with_source: { safe_time: { physical: 1608331736503719 } source: kPropagated } }
  8 7. AddPending { provided_ht: { physical: 1608331736671692 } final_ht: { physical: 1608331736671692 } }
  9 8. SetPropagatedSafeTimeOnFollower { ht: { physical: 1608331736671691 logical: 4095 } }
 10 9. SafeTimeForFollower { min_allowed: <min> deadline: <uninitialized> safe_time_with_source: { safe_time: { physical: 1608331736671691 logical: 4095 } source: kPropagated } }
 11 10. Replicated { ht: { physical: 1608331736671692 } }
 12 11. SetPropagatedSafeTimeOnFollower { ht: { physical: 1608331736709356 } }
 13 12. LastReplicatedHybridTime { last_replicated: { physical: 1608331736671692 } }
 14 13. AddPending { provided_ht: { physical: 1608331736958836 } final_ht: { physical: 1608331736958836 } }
 15 14. SetPropagatedSafeTimeOnFollower { ht: { physical: 1608331736958835 logical: 4095 } }
 16 15. Replicated { ht: { physical: 1608331736958836 } }
 17 16. SetPropagatedSafeTimeOnFollower { ht: { physical: 1608331737062933 } }
 18 17. LastReplicatedHybridTime { last_replicated: { physical: 1608331736958836 } }
 19 18. SetPropagatedSafeTimeOnFollower { ht: { physical: 1608331737608338 } }
 20 19. SetPropagatedSafeTimeOnFollower { ht: { physical: 1608331738123057 } }
 21 20. SafeTimeForFollower { min_allowed: <min> deadline: <uninitialized> safe_time_with_source: { safe_time: { physical: 1608331738123057 } source: kPropagated } }
 22 21. AddPending { provided_ht: { physical: 1608331738548381 } final_ht: { physical: 1608331738548381 } }
 23 22. SetPropagatedSafeTimeOnFollower { ht: { physical: 1608331738548380 logical: 4095 } }
 24 23. SafeTimeForFollower { min_allowed: <min> deadline: <uninitialized> safe_time_with_source: { safe_time: { physical: 1608331738548380 logical: 4095 } source: kPropagated } }
 25 24. SafeTimeForFollower { min_allowed: <min> deadline: <uninitialized> safe_time_with_source: { safe_time: { physical: 1608331738548380 logical: 4095 } source: kPropagated } }
 26 25. SafeTimeForFollower { min_allowed: <min> deadline: <uninitialized> safe_time_with_source: { safe_time: { physical: 1608331738548380 logical: 4095 } source: kPropagated } }
 27 26. SafeTimeForFollower { min_allowed: <min> deadline: <uninitialized> safe_time_with_source: { safe_time: { physical: 1608331738548380 logical: 4095 } source: kPropagated } }
 28 27. SafeTimeForFollower { min_allowed: <min> deadline: <uninitialized> safe_time_with_source: { safe_time: { physical: 1608331738548380 logical: 4095 } source: kPropagated } }
 29 28. Replicated { ht: { physical: 1608331738548381 } }
 30 29. SetPropagatedSafeTimeOnFollower { ht: { physical: 1608331739862483 } }
 31 30. SafeTimeForFollower { min_allowed: <min> deadline: <uninitialized> safe_time_with_source: { safe_time: { physical: 1608331739862483 } source: kPropagated } }
 32 31. LastReplicatedHybridTime { last_replicated: { physical: 1608331738548381 } }
 33 32. SafeTimeForFollower { min_allowed: <min> deadline: <uninitialized> safe_time_with_source: { safe_time: { physical: 1608331739862483 } source: kPropagated } }
 34 New operation's hybrid time too low: { physical: 1608331738640210 }
 35   max_safe_time_returned_with_lease_={ safe_time: <min> source: kUnknown }
 36   *ht < max_safe_time_returned_with_lease_.safe_time=0
 37   static_cast<int64_t>(ht->ToUint64() - max_safe_time_returned_with_lease_.safe_time.ToUint64())=6587726801470300160
 38   ht->PhysicalDiff(max_safe_time_returned_with_lease_.safe_time)=1608331738640210
 39
 40   max_safe_time_returned_without_lease_={ safe_time: <min> source: kUnknown }
 41   *ht < max_safe_time_returned_without_lease_.safe_time=0
 42   static_cast<int64_t>(ht->ToUint64() - max_safe_time_returned_without_lease_.safe_time.ToUint64())=6587726801470300160
 43   ht->PhysicalDiff(max_safe_time_returned_without_lease_.safe_time)=1608331738640210
 44
 45   max_safe_time_returned_for_follower_={ safe_time: { physical: 1608331739862483 } source: kPropagated }
 46   *ht < max_safe_time_returned_for_follower_.safe_time=1
 47   static_cast<int64_t>(ht->ToUint64() - max_safe_time_returned_for_follower_.safe_time.ToUint64())=-5006430208
 48   ht->PhysicalDiff(max_safe_time_returned_for_follower_.safe_time)=-1222273
 49
 50   (SafeTimeWithSource{last_replicated_, SafeTimeSource::kUnknown})={ safe_time: { physical: 1608331738548381 } source: kUnknown }
 51   *ht < (SafeTimeWithSource{last_replicated_, SafeTimeSource::kUnknown}).safe_time=0
 52   static_cast<int64_t>(ht->ToUint64() - (SafeTimeWithSource{last_replicated_, SafeTimeSource::kUnknown}).safe_time.ToUint64())=376131584
 53   ht->PhysicalDiff((SafeTimeWithSource{last_replicated_, SafeTimeSource::kUnknown}).safe_time)=91829
 54
 55   (SafeTimeWithSource{last_ht_in_queue, SafeTimeSource::kUnknown})={ safe_time: <min> source: kUnknown }
 56   *ht < (SafeTimeWithSource{last_ht_in_queue, SafeTimeSource::kUnknown}).safe_time=0
 57   static_cast<int64_t>(ht->ToUint64() - (SafeTimeWithSource{last_ht_in_queue, SafeTimeSource::kUnknown}).safe_time.ToUint64())=6587726801470300160
 58   ht->PhysicalDiff((SafeTimeWithSource{last_ht_in_queue, SafeTimeSource::kUnknown}).safe_time)=1608331738640210
 59
 60   is_follower_side=1
 61   queue_.size()=0
 62   queue_=[]
 63   aborted=[]
 64     @     0x7effb104deb3  yb::LogFatalHandlerSink::send(int, char const*, char const*, int, tm const*, char const*, unsigned long) (src/yb/util/logging.cc:474)
 65     @     0x7effb0222865
 66     @     0x7effb021fe39
 67     @     0x7effb0223528
 68     @     0x7effb9967451  yb::tablet::MvccManager::AddPending(yb::HybridTime*) (src/yb/tablet/mvcc.cc:398)
 69     @     0x7effb98eb71e  yb::tablet::TabletPeer::StartReplicaOperation(scoped_refptr<yb::consensus::ConsensusRound> const&, yb::HybridTime) (src/yb/tablet/tablet_peer.cc:1039)
 70     @     0x7effb9551291  yb::consensus::RaftConsensus::StartReplicaOperationUnlocked(shared_ptr<yb::consensus::ReplicateMsg> const&, yb::HybridTime) (src/yb/consensus/raft_consensus.cc:1460)
 71     @     0x7effb955a179  yb::consensus::RaftConsensus::EnqueuePreparesUnlocked(yb::consensus::ConsensusRequestPB const&, yb::consensus::RaftConsensus::LeaderRequest*, yb::consensus::ConsensusResponsePB*) (src/yb/consensus/raf    t_consensus.cc:1944)
 72     @     0x7effb9568454  yb::consensus::RaftConsensus::UpdateReplica(yb::consensus::ConsensusRequestPB*, yb::consensus::ConsensusResponsePB*) (src/yb/consensus/raft_consensus.cc:1850)
 73     @     0x7effb9569582  yb::consensus::RaftConsensus::Update(yb::consensus::ConsensusRequestPB*, yb::consensus::ConsensusResponsePB*, std::chrono::time_point<yb::CoarseMonoClock, std::chrono::duration<long, std::ratio<1l, 10    00000000l> > >) (src/yb/consensus/raft_consensus.cc:1406)
 74     @     0x7effba1b5fbf  yb::tserver::ConsensusServiceImpl::UpdateConsensus(yb::consensus::ConsensusRequestPB const*, yb::consensus::ConsensusResponsePB*, yb::rpc::RpcContext) (src/yb/tserver/tablet_service.cc:2225)
 75     @     0x7effb68f5654  yb::consensus::ConsensusServiceIf::Handle(shared_ptr<yb::rpc::InboundCall>) (src/yb/consensus/consensus.service.cc:100)
 76     @     0x7effb27d60e4  yb::rpc::ServicePoolImpl::Handle(shared_ptr<yb::rpc::InboundCall>) (src/yb/rpc/service_pool.cc:262)
 77     @     0x7effb2773964  yb::rpc::InboundCall::InboundCallTask::Run() (src/yb/rpc/inbound_call.cc:212)
 78     @     0x7effb27e92be  Execute (src/yb/rpc/thread_pool.cc:105)
 79     @     0x7effb27e49cc  operator()<, void> (gcc/5.5.0_4/include/c++/5.5.0/functional:600)
 80     @     0x7effb27e49cc  __call<void, 0ul> (gcc/5.5.0_4/include/c++/5.5.0/functional:1074)
 81     @     0x7effb27e49cc  operator()<, void> (gcc/5.5.0_4/include/c++/5.5.0/functional:1133)
 82     @     0x7effb27e49cc  _M_invoke (gcc/5.5.0_4/include/c++/5.5.0/functional:1871)
 83     @     0x7effb10dd625  std::function<void ()>::operator()() const (gcc/5.5.0_4/include/c++/5.5.0/functional:2267)
 84     @     0x7effb10dd625  yb::Thread::SuperviseThread(void*) (src/yb/util/thread.cc:771)
 85     @     0x7effabaa1693  start_thread (/tmp/glibc-20181130-26094-cs1x60/glibc-2.23/nptl/pthread_create.c:333)
 86     @     0x7effab7e341c  (unknown) (sysdeps/unix/sysv/linux/x86_64/clone.S:109)
 87     @ 0xffffffffffffffff
 88

and

1 F20201218 22:50:37 ../../src/yb/tablet/mvcc.cc:398] T 145c0b5a87b04ae39851ea4df52c275e P 214bb8fd837e4a56bf3719e7272015b0: T 145c0b5a87b04ae39851ea4df52c275e P 214bb8fd837e4a56bf3719e7272015b0: Recent 23 MVCC operations:
  2 1. SetLeaderOnlyMode { leader_only: 0 }
  3 2. SetPropagatedSafeTimeOnFollower { ht: { physical: 1608331826111903 } }
  4 3. SetPropagatedSafeTimeOnFollower { ht: { physical: 1608331827440729 } }
  5 4. SetPropagatedSafeTimeOnFollower { ht: { physical: 1608331829121521 } }
  6 5. SetPropagatedSafeTimeOnFollower { ht: { physical: 1608331830549001 } }
  7 6. AddPending { provided_ht: { physical: 1608331830725840 } final_ht: { physical: 1608331830725840 } }
  8 7. AddPending { provided_ht: { physical: 1608331830725859 } final_ht: { physical: 1608331830725859 } }
  9 8. AddPending { provided_ht: { physical: 1608331830725864 } final_ht: { physical: 1608331830725864 } }
 10 9. AddPending { provided_ht: { physical: 1608331831157278 } final_ht: { physical: 1608331831157278 } }
 11 10. SetPropagatedSafeTimeOnFollower { ht: { physical: 1608331830725839 logical: 4095 } }
 12 11. Replicated { ht: { physical: 1608331830725840 } }
 13 12. Replicated { ht: { physical: 1608331830725859 } }
 14 13. Replicated { ht: { physical: 1608331830725864 } }
 15 14. Replicated { ht: { physical: 1608331831157278 } }
 16 15. AddPending { provided_ht: { physical: 1608331831901044 } final_ht: { physical: 1608331831901044 } }
 17 16. SetPropagatedSafeTimeOnFollower { ht: { physical: 1608331831901043 logical: 4095 } }
 18 17. SafeTimeForFollower { min_allowed: <min> deadline: <uninitialized> safe_time_with_source: { safe_time: { physical: 1608331831901043 logical: 4095 } source: kPropagated } }
 19 18. SafeTimeForFollower { min_allowed: <min> deadline: <uninitialized> safe_time_with_source: { safe_time: { physical: 1608331831901043 logical: 4095 } source: kPropagated } }
 20 19. SafeTimeForFollower { min_allowed: <min> deadline: <uninitialized> safe_time_with_source: { safe_time: { physical: 1608331831901043 logical: 4095 } source: kPropagated } }
 21 20. SafeTimeForFollower { min_allowed: <min> deadline: <uninitialized> safe_time_with_source: { safe_time: { physical: 1608331831901043 logical: 4095 } source: kPropagated } }
 22 21. Replicated { ht: { physical: 1608331831901044 } }
 23 22. SetPropagatedSafeTimeOnFollower { ht: { physical: 1608331833699777 } }
 24 23. SafeTimeForFollower { min_allowed: <min> deadline: <uninitialized> safe_time_with_source: { safe_time: { physical: 1608331833699777 } source: kPropagated } }
 25 New operation's hybrid time too low: { physical: 1608331832322946 }
 26   max_safe_time_returned_with_lease_={ safe_time: <min> source: kUnknown }
 27   *ht < max_safe_time_returned_with_lease_.safe_time=0
 28   static_cast<int64_t>(ht->ToUint64() - max_safe_time_returned_with_lease_.safe_time.ToUint64())=6587727185194786816
 29   ht->PhysicalDiff(max_safe_time_returned_with_lease_.safe_time)=1608331832322946
 30
 31   max_safe_time_returned_without_lease_={ safe_time: <min> source: kUnknown }
 32   *ht < max_safe_time_returned_without_lease_.safe_time=0
 33   static_cast<int64_t>(ht->ToUint64() - max_safe_time_returned_without_lease_.safe_time.ToUint64())=6587727185194786816
 34   ht->PhysicalDiff(max_safe_time_returned_without_lease_.safe_time)=1608331832322946
 35
 36   max_safe_time_returned_for_follower_={ safe_time: { physical: 1608331833699777 } source: kPropagated }
 37   *ht < max_safe_time_returned_for_follower_.safe_time=1
 38   static_cast<int64_t>(ht->ToUint64() - max_safe_time_returned_for_follower_.safe_time.ToUint64())=-5639499776
 39   ht->PhysicalDiff(max_safe_time_returned_for_follower_.safe_time)=-1376831
 40
 41   (SafeTimeWithSource{last_replicated_, SafeTimeSource::kUnknown})={ safe_time: { physical: 1608331831901044 } source: kUnknown }
 42   *ht < (SafeTimeWithSource{last_replicated_, SafeTimeSource::kUnknown}).safe_time=0
 43   static_cast<int64_t>(ht->ToUint64() - (SafeTimeWithSource{last_replicated_, SafeTimeSource::kUnknown}).safe_time.ToUint64())=1728110592
 44   ht->PhysicalDiff((SafeTimeWithSource{last_replicated_, SafeTimeSource::kUnknown}).safe_time)=421902
 45
 46   (SafeTimeWithSource{last_ht_in_queue, SafeTimeSource::kUnknown})={ safe_time: <min> source: kUnknown }
 47   *ht < (SafeTimeWithSource{last_ht_in_queue, SafeTimeSource::kUnknown}).safe_time=0
 48   static_cast<int64_t>(ht->ToUint64() - (SafeTimeWithSource{last_ht_in_queue, SafeTimeSource::kUnknown}).safe_time.ToUint64())=6587727185194786816
 49   ht->PhysicalDiff((SafeTimeWithSource{last_ht_in_queue, SafeTimeSource::kUnknown}).safe_time)=1608331832322946
 50
 51   is_follower_side=1
 52   queue_.size()=0
 53   queue_=[]
 54   aborted=[]
 55     @     0x7f6f9cc3aeb3  yb::LogFatalHandlerSink::send(int, char const*, char const*, int, tm const*, char const*, unsigned long) (src/yb/util/logging.cc:474)
 56     @     0x7f6f9be0f865
 57     @     0x7f6f9be0ce39
 58     @     0x7f6f9be10528
 59     @     0x7f6fa5554451  yb::tablet::MvccManager::AddPending(yb::HybridTime*) (src/yb/tablet/mvcc.cc:398)
 60     @     0x7f6fa54d871e  yb::tablet::TabletPeer::StartReplicaOperation(scoped_refptr<yb::consensus::ConsensusRound> const&, yb::HybridTime) (src/yb/tablet/tablet_peer.cc:1039)
 61     @     0x7f6fa513e291  yb::consensus::RaftConsensus::StartReplicaOperationUnlocked(shared_ptr<yb::consensus::ReplicateMsg> const&, yb::HybridTime) (src/yb/consensus/raft_consensus.cc:1460)
 62     @     0x7f6fa5147179  yb::consensus::RaftConsensus::EnqueuePreparesUnlocked(yb::consensus::ConsensusRequestPB const&, yb::consensus::RaftConsensus::LeaderRequest*, yb::consensus::ConsensusResponsePB*) (src/yb/consensus/raf    t_consensus.cc:1944)
 63     @     0x7f6fa5155454  yb::consensus::RaftConsensus::UpdateReplica(yb::consensus::ConsensusRequestPB*, yb::consensus::ConsensusResponsePB*) (src/yb/consensus/raft_consensus.cc:1850)
 64     @     0x7f6fa5156582  yb::consensus::RaftConsensus::Update(yb::consensus::ConsensusRequestPB*, yb::consensus::ConsensusResponsePB*, std::chrono::time_point<yb::CoarseMonoClock, std::chrono::duration<long, std::ratio<1l, 10    00000000l> > >) (src/yb/consensus/raft_consensus.cc:1406)
 65     @     0x7f6fa5da2fbf  yb::tserver::ConsensusServiceImpl::UpdateConsensus(yb::consensus::ConsensusRequestPB const*, yb::consensus::ConsensusResponsePB*, yb::rpc::RpcContext) (src/yb/tserver/tablet_service.cc:2225)
 66     @     0x7f6fa24e2654  yb::consensus::ConsensusServiceIf::Handle(shared_ptr<yb::rpc::InboundCall>) (src/yb/consensus/consensus.service.cc:100)
 67     @     0x7f6f9e3c30e4  yb::rpc::ServicePoolImpl::Handle(shared_ptr<yb::rpc::InboundCall>) (src/yb/rpc/service_pool.cc:262)
 68     @     0x7f6f9e360964  yb::rpc::InboundCall::InboundCallTask::Run() (src/yb/rpc/inbound_call.cc:212)
 69     @     0x7f6f9e3d62be  Execute (src/yb/rpc/thread_pool.cc:105)
 70     @     0x7f6f9e3d19cc  operator()<, void> (gcc/5.5.0_4/include/c++/5.5.0/functional:600)
 71     @     0x7f6f9e3d19cc  __call<void, 0ul> (gcc/5.5.0_4/include/c++/5.5.0/functional:1074)
 72     @     0x7f6f9e3d19cc  operator()<, void> (gcc/5.5.0_4/include/c++/5.5.0/functional:1133)
 73     @     0x7f6f9e3d19cc  _M_invoke (gcc/5.5.0_4/include/c++/5.5.0/functional:1871)
 74     @     0x7f6f9ccca625  std::function<void ()>::operator()() const (gcc/5.5.0_4/include/c++/5.5.0/functional:2267)
 75     @     0x7f6f9ccca625  yb::Thread::SuperviseThread(void*) (src/yb/util/thread.cc:771)
 76     @     0x7f6f9768e693  start_thread (/tmp/glibc-20181130-26094-cs1x60/glibc-2.23/nptl/pthread_create.c:333)
 77     @     0x7f6f973d041c  (unknown) (sysdeps/unix/sysv/linux/x86_64/clone.S:109)
 78     @ 0xffffffffffffffff
 79

while running /net/dev-server-amitanand2/share/code/yugabyte-db/yb_build.sh --cxx-test integration-tests_cassandra_cpp_driver-test --gtest_filter CppCassandraDriverTest.TestTableCreateIndexSlowAlters -n 20 --tp 1 on D8946 (https://phabricator.dev.yugabyte.com/D8946?id=48717) with

23:03 $ export | grep YB_EX
declare -x YB_EXTRA_MASTER_FLAGS="--enable_tracing=false"
declare -x YB_EXTRA_TSERVER_FLAGS=" --allow_compatible_schema_versions=true"

4 out of 20 ran into this.

bmatican commented 3 years ago

We have a couple of tasks around this area

https://github.com/yugabyte/yugabyte-db/issues/5924 https://github.com/yugabyte/yugabyte-db/issues/4086

might be a couple of different paths that can lead to this

bmatican commented 1 year ago

Closing as this is pretty old. Might have been fixed as part of @Huqicheng 's https://github.com/yugabyte/yugabyte-db/commit/a4d02ef6c65885e227eb1c61a4d8c81cbcf39d3c