redpanda-data / redpanda

Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
https://redpanda.com
9.58k stars 582 forks source link

Latency spikes associated with rm_stm compact #11739

Closed travisdowns closed 7 months ago

travisdowns commented 1 year ago

Version & Environment

Redpanda version: 23.1.11

What went wrong?

Latency spike is observed in concert with active producer IDs surpassing 1000 on a broker:

https://vectorizedio.grafana.net/d/redpanda-prod-v2/redpanda-clusters-v2?orgId=1&var-datasource=metrics-prod-cloudv2&var-redpanda_id=chiba8rosj7gvl84bk3g&var-node=All&var-node_shard=All&var-aggr_criteria=pod&from=1687770918810&to=1687773056033

Produce IDs go over 1000:

image

Pending tasks spike:

image

Reactor utilization hits 100%:

image

The same effect is observed on several nodes but not all nodes.

Completed tasks drop by amount half, e.g., on shard 0 from 240K/s to 122K/s, along with the increase in reactor utilization to 100% indicates that the average task length has more than doubled in length.

image

There were several reactor stalls during the spike, which all pointed to cluster::rm_stm::compact_snapshot(), usually in the std::sort of the timestamps.

Here's an example:

2023-06-26T09:30:49Z    REDPANDA     redpanda               Reactor stalled for 37 ms on shard 1. Backtrace:
[Backtrace #29]
{23.1.11-1/opt/redpanda/libexec/redpanda} 0x606ea7f: void seastar::backtrace<seastar::backtrace_buffer::append_backtrace_oneline()::'lambda'(seastar::frame)>(seastar::backtrace_buffer::append_backtrace_oneline()::'lambda'(seastar::frame)&&) at /v/build/v_deps_build/seastar-prefix/src/seastar/include/seastar/util/backtrace.hh:59
 (inlined by) seastar::backtrace_buffer::append_backtrace_oneline() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:768
 (inlined by) seastar::print_with_backtrace(seastar::backtrace_buffer&, bool) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:787
{23.1.11-1/opt/redpanda/libexec/redpanda} 0x606e792: seastar::internal::cpu_stall_detector::generate_trace() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:1346
 (inlined by) seastar::internal::cpu_stall_detector::maybe_report() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:1088
 (inlined by) seastar::internal::cpu_stall_detector::on_signal() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:1105
 (inlined by) seastar::reactor::block_notifier(int) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:1329
{23.1.11-1/opt/redpanda/libexec/redpanda} 0x42abf: ?? at ??:0
{23.1.11-1/opt/redpanda/libexec/redpanda} 0x4e223de: fragmented_vector<long, 8192ul>::iter<false>::operator*() const at /var/lib/buildkite-agent/builds/buildkite-amd64-builders-i-08440f6a5273faae0-1/redpanda/redpanda/src/v/utils/fragmented_vector.h:208
{23.1.11-1/opt/redpanda/libexec/redpanda} 0x4e2200c: void std::__1::__introsort<std::__1::__less<long, long>&, fragmented_vector<long, 8192ul>::iter<false>>(fragmented_vector<long, 8192ul>::iter<false>, fragmented_vector<long, 8192ul>::iter<false>, std::__1::__less<long, long>&, std::__1::iterator_traits<fragmented_vector<long, 8192ul>::iter<false>>::difference_type) at /vectorized/llvm/bin/../include/c++/v1/__algorithm/sort.h:414
{23.1.11-1/opt/redpanda/libexec/redpanda} 0x4e21fc1: void std::__1::__introsort<std::__1::__less<long, long>&, fragmented_vector<long, 8192ul>::iter<false>>(fragmented_vector<long, 8192ul>::iter<false>, fragmented_vector<long, 8192ul>::iter<false>, std::__1::__less<long, long>&, std::__1::iterator_traits<fragmented_vector<long, 8192ul>::iter<false>>::difference_type) at /vectorized/llvm/bin/../include/c++/v1/__algorithm/sort.h:457
{23.1.11-1/opt/redpanda/libexec/redpanda} 0x4e21fc1: void std::__1::__introsort<std::__1::__less<long, long>&, fragmented_vector<long, 8192ul>::iter<false>>(fragmented_vector<long, 8192ul>::iter<false>, fragmented_vector<long, 8192ul>::iter<false>, std::__1::__less<long, long>&, std::__1::iterator_traits<fragmented_vector<long, 8192ul>::iter<false>>::difference_type) at /vectorized/llvm/bin/../include/c++/v1/__algorithm/sort.h:457
{23.1.11-1/opt/redpanda/libexec/redpanda} 0x4e21fc1: void std::__1::__introsort<std::__1::__less<long, long>&, fragmented_vector<long, 8192ul>::iter<false>>(fragmented_vector<long, 8192ul>::iter<false>, fragmented_vector<long, 8192ul>::iter<false>, std::__1::__less<long, long>&, std::__1::iterator_traits<fragmented_vector<long, 8192ul>::iter<false>>::difference_type) at /vectorized/llvm/bin/../include/c++/v1/__algorithm/sort.h:457
{23.1.11-1/opt/redpanda/libexec/redpanda} 0x4e21c94: void std::__1::__introsort<std::__1::__less<long, long>&, fragmented_vector<long, 8192ul>::iter<false>>(fragmented_vector<long, 8192ul>::iter<false>, fragmented_vector<long, 8192ul>::iter<false>, std::__1::__less<long, long>&, std::__1::iterator_traits<fragmented_vector<long, 8192ul>::iter<false>>::difference_type) at /vectorized/llvm/bin/../include/c++/v1/__algorithm/sort.h:462
{23.1.11-1/opt/redpanda/libexec/redpanda} 0x4d3fa6d: void std::__1::__sort<std::__1::__less<long, long>&, fragmented_vector<long, 8192ul>::iter<false>>(fragmented_vector<long, 8192ul>::iter<false>, fragmented_vector<long, 8192ul>::iter<false>, std::__1::__less<long, long>&) at /vectorized/llvm/bin/../include/c++/v1/__algorithm/sort.h:482
 (inlined by) void std::__1::sort<fragmented_vector<long, 8192ul>::iter<false>, std::__1::__less<long, long>>(fragmented_vector<long, 8192ul>::iter<false>, fragmented_vector<long, 8192ul>::iter<false>, std::__1::__less<long, long>) at /vectorized/llvm/bin/../include/c++/v1/__algorithm/sort.h:542
 (inlined by) void std::__1::sort<fragmented_vector<long, 8192ul>::iter<false>>(fragmented_vector<long, 8192ul>::iter<false>, fragmented_vector<long, 8192ul>::iter<false>) at /vectorized/llvm/bin/../include/c++/v1/__algorithm/sort.h:551
 (inlined by) cluster::rm_stm::compact_snapshot() at /var/lib/buildkite-agent/builds/buildkite-amd64-builders-i-08440f6a5273faae0-1/redpanda/redpanda/src/v/cluster/rm_stm.cc:1915
{23.1.11-1/opt/redpanda/libexec/redpanda} 0x4d4a140: cluster::rm_stm::apply(model::record_batch) at /var/lib/buildkite-agent/builds/buildkite-amd64-builders-i-08440f6a5273faae0-1/redpanda/redpanda/src/v/cluster/rm_stm.cc:2301
{23.1.11-1/opt/redpanda/libexec/redpanda} 0x27eee6a: raft::state_machine::batch_applicator::operator()(model::record_batch) at /var/lib/buildkite-agent/builds/buildkite-amd64-builders-i-08440f6a5273faae0-1/redpanda/redpanda/src/v/raft/state_machine.cc:83
 (inlined by) auto model::record_batch_reader::impl::do_consume<raft::state_machine::batch_applicator>(raft::state_machine::batch_applicator&, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>)::'lambda'(raft::state_machine::batch_applicator&)::operator()(raft::state_machine::batch_applicator&) const at /var/lib/buildkite-agent/builds/buildkite-amd64-builders-i-08440f6a5273faae0-1/redpanda/redpanda/src/v/model/record_batch_reader.h:141
 (inlined by) auto model::record_batch_reader::impl::do_action<raft::state_machine::batch_applicator, auto model::record_batch_reader::impl::do_consume<raft::state_machine::batch_applicator>(raft::state_machine::batch_applicator&, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>)::'lambda'(raft::state_machine::batch_applicator&)>(raft::state_machine::batch_applicator&, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>, auto model::record_batch_reader::impl::do_consume<raft::state_machine::batch_applicator>(raft::state_machine::batch_applicator&, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>)::'lambda'(raft::state_machine::batch_applicator&)&&)::'lambda'()::operator()() const at /var/lib/buildkite-agent/builds/buildkite-amd64-builders-i-08440f6a5273faae0-1/redpanda/redpanda/src/v/model/record_batch_reader.h:154
 (inlined by) seastar::future<seastar::bool_class<seastar::stop_iteration_tag>> seastar::futurize<seastar::future<seastar::bool_class<seastar::stop_iteration_tag>>>::invoke<auto model::record_batch_reader::impl::do_action<raft::state_machine::batch_applicator, auto model::record_batch_reader::impl::do_consume<raft::state_machine::batch_applicator>(raft::state_machine::batch_applicator&, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>)::'lambda'(raft::state_machine::batch_applicator&)>(raft::state_machine::batch_applicator&, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>, auto model::record_batch_reader::impl::do_consume<raft::state_machine::batch_applicator>(raft::state_machine::batch_applicator&, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>)::'lambda'(raft::state_machine::batch_applicator&)&&)::'lambda'()&>(raft::state_machine::batch_applicator&&) at /vectorized/include/seastar/core/future.hh:2149
 (inlined by) seastar::future<void> seastar::repeat<auto model::record_batch_reader::impl::do_action<raft::state_machine::batch_applicator, auto model::record_batch_reader::impl::do_consume<raft::state_machine::batch_applicator>(raft::state_machine::batch_applicator&, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>)::'lambda'(raft::state_machine::batch_applicator&)>(raft::state_machine::batch_applicator&, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>, auto model::record_batch_reader::impl::do_consume<raft::state_machine::batch_applicator>(raft::state_machine::batch_applicator&, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>)::'lambda'(raft::state_machine::batch_applicator&)&&)::'lambda'()>(raft::state_machine::batch_applicator&&) at /vectorized/include/seastar/core/loop.hh:120
{23.1.11-1/opt/redpanda/libexec/redpanda} 0x27edffd: auto model::record_batch_reader::impl::do_action<raft::state_machine::batch_applicator, auto model::record_batch_reader::impl::do_consume<raft::state_machine::batch_applicator>(raft::state_machine::batch_applicator&, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>)::'lambda'(raft::state_machine::batch_applicator&)>(raft::state_machine::batch_applicator&, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>, auto model::record_batch_reader::impl::do_consume<raft::state_machine::batch_applicator>(raft::state_machine::batch_applicator&, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>)::'lambda'(raft::state_machine::batch_applicator&)&&) at /var/lib/buildkite-agent/builds/buildkite-amd64-builders-i-08440f6a5273faae0-1/redpanda/redpanda/src/v/model/record_batch_reader.h:149
 (inlined by) auto model::record_batch_reader::impl::do_consume<raft::state_machine::batch_applicator>(raft::state_machine::batch_applicator&, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>) at /var/lib/buildkite-agent/builds/buildkite-amd64-builders-i-08440f6a5273faae0-1/redpanda/redpanda/src/v/model/record_batch_reader.h:140
 (inlined by) auto model::record_batch_reader::impl::consume<raft::state_machine::batch_applicator>(raft::state_machine::batch_applicator, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>)::'lambda'(raft::state_machine::batch_applicator&)::operator()(raft::state_machine::batch_applicator&) const at /var/lib/buildkite-agent/builds/buildkite-amd64-builders-i-08440f6a5273faae0-1/redpanda/redpanda/src/v/model/record_batch_reader.h:97
 (inlined by) decltype(static_cast<raft::state_machine::batch_applicator>(fp)(static_cast<raft::state_machine::batch_applicator&>(fp0))) std::__1::__invoke_constexpr<auto model::record_batch_reader::impl::consume<raft::state_machine::batch_applicator>(raft::state_machine::batch_applicator, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>)::'lambda'(raft::state_machine::batch_applicator&)&, raft::state_machine::batch_applicator&>(raft::state_machine::batch_applicator&&, raft::state_machine::batch_applicator&) at /vectorized/llvm/bin/../include/c++/v1/type_traits:3648
 (inlined by) decltype(auto) std::__1::__apply_tuple_impl<auto model::record_batch_reader::impl::consume<raft::state_machine::batch_applicator>(raft::state_machine::batch_applicator, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>)::'lambda'(raft::state_machine::batch_applicator&)&, std::__1::tuple<raft::state_machine::batch_applicator>&, 0ul>(raft::state_machine::batch_applicator&&, std::__1::tuple<raft::state_machine::batch_applicator>&, std::__1::__tuple_indices<0ul>) at /vectorized/llvm/bin/../include/c++/v1/tuple:1595
 (inlined by) decltype(auto) std::__1::apply<auto model::record_batch_reader::impl::consume<raft::state_machine::batch_applicator>(raft::state_machine::batch_applicator, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>)::'lambda'(raft::state_machine::batch_applicator&)&, std::__1::tuple<raft::state_machine::batch_applicator>&>(raft::state_machine::batch_applicator&&, std::__1::tuple<raft::state_machine::batch_applicator>&) at /vectorized/llvm/bin/../include/c++/v1/tuple:1604
 (inlined by) auto seastar::internal::do_with_impl<raft::state_machine::batch_applicator, auto model::record_batch_reader::impl::consume<raft::state_machine::batch_applicator>(raft::state_machine::batch_applicator, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>)::'lambda'(raft::state_machine::batch_applicator&)>(raft::state_machine::batch_applicator&&, auto model::record_batch_reader::impl::consume<raft::state_machine::batch_applicator>(raft::state_machine::batch_applicator, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>)::'lambda'(raft::state_machine::batch_applicator&)&&) at /vectorized/include/seastar/core/do_with.hh:96
 (inlined by) seastar::future<detail::base_named_type<long, model::model_offset_type, std::__1::integral_constant<bool, true>>> seastar::futurize<seastar::future<detail::base_named_type<long, model::model_offset_type, std::__1::integral_constant<bool, true>>>>::invoke<seastar::future<detail::base_named_type<long, model::model_offset_type, std::__1::integral_constant<bool, true>>> (*&)(raft::state_machine::batch_applicator&&, auto model::record_batch_reader::impl::consume<raft::state_machine::batch_applicator>(raft::state_machine::batch_applicator, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>)::'lambda'(raft::state_machine::batch_applicator&)&&), raft::state_machine::batch_applicator, auto model::record_batch_reader::impl::consume<raft::state_machine::batch_applicator>(raft::state_machine::batch_applicator, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>)::'lambda'(raft::state_machine::batch_applicator&)>(raft::state_machine::batch_applicator&&, raft::state_machine::batch_applicator&&, auto model::record_batch_reader::impl::consume<raft::state_machine::batch_applicator>(raft::state_machine::batch_applicator, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>)::'lambda'(raft::state_machine::batch_applicator&)&&) at /vectorized/include/seastar/core/future.hh:2149
 (inlined by) auto seastar::futurize_invoke<seastar::future<detail::base_named_type<long, model::model_offset_type, std::__1::integral_constant<bool, true>>> (*&)(raft::state_machine::batch_applicator&&, auto model::record_batch_reader::impl::consume<raft::state_machine::batch_applicator>(raft::state_machine::batch_applicator, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>)::'lambda'(raft::state_machine::batch_applicator&)&&), raft::state_machine::batch_applicator, auto model::record_batch_reader::impl::consume<raft::state_machine::batch_applicator>(raft::state_machine::batch_applicator, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>)::'lambda'(raft::state_machine::batch_applicator&)>(raft::state_machine::batch_applicator&&, raft::state_machine::batch_applicator&&, auto model::record_batch_reader::impl::consume<raft::state_machine::batch_applicator>(raft::state_machine::batch_applicator, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>)::'lambda'(raft::state_machine::batch_applicator&)&&) at /vectorized/include/seastar/core/future.hh:2180
 (inlined by) auto seastar::do_with<raft::state_machine::batch_applicator, auto model::record_batch_reader::impl::consume<raft::state_machine::batch_applicator>(raft::state_machine::batch_applicator, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>)::'lambda'(raft::state_machine::batch_applicator&)>(raft::state_machine::batch_applicator&&, auto model::record_batch_reader::impl::consume<raft::state_machine::batch_applicator>(raft::state_machine::batch_applicator, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>)::'lambda'(raft::state_machine::batch_applicator&)&&) at /vectorized/include/seastar/core/do_with.hh:131
 (inlined by) auto model::record_batch_reader::impl::consume<raft::state_machine::batch_applicator>(raft::state_machine::batch_applicator, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>) at /var/lib/buildkite-agent/builds/buildkite-amd64-builders-i-08440f6a5273faae0-1/redpanda/redpanda/src/v/model/record_batch_reader.h:95
 (inlined by) auto model::record_batch_reader::consume<raft::state_machine::batch_applicator>(raft::state_machine::batch_applicator, std::__1::chrono::time_point<seastar::lowres_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>) && at /var/lib/buildkite-agent/builds/buildkite-amd64-builders-i-08440f6a5273faae0-1/redpanda/redpanda/src/v/model/record_batch_reader.h:245
{23.1.11-1/opt/redpanda/libexec/redpanda} 0x27ec998: raft::state_machine::apply()::$_2::operator()(model::record_batch_reader) const at /var/lib/buildkite-agent/builds/buildkite-amd64-builders-i-08440f6a5273faae0-1/redpanda/redpanda/src/v/raft/state_machine.cc:138
 (inlined by) decltype(static_cast<raft::state_machine::apply()::$_2&>(fp)(static_cast<model::record_batch_reader>(fp0))) std::__1::__invoke<raft::state_machine::apply()::$_2&, model::record_batch_reader>(raft::state_machine::apply()::$_2&, model::record_batch_reader&&) at /vectorized/llvm/bin/../include/c++/v1/type_traits:3640
 (inlined by) std::__1::invoke_result<raft::state_machine::apply()::$_2&, model::record_batch_reader>::type std::__1::invoke<raft::state_machine::apply()::$_2&, model::record_batch_reader>(raft::state_machine::apply()::$_2&, model::record_batch_reader&&) at /vectorized/llvm/bin/../include/c++/v1/__functional/invoke.h:93
 (inlined by) auto seastar::internal::future_invoke<raft::state_machine::apply()::$_2&, model::record_batch_reader>(raft::state_machine::apply()::$_2&, model::record_batch_reader&&) at /vectorized/include/seastar/core/future.hh:1225
 (inlined by) seastar::future<void> seastar::future<model::record_batch_reader>::then_impl_nrvo<raft::state_machine::apply()::$_2, seastar::future<void>>(raft::state_machine::apply()::$_2&&)::'lambda'(seastar::internal::promise_base_with_type<void>&&, raft::state_machine::apply()::$_2&, seastar::future_state<model::record_batch_reader>&&)::operator()(seastar::internal::promise_base_with_type<void>&&, raft::state_machine::apply()::$_2&, seastar::future_state<model::record_batch_reader>&&) const::'lambda'()::operator()() const at /vectorized/include/seastar/core/future.hh:1596
 (inlined by) void seastar::futurize<seastar::future<void>>::satisfy_with_result_of<seastar::future<void> seastar::future<model::record_batch_reader>::then_impl_nrvo<raft::state_machine::apply()::$_2, seastar::future<void>>(raft::state_machine::apply()::$_2&&)::'lambda'(seastar::internal::promise_base_with_type<void>&&, raft::state_machine::apply()::$_2&, seastar::future_state<model::record_batch_reader>&&)::operator()(seastar::internal::promise_base_with_type<void>&&, raft::state_machine::apply()::$_2&, seastar::future_state<model::record_batch_reader>&&) const::'lambda'()>(seastar::internal::promise_base_with_type<void>&&, raft::state_machine::apply()::$_2&&) at /vectorized/include/seastar/core/future.hh:2134
 (inlined by) seastar::future<void> seastar::future<model::record_batch_reader>::then_impl_nrvo<raft::state_machine::apply()::$_2, seastar::future<void>>(raft::state_machine::apply()::$_2&&)::'lambda'(seastar::internal::promise_base_with_type<void>&&, raft::state_machine::apply()::$_2&, seastar::future_state<model::record_batch_reader>&&)::operator()(seastar::internal::promise_base_with_type<void>&&, raft::state_machine::apply()::$_2&, seastar::future_state<model::record_batch_reader>&&) const at /vectorized/include/seastar/core/future.hh:1589
 (inlined by) seastar::continuation<seastar::internal::promise_base_with_type<void>, raft::state_machine::apply()::$_2, seastar::future<void> seastar::future<model::record_batch_reader>::then_impl_nrvo<raft::state_machine::apply()::$_2, seastar::future<void>>(raft::state_machine::apply()::$_2&&)::'lambda'(seastar::internal::promise_base_with_type<void>&&, raft::state_machine::apply()::$_2&, seastar::future_state<model::record_batch_reader>&&), model::record_batch_reader>::run_and_dispose() at /vectorized/include/seastar/core/future.hh:781
{23.1.11-1/opt/redpanda/libexec/redpanda} 0x608ce2f: seastar::reactor::run_tasks(seastar::reactor::task_queue&) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:2330
 (inlined by) seastar::reactor::run_some_tasks() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:2737

It seems like when we go over 1000 PIDs, the shortcut condition:

if (_log_state.seq_table.size() <= _seq_table_min_size) {
        return;
}

No longer applies, so we sort the timestamp vector each time. It seems to be that the mode we get into is that new producer IDs are coming in which grows the seq_table until it is say 1001 elements, then we sort all 1001 elements and kick out the oldest one, then this process repeats every time we add a new PID: sorting 1001 elements just to remove 1 element (the older PIDs may be expired, but whie the number was under 1000 they are protected from removal by the _seq_table_min_size limit.

Part of the problem is that fragmented_vector is 4x slower than std::vector when used in std::sort. Another thing is we don't need std::sort at all, we can use nth_element to find the pivot, which is much faster (around 10x depending on the details). Plausibly there are other approaches which avoid a sort-like thing entirely, but require a larger change.

What should have happened instead?

Lower CPU use.

travisdowns commented 1 year ago

Some numbers from local testing of {std::vector, frag_vector} x {std::sort, std::nth_element} combinations (look at median column):

single run iterations:    0
single run duration:      1.000s
number of runs:           1
number of cores:          12
random seed:              866896499

test                                      iterations      median         mad         min         max      allocs       tasks        inst
containers.std_vec_sort                     23600000    36.695ns     0.000ns    36.695ns    36.695ns       0.000       0.000        90.5
containers.frag_vec_sort                     7600000   125.520ns     0.000ns   125.520ns   125.520ns       0.000       0.000       999.1
containers.std_vec_nth_mid                 118000000     4.199ns     0.000ns     4.199ns     4.199ns       0.000       0.000         7.1
containers.frag_vec_nth_mid                 68800000    10.005ns     0.000ns    10.005ns    10.005ns       0.000       0.000        63.7
containers.std_vec_nth_end                  84400000     7.299ns     0.000ns     7.299ns     7.299ns       0.000       0.000        13.6
containers.frag_vec_nth_end                 40800000    18.965ns     0.000ns    18.965ns    18.965ns       0.000       0.000       125.6
containers.frag_vec_nth_end_unchecked       55600000    13.256ns     0.000ns    13.256ns    13.256ns       0.000       0.000        67.6

Basically frag vector is 3.5x slower to sort than std::vector, because of the cost of access, and nth_element is 7x - 17x faster than std::sort for frag vector type, depending on if the element is near the end or middle.

Switching to unchecked iterator (since we can rely on std::algos to do the right thing) gives another ~1.4x boost.

emaxerrno commented 1 year ago

............. 🤯 YES! haha this is epic.

mmaslankaprv commented 1 year ago

This is great analysis. Hopefully the latency spike will be lowered now as we improved the resource cleanup in rm_stm: https://github.com/redpanda-data/redpanda/pull/11597

Also there are plans to move the cleanup completely from the critical part. We will benefit from the changes here in other places

github-actions[bot] commented 8 months ago

This issue hasn't seen activity in 3 months. If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in two weeks.

piyushredpanda commented 8 months ago

@mmaslankaprv was this closed out? If not, is it still relevant, we think?

mmaslankaprv commented 7 months ago

this can be closed as the cleanup is being made outside of the critical path now