scylladb / scylladb

NoSQL data store using the seastar framework, compatible with Apache Cassandra
http://scylladb.com
GNU Affero General Public License v3.0

Storm of Mini Flushes slows down scylla heavily #21128

Open lucowehrlin opened 1 week ago

lucowehrlin commented 1 week ago

Still worried about our findings here: https://forum.scylladb.com/t/compaction-storm-slows-down-scylla/2958/24

While one observation seems to have led to a fixable bug (https://github.com/scylladb/scylladb/issues/20991), we still do not understand what caused the "storm" of mini flushes that slowed the database down heavily.

Image

One observation was that we saw multiple memory issues during the "storm": an oversized allocation, followed shortly afterwards by multiple LSA-related errors. The log for the oversized allocation is below.

Oct 15 15:37:22 o-prod-L3-5 scylla[598821]:  [shard 0:stmt] seastar_memory - oversized allocation: 2768896 bytes. This is non-fatal, but could lead to latency and/or fragmentation issues. Please report: at
[Backtrace #0]
void seastar::backtrace<seastar::current_backtrace_tasklocal()::$_0>(seastar::current_backtrace_tasklocal()::$_0&&) at ./build/release/seastar/./seastar/include/seastar/util/backtrace.hh:68
 (inlined by) seastar::current_backtrace_tasklocal() at ./build/release/seastar/./seastar/src/util/backtrace.cc:97
seastar::current_tasktrace() at ./build/release/seastar/./seastar/src/util/backtrace.cc:148
seastar::current_backtrace() at ./build/release/seastar/./seastar/src/util/backtrace.cc:181
seastar::memory::cpu_pages::warn_large_allocation(unsigned long) at ./build/release/seastar/./seastar/src/core/memory.cc:849
 (inlined by) seastar::memory::cpu_pages::check_large_allocation(unsigned long) at ./build/release/seastar/./seastar/src/core/memory.cc:912
 (inlined by) seastar::memory::cpu_pages::allocate_large(unsigned int, bool) at ./build/release/seastar/./seastar/src/core/memory.cc:919
 (inlined by) seastar::memory::allocate_large(unsigned long, bool) at ./build/release/seastar/./seastar/src/core/memory.cc:1542
 (inlined by) seastar::memory::allocate_slowpath(unsigned long) at ./build/release/seastar/./seastar/src/core/memory.cc:1688
seastar::memory::allocate(unsigned long) at ./build/release/seastar/./seastar/src/core/memory.cc:1707
 (inlined by) malloc at ./build/release/seastar/./seastar/src/core/memory.cc:2216
seastar::basic_sstring<signed char, unsigned int, 31u, false>::basic_sstring(seastar::basic_sstring<signed char, unsigned int, 31u, false>::initialized_later, unsigned long) at ././seastar/include/seastar/core/sstring.hh:164
 (inlined by) ser::buffer_view<bytes_ostream::fragment_iterator>::linearize() const at ././serializer.hh:156
 (inlined by) cql3::selection::result_set_builder::get_value(seastar::shared_ptr<abstract_type const>, query::result_atomic_cell_view) at ./cql3/selection/selection.cc:794
cql3::selection::result_set_builder::add(column_definition const&, query::result_atomic_cell_view const&) at ./cql3/selection/selection.cc:537
cql3::selection::result_set_builder::visitor<cql3::selection::result_set_builder::nop_filter>::add_value(column_definition const&, query::result_row_view::iterator_type&) at ././cql3/selection/selection.hh:288
 (inlined by) cql3::selection::result_set_builder::visitor<cql3::selection::result_set_builder::nop_filter>::accept_new_row(query::result_row_view const&, query::result_row_view const&) at ././cql3/selection/selection.hh:328
cql3::selection::result_set_builder::visitor<cql3::selection::result_set_builder::nop_filter>::accept_new_row(clustering_key_prefix const&, query::result_row_view const&, query::result_row_view const&) at ././cql3/selection/selection.hh:305
 (inlined by) service::pager::query_pager::query_result_visitor<cql3::selection::result_set_builder::visitor<cql3::selection::result_set_builder::nop_filter> >::accept_new_row(clustering_key_prefix const&, query::result_row_view const&, query::result_row_view const&) at ./service/pager/query_pagers.cc:361
void query::result_view::consume<service::pager::query_pager::query_result_visitor<cql3::selection::result_set_builder::visitor<cql3::selection::result_set_builder::nop_filter> >&>(query::partition_slice const&, service::pager::query_pager::query_result_visitor<cql3::selection::result_set_builder::visitor<cql3::selection::result_set_builder::nop_filter> >&) const at ././query-result-reader.hh:168
void service::pager::query_pager::handle_result<cql3::selection::result_set_builder::visitor<cql3::selection::result_set_builder::nop_filter> >(cql3::selection::result_set_builder::visitor<cql3::selection::result_set_builder::nop_filter>&&, seastar::foreign_ptr<seastar::lw_shared_ptr<query::result> > const&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >) at ./service/pager/query_pagers.cc:396
service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0::operator()(service::storage_proxy_coordinator_query_result) const::{lambda()#1}::operator()() at ./service/pager/query_pagers.cc:216
 (inlined by) seastar::future<boost::outcome_v2::basic_result<void, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> > seastar::futurize<boost::outcome_v2::basic_result<void, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >::invoke<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0::operator()(service::storage_proxy_coordinator_query_result) const::{lambda()#1}>(service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0::operator()(service::storage_proxy_coordinator_query_result) const::{lambda()#1}&&) at ././seastar/include/seastar/core/future.hh:2037
 (inlined by) auto seastar::futurize_invoke<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0::operator()(service::storage_proxy_coordinator_query_result) const::{lambda()#1}>(service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0::operator()(service::storage_proxy_coordinator_query_result) const::{lambda()#1}&&) at ././seastar/include/seastar/core/future.hh:2066
 (inlined by) auto cql3::selection::result_set_builder::with_thread_if_needed<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0::operator()(service::storage_proxy_coordinator_query_result) const::{lambda()#1}>(service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0::operator()(service::storage_proxy_coordinator_query_result) const::{lambda()#1}&&) at ././cql3/selection/selection.hh:191
 (inlined by) service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0::operator()(service::storage_proxy_coordinator_query_result) const at ./service/pager/query_pagers.cc:215
 (inlined by) seastar::future<boost::outcome_v2::basic_result<void, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> > seastar::futurize<seastar::future<boost::outcome_v2::basic_result<void, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> > >::invoke<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0&, service::storage_proxy_coordinator_query_result>(service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0&, service::storage_proxy_coordinator_query_result&&) at ././seastar/include/seastar/core/future.hh:2035
 (inlined by) auto seastar::futurize_invoke<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0&, service::storage_proxy_coordinator_query_result>(service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0&, service::storage_proxy_coordinator_query_result&&) at ././seastar/include/seastar/core/future.hh:2066
 (inlined by) utils::internal::result_wrapped_call_traits<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, boost::outcome_v2::basic_result<service::storage_proxy_coordinator_query_result, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy>, false>::invoke_with_value(service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0&, boost::outcome_v2::basic_result<service::storage_proxy_coordinator_query_result, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy>&&) at ././utils/result_combinators.hh:73
 (inlined by) auto utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, false>::operator()<boost::outcome_v2::basic_result<service::storage_proxy_coordinator_query_result, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >(boost::outcome_v2::basic_result<service::storage_proxy_coordinator_query_result, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy>) at ././utils/result_combinators.hh:124
seastar::future<boost::outcome_v2::basic_result<void, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> > std::__invoke_impl<seastar::future<boost::outcome_v2::basic_result<void, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >, utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, false>&, boost::outcome_v2::basic_result<service::storage_proxy_coordinator_query_result, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >(std::__invoke_other, utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, false>&, boost::outcome_v2::basic_result<service::storage_proxy_coordinator_query_result, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy>&&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/invoke.h:61
 (inlined by) std::__invoke_result<utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, false>&, boost::outcome_v2::basic_result<service::storage_proxy_coordinator_query_result, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >::type std::__invoke<utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, false>&, boost::outcome_v2::basic_result<service::storage_proxy_coordinator_query_result, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >(utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, false>&, boost::outcome_v2::basic_result<service::storage_proxy_coordinator_query_result, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy>&&) at 
/usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/invoke.h:96
 (inlined by) std::invoke_result<utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, false>&, boost::outcome_v2::basic_result<service::storage_proxy_coordinator_query_result, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >::type std::invoke<utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, false>&, boost::outcome_v2::basic_result<service::storage_proxy_coordinator_query_result, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >(utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, false>&, boost::outcome_v2::basic_result<service::storage_proxy_coordinator_query_result, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy>&&) at 
/usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/functional:113
 (inlined by) auto seastar::internal::future_invoke<utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, false>&, boost::outcome_v2::basic_result<service::storage_proxy_coordinator_query_result, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >(utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, false>&, boost::outcome_v2::basic_result<service::storage_proxy_coordinator_query_result, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy>&&) at ././seastar/include/seastar/core/future.hh:1174
 (inlined by) seastar::future<boost::outcome_v2::basic_result<service::storage_proxy_coordinator_query_result, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >::then_impl_nrvo<utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, false>, seastar::future<boost::outcome_v2::basic_result<void, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> > >(utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, false>&&)::{lambda(seastar::internal::promise_base_with_type<boost::outcome_v2::basic_result<void, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >&&, utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, false>&, 
seastar::future_state<boost::outcome_v2::basic_result<service::storage_proxy_coordinator_query_result, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >&&)#1}::operator()(seastar::internal::promise_base_with_type<boost::outcome_v2::basic_result<void, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >&&, utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, false>&, seastar::future_state<boost::outcome_v2::basic_result<service::storage_proxy_coordinator_query_result, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >&&) const::{lambda()#1}::operator()() const at ././seastar/include/seastar/core/future.hh:1488
 (inlined by) void seastar::futurize<seastar::future<boost::outcome_v2::basic_result<void, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> > >::satisfy_with_result_of<seastar::future<boost::outcome_v2::basic_result<service::storage_proxy_coordinator_query_result, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >::then_impl_nrvo<utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, false>, seastar::future<boost::outcome_v2::basic_result<void, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> > >(utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, false>&&)::{lambda(seastar::internal::promise_base_with_type<boost::outcome_v2::basic_result<void, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >&&, 
utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, false>&, seastar::future_state<boost::outcome_v2::basic_result<service::storage_proxy_coordinator_query_result, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >&&)#1}::operator()(seastar::internal::promise_base_with_type<boost::outcome_v2::basic_result<void, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >&&, utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, false>&, seastar::future_state<boost::outcome_v2::basic_result<service::storage_proxy_coordinator_query_result, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >&&) const::{lambda()#1}>(seastar::internal::promise_base_with_type<boost::outcome_v2::basic_result<void, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >&&, 
utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, false>&&) at ././seastar/include/seastar/core/future.hh:2020
 (inlined by) seastar::future<boost::outcome_v2::basic_result<service::storage_proxy_coordinator_query_result, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >::then_impl_nrvo<utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, false>, seastar::future<boost::outcome_v2::basic_result<void, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> > >(utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, false>&&)::{lambda(seastar::internal::promise_base_with_type<boost::outcome_v2::basic_result<void, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >&&, utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, false>&, 
seastar::future_state<boost::outcome_v2::basic_result<service::storage_proxy_coordinator_query_result, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >&&)#1}::operator()(seastar::internal::promise_base_with_type<boost::outcome_v2::basic_result<void, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >&&, utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, false>&, seastar::future_state<boost::outcome_v2::basic_result<service::storage_proxy_coordinator_query_result, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >&&) const at ././seastar/include/seastar/core/future.hh:1484
 (inlined by) seastar::continuation<seastar::internal::promise_base_with_type<boost::outcome_v2::basic_result<void, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >, utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, false>, seastar::future<boost::outcome_v2::basic_result<service::storage_proxy_coordinator_query_result, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >::then_impl_nrvo<utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, false>, seastar::future<boost::outcome_v2::basic_result<void, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> > >(utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, 
false>&&)::{lambda(seastar::internal::promise_base_with_type<boost::outcome_v2::basic_result<void, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >&&, utils::internal::result_wrapper<service::pager::query_pager::fetch_page_result(cql3::selection::result_set_builder&, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::$_0, false>&, seastar::future_state<boost::outcome_v2::basic_result<service::storage_proxy_coordinator_query_result, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >&&)#1}, boost::outcome_v2::basic_result<service::storage_proxy_coordinator_query_result, utils::exception_container<exceptions::mutation_write_timeout_exception, exceptions::read_timeout_exception, exceptions::read_failure_exception, exceptions::rate_limit_exception>, utils::exception_container_throw_policy> >::run_and_dispose() at ././seastar/include/seastar/core/future.hh:748
seastar::reactor::run_tasks(seastar::reactor::task_queue&) at ./build/release/seastar/./seastar/src/core/reactor.cc:2690
 (inlined by) seastar::reactor::run_some_tasks() at ./build/release/seastar/./seastar/src/core/reactor.cc:3152
seastar::reactor::do_run() at ./build/release/seastar/./seastar/src/core/reactor.cc:3320
seastar::reactor::run() at ./build/release/seastar/./seastar/src/core/reactor.cc:3210
seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) at ./build/release/seastar/./seastar/src/core/app-template.cc:276
seastar::app_template::run(int, char**, std::function<seastar::future<int> ()>&&) at ./build/release/seastar/./seastar/src/core/app-template.cc:167
scylla_main(int, char**) at ./main.cc:700
std::function<int (int, char**)>::operator()(int, char**) const at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/std_function.h:591
main at ./main.cc:2211
/data/scylla-s3-reloc.cache/by-build-id/f42d84b1312f866e5b14a8a8ecc8ffff0e4d5d19/extracted/scylla/libreloc/libc.so.6: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=9148cab1b932d44ef70e306e9c02ee38d06cad51, for GNU/Linux 3.2.0, not stripped

__libc_start_call_main at ??:?
__libc_start_main_alias_2 at :?
_start at ??:?

The other memory-related issue was this:

Oct 14 21:17:40 o-prod-L3-8 scylla[679765]:  [shard 5:mt2c] lsa - LSA allocation failure, increasing reserve in section 0x605006314620 to 2 segments; trace:
[Backtrace #0]
void seastar::backtrace<seastar::current_backtrace_tasklocal()::$_0>(seastar::current_backtrace_tasklocal()::$_0&&) at ./build/release/seastar/./seastar/include/seastar/util/backtrace.hh:68
 (inlined by) seastar::current_backtrace_tasklocal() at ./build/release/seastar/./seastar/src/util/backtrace.cc:97
seastar::current_tasktrace() at ./build/release/seastar/./seastar/src/util/backtrace.cc:148
seastar::current_backtrace() at ./build/release/seastar/./seastar/src/util/backtrace.cc:181
logalloc::allocating_section::on_alloc_failure(logalloc::region&) at ./utils/logalloc.cc:2955
decltype(auto) logalloc::allocating_section::with_reclaiming_disabled<partition_entry::apply_to_incomplete(schema const&, partition_entry&&, mutation_cleaner&, logalloc::allocating_section&, logalloc::region&, cache_tracker&, unsigned long, real_dirty_memory_accounter&, basic_preemption_source&)::$_0::operator()()::{lambda()#1}&>(logalloc::region&, partition_entry::apply_to_incomplete(schema const&, partition_entry&&, mutation_cleaner&, logalloc::allocating_section&, logalloc::region&, cache_tracker&, unsigned long, real_dirty_memory_accounter&, basic_preemption_source&)::$_0::operator()()::{lambda()#1}&) at ././utils/logalloc.hh:510
 (inlined by) logalloc::allocating_section::operator()<partition_entry::apply_to_incomplete(schema const&, partition_entry&&, mutation_cleaner&, logalloc::allocating_section&, logalloc::region&, cache_tracker&, unsigned long, real_dirty_memory_accounter&, basic_preemption_source&)::$_0::operator()()::{lambda()#1}>(logalloc::region&, partition_entry::apply_to_incomplete(schema const&, partition_entry&&, mutation_cleaner&, logalloc::allocating_section&, logalloc::region&, cache_tracker&, unsigned long, real_dirty_memory_accounter&, basic_preemption_source&)::$_0::operator()()::{lambda()#1}&&)::{lambda()#1}::operator()() const at ././utils/logalloc.hh:530
 (inlined by) decltype(auto) logalloc::allocating_section::with_reserve<logalloc::allocating_section::operator()<partition_entry::apply_to_incomplete(schema const&, partition_entry&&, mutation_cleaner&, logalloc::allocating_section&, logalloc::region&, cache_tracker&, unsigned long, real_dirty_memory_accounter&, basic_preemption_source&)::$_0::operator()()::{lambda()#1}>(logalloc::region&, partition_entry::apply_to_incomplete(schema const&, partition_entry&&, mutation_cleaner&, logalloc::allocating_section&, logalloc::region&, cache_tracker&, unsigned long, real_dirty_memory_accounter&, basic_preemption_source&)::$_0::operator()()::{lambda()#1}&&)::{lambda()#1}>(logalloc::region&, partition_entry::apply_to_incomplete(schema const&, partition_entry&&, mutation_cleaner&, logalloc::allocating_section&, logalloc::region&, cache_tracker&, unsigned long, real_dirty_memory_accounter&, basic_preemption_source&)::$_0::operator()()::{lambda()#1}&&) at ././utils/logalloc.hh:474
 (inlined by) decltype(auto) logalloc::allocating_section::operator()<partition_entry::apply_to_incomplete(schema const&, partition_entry&&, mutation_cleaner&, logalloc::allocating_section&, logalloc::region&, cache_tracker&, unsigned long, real_dirty_memory_accounter&, basic_preemption_source&)::$_0::operator()()::{lambda()#1}>(logalloc::region&, partition_entry::apply_to_incomplete(schema const&, partition_entry&&, mutation_cleaner&, logalloc::allocating_section&, logalloc::region&, cache_tracker&, unsigned long, real_dirty_memory_accounter&, basic_preemption_source&)::$_0::operator()()::{lambda()#1}&&) at ././utils/logalloc.hh:529
 (inlined by) partition_entry::apply_to_incomplete(schema const&, partition_entry&&, mutation_cleaner&, logalloc::allocating_section&, logalloc::region&, cache_tracker&, unsigned long, real_dirty_memory_accounter&, basic_preemption_source&)::$_0::operator()() at ./mutation/partition_version.cc:521
 (inlined by) seastar::noncopyable_function<seastar::bool_class<seastar::stop_iteration_tag> ()>::indirect_vtable_for<partition_entry::apply_to_incomplete(schema const&, partition_entry&&, mutation_cleaner&, logalloc::allocating_section&, logalloc::region&, cache_tracker&, unsigned long, real_dirty_memory_accounter&, basic_preemption_source&)::$_0>::call(seastar::noncopyable_function<seastar::bool_class<seastar::stop_iteration_tag> ()> const*) at ././seastar/include/seastar/util/noncopyable_function.hh:158
seastar::noncopyable_function<seastar::bool_class<seastar::stop_iteration_tag> ()>::operator()() const at ././seastar/include/seastar/util/noncopyable_function.hh:215
 (inlined by) utils::coroutine::run() at ././utils/coroutine.hh:39
 (inlined by) row_cache::do_update<row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0>(row_cache::external_updater, replica::memtable&, row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0, basic_preemption_source&)::{lambda()#1}::operator()() const::{lambda()#2}::operator()()::{lambda()#3}::operator()() const at ./row_cache.cc:1074
 (inlined by) decltype(auto) with_allocator<row_cache::do_update<row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0>(row_cache::external_updater, replica::memtable&, row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0, basic_preemption_source&)::{lambda()#1}::operator()() const::{lambda()#2}::operator()()::{lambda()#3}>(allocation_strategy&, row_cache::do_update<row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0>(row_cache::external_updater, replica::memtable&, row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0, basic_preemption_source&)::{lambda()#1}::operator()() const::{lambda()#2}::operator()()::{lambda()#3}&&) at ././utils/allocation_strategy.hh:318
 (inlined by) row_cache::do_update<row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0>(row_cache::external_updater, replica::memtable&, row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0, basic_preemption_source&)::{lambda()#1}::operator()() const::{lambda()#2}::operator()() at ./row_cache.cc:1053
 (inlined by) void std::__invoke_impl<void, row_cache::do_update<row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0>(row_cache::external_updater, replica::memtable&, row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0, basic_preemption_source&)::{lambda()#1}::operator()() const::{lambda()#2}>(std::__invoke_other, row_cache::do_update<row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0>(row_cache::external_updater, replica::memtable&, row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0, basic_preemption_source&)::{lambda()#1}::operator()() const::{lambda()#2}&&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/invoke.h:61
 (inlined by) std::__invoke_result<row_cache::do_update<row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0>(row_cache::external_updater, replica::memtable&, row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0, basic_preemption_source&)::{lambda()#1}::operator()() const::{lambda()#2}>::type std::__invoke<row_cache::do_update<row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0>(row_cache::external_updater, replica::memtable&, row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0, basic_preemption_source&)::{lambda()#1}::operator()() const::{lambda()#2}>(row_cache::do_update<row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0>(row_cache::external_updater, replica::memtable&, row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0, basic_preemption_source&)::{lambda()#1}::operator()() const::{lambda()#2}&&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/invoke.h:96
 (inlined by) decltype(auto) std::__apply_impl<row_cache::do_update<row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0>(row_cache::external_updater, replica::memtable&, row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0, basic_preemption_source&)::{lambda()#1}::operator()() const::{lambda()#2}, std::tuple<>>(row_cache::do_update<row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0>(row_cache::external_updater, replica::memtable&, row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0, basic_preemption_source&)::{lambda()#1}::operator()() const::{lambda()#2}&&, std::tuple<>&&, std::integer_sequence<unsigned long>) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/tuple:2288
 (inlined by) decltype(auto) std::apply<row_cache::do_update<row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0>(row_cache::external_updater, replica::memtable&, row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0, basic_preemption_source&)::{lambda()#1}::operator()() const::{lambda()#2}, std::tuple<> >(row_cache::do_update<row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0>(row_cache::external_updater, replica::memtable&, row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0, basic_preemption_source&)::{lambda()#1}::operator()() const::{lambda()#2}&&, std::tuple<>&&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/tuple:2299
 (inlined by) seastar::future<void> seastar::futurize<void>::apply<row_cache::do_update<row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0>(row_cache::external_updater, replica::memtable&, row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0, basic_preemption_source&)::{lambda()#1}::operator()() const::{lambda()#2}>(row_cache::do_update<row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0>(row_cache::external_updater, replica::memtable&, row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0, basic_preemption_source&)::{lambda()#1}::operator()() const::{lambda()#2}&&, std::tuple<>&&) at ././seastar/include/seastar/core/future.hh:2000
 (inlined by) seastar::async<row_cache::do_update<row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0>(row_cache::external_updater, replica::memtable&, row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0, basic_preemption_source&)::{lambda()#1}::operator()() const::{lambda()#2}>(seastar::thread_attributes, row_cache::do_update<row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0>(row_cache::external_updater, replica::memtable&, row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0, basic_preemption_source&)::{lambda()#1}::operator()() const::{lambda()#2}&&)::{lambda()#1}::operator()() const at ././seastar/include/seastar/core/thread.hh:260
 (inlined by) seastar::noncopyable_function<void ()>::direct_vtable_for<seastar::async<row_cache::do_update<row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0>(row_cache::external_updater, replica::memtable&, row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0, basic_preemption_source&)::{lambda()#1}::operator()() const::{lambda()#2}>(seastar::thread_attributes, row_cache::do_update<row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0>(row_cache::external_updater, replica::memtable&, row_cache::update(row_cache::external_updater, replica::memtable&, basic_preemption_source&)::$_0, basic_preemption_source&)::{lambda()#1}::operator()() const::{lambda()#2}&&)::{lambda()#1}>::call(seastar::noncopyable_function<void ()> const*) at ././seastar/include/seastar/util/noncopyable_function.hh:129
seastar::noncopyable_function<void ()>::operator()() const at ./build/release/seastar/./seastar/include/seastar/util/noncopyable_function.hh:215
 (inlined by) seastar::thread_context::main() at ./build/release/seastar/./seastar/src/core/thread.cc:311

Oct 15 01:17:48 osapiens-prod-L3-8 scylla[679765]:  [shard 5:mt2c] lsa - LSA allocation failure, increasing reserve in section 0x605006314620 to 2 segments; trace:
[Backtrace #0] Already seen, not resolving again.

Oct 15 01:17:49 osapiens-prod-L3-8 scylla[679765]:  [shard 1:mt2c] lsa - LSA allocation failure, increasing reserve in section 0x601006304620 to 2 segments; trace:
[Backtrace #0] Already seen, not resolving again.

Oct 15 01:17:49 osapiens-prod-L3-8 scylla[679765]:  [shard 1:mt2c] lsa - LSA allocation failure, increasing reserve in section 0x601006304620 to 4 segments; trace:
[Backtrace #0] Already seen, not resolving again.

Oct 15 01:17:49 osapiens-prod-L3-8 scylla[679765]:  [shard 1:mt2c] lsa - LSA allocation failure, increasing reserve in section 0x601006304620 to 8 segments; trace:
[Backtrace #0] Already seen, not resolving again.

Oct 15 01:17:49 osapiens-prod-L3-8 scylla[679765]:  [shard 1:mt2c] lsa - LSA allocation failure, increasing reserve in section 0x601006304620 to 16 segments; trace:
[Backtrace #0] Already seen, not resolving again.

Oct 15 01:17:49 osapiens-prod-L3-8 scylla[679765]:  [shard 1:mt2c] lsa - LSA allocation failure, increasing reserve in section 0x601006304620 to 32 segments; trace:
[Backtrace #0] Already seen, not resolving again.

Oct 15 01:17:49 osapiens-prod-L3-8 scylla[679765]:  [shard 1:mt2c] lsa - LSA allocation failure, increasing reserve in section 0x601006304620 to 64 segments; trace:
[Backtrace #0] Already seen, not resolving again.
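
The reserve sizes in the log lines above double on each failure (2, 4, 8, ... 64 segments). As a minimal sketch for pulling that progression out of a larger journal, the following assumes the lines keep exactly the format quoted above; the function name `reserve_growth` and the regular expression are illustrative and not part of Scylla.

```python
import re
from collections import defaultdict

# Matches the "lsa - LSA allocation failure" lines as they appear in the log above.
# Assumption: the line format is stable; adjust the pattern if your logs differ.
LSA_LINE = re.compile(
    r"\[shard\s+(?P<shard>\d+):(?P<group>\w+)\]\s+lsa - LSA allocation failure, "
    r"increasing reserve in section (?P<section>0x[0-9a-f]+) to (?P<segments>\d+) segments"
)


def reserve_growth(lines):
    """Return {(shard, section): [reserve sizes in log order]}."""
    growth = defaultdict(list)
    for line in lines:
        m = LSA_LINE.search(line)
        if m:
            key = (int(m["shard"]), m["section"])
            growth[key].append(int(m["segments"]))
    return dict(growth)


sample = [
    "Oct 15 01:17:49 osapiens-prod-L3-8 scylla[679765]:  [shard 1:mt2c] lsa - LSA allocation failure, increasing reserve in section 0x601006304620 to 2 segments; trace:",
    "[Backtrace #0] Already seen, not resolving again.",
    "Oct 15 01:17:49 osapiens-prod-L3-8 scylla[679765]:  [shard 1:mt2c] lsa - LSA allocation failure, increasing reserve in section 0x601006304620 to 4 segments; trace:",
    "Oct 15 01:17:49 osapiens-prod-L3-8 scylla[679765]:  [shard 1:mt2c] lsa - LSA allocation failure, increasing reserve in section 0x601006304620 to 8 segments; trace:",
]

for (shard, section), sizes in reserve_growth(sample).items():
    print(f"shard {shard} section {section}: {sizes}")
```

Grouping by shard and section makes it easier to see whether one allocating section keeps growing within the same second, as shard 1 does here.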

horschi commented 1 week ago

Twice we hit the issue where Scylla performed a large number of very small flushes, which triggered many small compactions. This made writes very slow.

Since #20991 does not seem to be the root cause, the flushes may somehow be memory related. Our logs show a lot of allocation failures.

mykaul commented 1 week ago

@ptrsmrn - the allocation above seems to be in CQL territory - can you take a look?

horschi commented 1 week ago

FYI: We often read and write blobs up to 3 MB in size.

The question is: can this lead to some kind of low-memory situation where Scylla 6.0 ends up flushing constantly?

ptrsmrn commented 1 week ago

CQL allocates "oversized allocation: 2768896 bytes", which roughly matches what @horschi reported: "FYI: We are often times reading/writing blobs up to 3MB in size."

@bhalevy are you aware of what can cause so many mini flushes?

Also, the second stack trace mentions row_cache and an LSA allocation failure. That is a hint, but I am not familiar with this area.
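
To make the size comparison concrete, here is a small sketch that extracts the byte count from a `seastar_memory` warning line and converts it to MiB. The helper name `oversized_allocation_mib` is hypothetical, and the pattern assumes the message format quoted at the top of this issue.

```python
import re

# Assumption: the warning text matches the line quoted in this issue.
OVERSIZED = re.compile(r"seastar_memory - oversized allocation: (\d+) bytes")


def oversized_allocation_mib(line):
    """Return the reported allocation size in MiB, or None if the line does not match."""
    m = OVERSIZED.search(line)
    if m is None:
        return None
    return int(m.group(1)) / (1024 * 1024)


line = (
    "Oct 15 15:37:22 o-prod-L3-5 scylla[598821]:  [shard 0:stmt] "
    "seastar_memory - oversized allocation: 2768896 bytes. This is non-fatal, "
    "but could lead to latency and/or fragmentation issues."
)
print(oversized_allocation_mib(line))  # 2.640625
```

About 2.64 MiB is in line with blobs of up to 3 MB being handled as one contiguous buffer, although the backtrace alone does not prove which value was being allocated.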

avikivity commented 1 week ago

What's the underlying counter for "Memtable switches"? I can't find it in scylla-monitoring.git.

Please upload a snapshot of your monitoring database and indicate a time period to look at; maybe we can find a clue there.

horschi commented 1 week ago

What's the underlying counter for "Memtable switches"? I can't find it in scylla-monitoring.git.

rate(scylla_column_family_memtable_switch[1m])

Please upload a snapshot of your monitoring database and indicate a time period to look at; maybe we can find a clue there.

I will make sure we provide that.

horschi commented 1 week ago

Close-up of the time around our restart:

On a per-CF basis:

(rate(scylla_column_family_memtable_switch[1m]))

Image

It shows a couple of hosts experiencing the flush storm; after the restart the numbers go down. The flushes affect all active tables.
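
For readers less familiar with PromQL, the following is a simplified sketch of what `rate(...[1m])` measures on a counter such as `scylla_column_family_memtable_switch`. It treats any drop in the counter as a reset (which is what a node restart produces) but, unlike Prometheus, does not extrapolate to the window edges. The function name `simple_rate` and the sample values are made up for illustration.

```python
def simple_rate(samples):
    """Approximate per-second increase of a counter over a window.

    samples: list of (timestamp_seconds, counter_value), oldest first.
    A decrease between two samples is treated as a counter reset, so the
    post-reset value is counted as the increase for that step.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        increase += (cur - prev) if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


# Hypothetical scrape values: the counter drops at t=45 because the node restarted.
window = [(0, 100), (15, 130), (30, 160), (45, 5), (60, 35)]
print(round(simple_rate(window), 2))  # 1.58
```

Because resets are absorbed this way, a restart by itself should not produce a negative or wildly inflated rate, so a drop in the graph after the restart reflects fewer memtable switches.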

On a per-host basis:

sum by (instance) (rate(scylla_column_family_memtable_switch[1m]))

Image

Edit: added second screenshot
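
If someone wants to pull the same per-host series outside of Grafana, a sketch along these lines builds a request URL for the Prometheus `query_range` HTTP endpoint. The base URL, the timestamps, and the helper name `query_range_url` are placeholders chosen for illustration; only the PromQL expression comes from this thread.

```python
from urllib.parse import urlencode

# Expression taken from the comment above.
PER_INSTANCE_SWITCHES = (
    "sum by (instance) (rate(scylla_column_family_memtable_switch[1m]))"
)


def query_range_url(base_url, expr, start, end, step="15s"):
    """Build a GET URL for Prometheus's /api/v1/query_range endpoint.

    start and end may be Unix timestamps or RFC 3339 strings.
    """
    params = urlencode({"query": expr, "start": start, "end": end, "step": step})
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


# Placeholder server address and time window; replace with your own.
url = query_range_url(
    "http://localhost:9090", PER_INSTANCE_SWITCHES, 1700000000, 1700003600
)
print(url)
```

The URL can then be fetched with any HTTP client; the response is JSON with one time series per `instance` label.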