StarRocks / starrocks

The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.
https://starrocks.io
Apache License 2.0

[StarOS] BE crash when cluster under high pressure with ddl & dml mixed scenarios #18646

Status: Closed (tiannan-sr closed this issue 1 year ago)

tiannan-sr commented 1 year ago

Steps to reproduce the behavior (Required)

On duplicate, unique, aggregate, and primary key tables, run the following operations concurrently:

- insert
- delete
- stream load with delete mode
- stream load with upsert mode
- truncate
- alter table
- select

With these operations executing concurrently and the cluster under high pressure, the BE crashes.

BE core dump backtrace:

#1680 0x0000000009b1252b in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_append(char const*, unsigned long) ()
#1681 0x00000000077d730f in google::DumpStackTrace(int, void (*)(char const*, void*), void*) [clone .constprop.0] ()
#1682 0x00000000067ebcfc in __wrap___cxa_throw ()
#1683 0x0000000009a92af6 in operator new(unsigned long) [clone .cold] ()
#1684 0x0000000009b10f0a in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate(unsigned long, unsigned long, char const*, unsigned long) ()
#1685 0x0000000009b1252b in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_append(char const*, unsigned long) ()
#1686 0x00000000077d730f in google::DumpStackTrace(int, void (*)(char const*, void*), void*) [clone .constprop.0] ()
#1687 0x00000000067ebcfc in __wrap___cxa_throw ()
#1688 0x0000000009a92af6 in operator new(unsigned long) [clone .cold] ()
#1689 0x00000000044a5253 in fmt::v7::basic_memory_buffer<char, 500ul, std::allocator<char> >::grow(unsigned long) ()
#1690 0x0000000008b8e5cb in fmt::v7::basic_format_context<std::back_insert_iterator<fmt::v7::detail::buffer<char> >, char>::iterator fmt::v7::vformat_to<fmt::v7::detail::arg_formatter<std::back_insert_iterator<fmt::v7::detail::buffer<char> >, char>, char, fmt::v7::basic_format_context<std::back_insert_iterator<fmt::v7::detail::buffer<char> >, char> >(fmt::v7::detail::arg_formatter<std::back_insert_iterator<fmt::v7::detail::buffer<char> >, char>::iterator, fmt::v7::basic_string_view<char>, fmt::v7::basic_format_args<fmt::v7::basic_format_context<std::back_insert_iterator<fmt::v7::detail::buffer<char> >, char> >, fmt::v7::detail::locale_ref) ()
#1691 0x0000000008b83d1a in fmt::v7::detail::vformat[abi:cxx11](fmt::v7::basic_string_view<char>, fmt::v7::format_args) ()
#1692 0x00000000067ebdda in __wrap___cxa_throw ()
#1693 0x0000000009a92af6 in operator new(unsigned long) [clone .cold] ()
#1694 0x00000000077ca9d1 in google::LogMessage::Init(char const*, int, int, void (google::LogMessage::*)()) ()
#1695 0x00000000067ebe7b in __wrap___cxa_throw ()
#1696 0x0000000009a92af6 in operator new(unsigned long) [clone .cold] ()
#1697 0x00000000077ca9d1 in google::LogMessage::Init(char const*, int, int, void (google::LogMessage::*)()) ()
#1698 0x00000000067ebe7b in __wrap___cxa_throw ()
#1699 0x0000000009a92af6 in operator new(unsigned long) [clone .cold] ()
#1700 0x00000000077ca9d1 in google::LogMessage::Init(char const*, int, int, void (google::LogMessage::*)()) ()
#1701 0x00000000067ebe7b in __wrap___cxa_throw ()
#1702 0x0000000009a92af6 in operator new(unsigned long) [clone .cold] ()
#1703 0x00000000077ca9d1 in google::LogMessage::Init(char const*, int, int, void (google::LogMessage::*)()) ()
#1704 0x00000000067ebe7b in __wrap___cxa_throw ()
#1705 0x0000000009a92af6 in operator new(unsigned long) [clone .cold] ()
#1706 0x00000000077ca9d1 in google::LogMessage::Init(char const*, int, int, void (google::LogMessage::*)()) ()
#1707 0x00000000067ebe7b in __wrap___cxa_throw ()
#1708 0x0000000009a92af6 in operator new(unsigned long) [clone .cold] ()
#1709 0x00000000077ca9d1 in google::LogMessage::Init(char const*, int, int, void (google::LogMessage::*)()) ()
#1710 0x00000000067ebe7b in __wrap___cxa_throw ()
#1711 0x0000000009a92af6 in operator new(unsigned long) [clone .cold] ()
#1712 0x00000000077ca9d1 in google::LogMessage::Init(char const*, int, int, void (google::LogMessage::*)()) ()
#1713 0x00000000067ebe7b in __wrap___cxa_throw ()
#1714 0x0000000009a92af6 in operator new(unsigned long) [clone .cold] ()
#1715 0x00000000077ca9d1 in google::LogMessage::Init(char const*, int, int, void (google::LogMessage::*)()) ()
#1716 0x00000000067ebe7b in __wrap___cxa_throw ()
#1717 0x0000000009a92af6 in operator new(unsigned long) [clone .cold] ()
#1718 0x00000000045a4165 in auto starrocks::type_dispatch_column<starrocks::ColumnBuilder, starrocks::TypeDescriptor, unsigned long>(starrocks::LogicalType, starrocks::ColumnBuilder, starrocks::TypeDescriptor, unsigned long) ()
#1719 0x000000000459584f in starrocks::ColumnHelper::create_column(starrocks::TypeDescriptor const&, bool, bool, unsigned long, bool) ()
#1720 0x000000000655fed1 in starrocks::serde::ProtobufChunkDeserializer::deserialize(std::basic_string_view<char, std::char_traits<char> >, long*) ()
#1721 0x000000000660efda in starrocks::LoadChannel::_deserialize_chunk(starrocks::ChunkPB const&, starrocks::Chunk&, starrocks::faststring*) ()
#1722 0x000000000660f743 in starrocks::LoadChannel::add_chunk(starrocks::PTabletWriterAddChunkRequest const&, starrocks::PTabletWriterAddBatchResult*) ()
#1723 0x00000000066086c2 in starrocks::LoadChannelMgr::add_chunk(starrocks::PTabletWriterAddChunkRequest const&, starrocks::PTabletWriterAddBatchResult*) ()
#1724 0x0000000006712255 in starrocks::BackendInternalServiceImpl<doris::PBackendService>::tablet_writer_add_chunk(google::protobuf::RpcController*, starrocks::PTabletWriterAddChunkRequest const*, starrocks::PTabletWriterAddBatchResult*, google::protobuf::Closure*) ()
#1725 0x0000000007af5d9d in brpc::policy::ProcessRpcRequest(brpc::InputMessageBase*) ()
#1726 0x00000000079e9147 in brpc::ProcessInputMessage(void*) ()
#1727 0x00000000079ea01b in brpc::InputMessenger::OnNewMessages(brpc::Socket*) ()
#1728 0x0000000007994cfe in brpc::Socket::ProcessEvent(void*) ()
#1729 0x000000000795fc1f in bthread::TaskGroup::task_runner(long) ()
#1730 0x00000000079527c1 in bthread_make_fcontext ()
Backtrace stopped: Cannot access memory at address 0x7fc1d80fa000
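
One possible reading of the frames above, offered as a hedged interpretation and not as a confirmed root cause: frames #1694 through #1717 repeat a three-frame cycle in which operator new fails, __wrap___cxa_throw runs, and google::LogMessage::Init calls operator new again. That pattern is consistent with a throw wrapper that allocates memory while reporting an allocation failure, so each failure re-enters the wrapper until the stack runs out. The sketch below models that cycle and a thread-unaware reentrancy guard that would bound it. Every name in it (HookModel, failing_alloc, on_throw, simulate) is hypothetical; it does not use StarRocks, glog, or fmt code, and it is not the change made in the linked fix.

```cpp
#include <algorithm>

// Hypothetical model of the cycle seen in the backtrace. None of these
// names come from StarRocks, glog, or fmt.
struct HookModel {
    bool guarded;          // whether the reentrancy guard is enabled
    int depth_limit;       // simulation cutoff standing in for a finite stack
    int depth = 0;         // current nesting of the throw hook
    int max_depth = 0;     // deepest nesting reached
    int log_attempts = 0;  // times the hook tried to build a log message

    HookModel(bool g, int limit) : guarded(g), depth_limit(limit) {}

    // Models an allocation that fails under memory pressure. The failure
    // is reported by throwing, which enters the throw hook.
    void failing_alloc() { on_throw(); }

    // Models a hook that runs on every throw and tries to log a message.
    void on_throw() {
        ++depth;
        max_depth = std::max(max_depth, depth);
        const bool nested = depth > 1;
        const bool out_of_stack = depth >= depth_limit;
        if (!(guarded && nested) && !out_of_stack) {
            ++log_attempts;
            failing_alloc();  // building the log message needs memory too
        }
        --depth;
    }
};

// Runs one failed allocation through the model and returns the final state.
HookModel simulate(bool guarded, int depth_limit) {
    HookModel m(guarded, depth_limit);
    m.failing_alloc();
    return m;
}
```

Calling simulate(false, 1000) nests until the artificial cutoff is hit, while simulate(true, 1000) stops at the first nested entry. A real guard would need to be per-thread (for example a thread_local flag) and would still have to avoid allocating on the guarded path.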

Expected behavior (Required)

Real behavior (Required)

StarRocks version (Required)

tracymacding commented 1 year ago

Fixed in https://github.com/StarRocks/starrocks/pull/19089

tiannan-sr commented 1 year ago

fixed