Open xiongzegang opened 1 year ago
I tried downgrading the cluster to version 1.2, but 2 BE nodes still exit on their own. The be.out log:
start time: 2023年 01月 14日 星期六 11:07:27 CST
F0114 11:11:42.756687 793 column_impl.h:67] Size of selector: 1, is larger than size of column:0
Check failure stack trace:
@ 0x7f5a146e3d4d google::LogMessage::Fail()
@ 0x7f5a146e6289 google::LogMessage::SendToLog()
@ 0x7f5a146e38b6 google::LogMessage::Flush()
@ 0x7f5a146e68f9 google::LogMessageFatal::~LogMessageFatal()
@ 0x7f5a11301f8a doris::vectorized::ColumnVector<>::append_data_by_selector()
@ 0x7f5a118dce05 doris::vectorized::Block::append_block_by_selector()
@ 0x7f5a14081c9b doris::stream_load::VNodeChannel::add_block()
@ 0x7f5a14087bb0 doris::stream_load::VOlapTableSink::send()
@ 0x7f5a1089a2b9 doris::PlanFragmentExecutor::open_vectorized_internal()
@ 0x7f5a1089ba25 doris::PlanFragmentExecutor::open()
@ 0x7f5a10873dbc doris::FragmentExecState::execute()
@ 0x7f5a108771eb doris::FragmentMgr::_exec_actual()
@ 0x7f5a1087788a _ZNSt17_Function_handlerIFvvEZN5doris11FragmentMgr18exec_plan_fragmentERKNS1_23TExecPlanFragmentParamsESt8functionIFvPNS1_20PlanFragmentExecutorEEEEUlvE_E9_M_invokeERKSt9_Any_data
@ 0x7f5a10b22815 doris::ThreadPool::dispatch_thread()
@ 0x7f5a10b18bff doris::Thread::supervise_thread()
@ 0x7f5a0ac0ddc5 start_thread
@ 0x7f5a0af1973d clone
@ (nil) (unknown)
Query id: 388e20e98c4642aa-bd5f30fd9aabf906
Aborted at 1673665903 (unix time) try "date -d @1673665903" if you are using GNU date
Current BE git commitID: Unknown
SIGABRT unkown detail explain (@0x216) received by PID 534 (TID 0x7f59795ec700) from PID 534; stack trace:
0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t, void) at /root/doris/be/src/common/signal_handler.h:420
1# 0x00007F5A0AE57250 in /lib64/libc.so.6
2# raise in /lib64/libc.so.6
3# GI_abort in /lib64/libc.so.6
4# 0x00007F5A146EE739 in /opt/apache-doris-be-1.2.0-bin-x86_64/lib/doris_be
5# 0x00007F5A146E3D4D in /opt/apache-doris-be-1.2.0-bin-x86_64/lib/doris_be
6# google::LogMessage::SendToLog() in /opt/apache-doris-be-1.2.0-bin-x86_64/lib/doris_be
7# google::LogMessage::Flush() in /opt/apache-doris-be-1.2.0-bin-x86_64/lib/doris_be
8# google::LogMessageFatal::~LogMessageFatal() in /opt/apache-doris-be-1.2.0-bin-x86_64/lib/doris_be
9# doris::vectorized::ColumnVector
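What operation caused the BE to crash?
After investigating, the cause may be datetimev2 and datev2. The tables previously used datetime and date, and with datetime the cluster ran our workload normally. After we switched to datetimev2, this problem appeared. We then changed datetimev2 back to datetime, and the cluster has now run stably for 3 days. Data is loaded mainly through routine load and insert into xxx select * from xxx.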
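For readers trying to interpret the fatal line in these traces ("Size of selector: 1, is larger than size of column:0"), here is a minimal standalone sketch of the kind of size check that message implies. It is not Doris code: `selector_error` and `append_by_selector` are hypothetical names, and a plain `std::vector<int>` stands in for a column. It only assumes what the log text itself states, namely that a selector with more entries than the source column has rows is rejected.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not a Doris function): returns an error message in the
// same wording as the fatal log line when the selector has more entries than
// the source column has rows, or an empty string when the sizes are fine.
std::string selector_error(std::size_t selector_size, std::size_t column_size) {
    if (selector_size <= column_size) return "";
    std::ostringstream msg;
    msg << "Size of selector: " << selector_size
        << ", is larger than size of column:" << column_size;
    return msg.str();
}

// Hypothetical stand-in for a column: appends the rows of `src` picked by
// `selector` to `dst`. Returns false and leaves `dst` untouched when the
// size check fails or any row index is out of range.
bool append_by_selector(const std::vector<int>& src,
                        const std::vector<std::size_t>& selector,
                        std::vector<int>& dst) {
    if (!selector_error(selector.size(), src.size()).empty()) return false;
    for (std::size_t row : selector) {
        if (row >= src.size()) return false;  // validate before copying anything
    }
    for (std::size_t row : selector) {
        dst.push_back(src[row]);
    }
    return true;
}
```

In this sketch an empty source column with a one-entry selector is rejected, which corresponds to the "selector: 1 ... column:0" case in the log. In the BE the failed check aborts the process instead of returning an error, which is why the node exits.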
version: 1.2.4
insert into innertable select * from hive
F0712 15:56:50.480654 26192 column_impl.h:67] Size of selector: 1607, is larger than size of column:1258
*** Check failure stack trace: ***
@ 0x565436894eed google::LogMessage::Fail()
@ 0x565436897429 google::LogMessage::SendToLog()
@ 0x565436894a56 google::LogMessage::Flush()
@ 0x565436897a99 google::LogMessageFatal::~LogMessageFatal()
@ 0x565432497a98 doris::vectorized::ColumnNullable::append_data_by_selector()
@ 0x5654325365c5 doris::vectorized::Block::append_block_by_selector()
@ 0x56543621b538 doris::stream_load::VNodeChannel::add_block()
@ 0x5654362214d5 doris::stream_load::VOlapTableSink::send()
@ 0x56543146be89 doris::PlanFragmentExecutor::open_vectorized_internal()
@ 0x56543146cdda doris::PlanFragmentExecutor::open()
@ 0x565431444dae doris::FragmentExecState::execute()
@ 0x5654314481a5 doris::FragmentMgr::_exec_actual()
@ 0x5654314488ca _ZNSt17_Function_handlerIFvvEZN5doris11FragmentMgr18exec_plan_fragmentERKNS1_23TExecPlanFragmentParamsESt8functionIFvPNS1_20PlanFragmentExecutorEEEEUlvE_E9_M_invokeERKSt9_Any_data
@ 0x5654316feb15 doris::ThreadPool::dispatch_thread()
@ 0x5654316f424f doris::Thread::supervise_thread()
@ 0x7f2c08219ea5 start_thread
@ 0x7f2c0852c8dd __clone
@ (nil) (unknown)
*** Query id: 798ba6186f174202-a6cbf439b8193a86 ***
*** Aborted at 1689148586 (unix time) try "date -d @1689148586" if you are using GNU date ***
*** Current BE git commitID: Unknown ***
*** SIGSEGV address not mapped to object (@0x7f861d804000) received by PID 32341 (TID 0x7f7e9c91f700) from PID 494944256; stack trace: ***
0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/doris/be/src/common/signal_handler.h:420
1# os::Linux::chained_handler(int, siginfo_t*, void*) in /usr/lib/jvm/java-1.8.0/jre/lib/amd64/server/libjvm.so
2# JVM_handle_linux_signal in /usr/lib/jvm/java-1.8.0/jre/lib/amd64/server/libjvm.so
3# signalHandler(int, siginfo_t*, void*) in /usr/lib/jvm/java-1.8.0/jre/lib/amd64/server/libjvm.so
4# 0x00007F88992B7400 in /lib64/libc.so.6
5# __memmove_ssse3_back in /lib64/libc.so.6
6# doris::vectorized::ColumnString::insert_from(doris::vectorized::IColumn const&, unsigned long) at /root/doris/be/src/vec/columns/column_string.h:156
7# doris::vectorized::ColumnNullable::append_data_by_selector(COW<doris::vectorized::IColumn>::mutable_ptr<doris::vectorized::IColumn>&, doris::vectorized::PODArray<unsigned long, 4096ul, Allocator<false, false>, 15ul, 16ul> const&) const at /root/doris/be/src/vec/columns/column_nullable.h:197
8# doris::vectorized::Block::append_block_by_selector(std::vector<COW<doris::vectorized::IColumn>::mutable_ptr<doris::vectorized::IColumn>, std::allocator<COW<doris::vectorized::IColumn>::mutable_ptr<doris::vectorized::IColumn> > >&, doris::vectorized::PODArray<unsigned long, 4096ul, Allocator<false, false>, 15ul, 16ul> const&) const at /root/doris/be/src/vec/core/block.cpp:683
9# doris::stream_load::VNodeChannel::add_block(doris::vectorized::Block*, std::pair<std::unique_ptr<doris::vectorized::PODArray<unsigned long, 4096ul, Allocator<false, false>, 15ul, 16ul>, std::default_delete<doris::vectorized::PODArray<unsigned long, 4096ul, Allocator<false, false>, 15ul, 16ul> > >, std::vector<long, std::allocator<long> > > const&) at /root/doris/be/src/vec/sink/vtablet_sink.cpp:214
10# doris::stream_load::VOlapTableSink::send(doris::RuntimeState*, doris::vectorized::Block*) at /root/doris/be/src/vec/sink/vtablet_sink.cpp:607
11# doris::PlanFragmentExecutor::open_vectorized_internal() at /root/doris/be/src/runtime/plan_fragment_executor.cpp:322
12# doris::PlanFragmentExecutor::open() at /root/doris/be/src/runtime/plan_fragment_executor.cpp:261
13# doris::FragmentExecState::execute() at /root/doris/be/src/runtime/fragment_mgr.cpp:261
14# doris::FragmentMgr::_exec_actual(std::shared_ptr<doris::FragmentExecState>, std::function<void (doris::PlanFragmentExecutor*)>) at /root/doris/be/src/runtime/fragment_mgr.cpp:508
15# std::_Function_handler<void (), doris::FragmentMgr::exec_plan_fragment(doris::TExecPlanFragmentParams const&, std::function<void (doris::PlanFragmentExecutor*)>)::{lambda()#1}>::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/include/c++/11/bits/std_function.h:291
16# doris::ThreadPool::dispatch_thread() at /root/doris/be/src/util/threadpool.cpp:543
17# doris::Thread::supervise_thread(void*) at /root/doris/be/src/util/thread.cpp:455
18# start_thread in /lib64/libpthread.so.0
19# __clone in /lib64/libc.so.6
Program terminated with signal 11, Segmentation fault.
Missing separate debuginfos, use: debuginfo-install glibc-2.17-324.el7_9.x86_64 libgcc-4.8.5-44.el7.x86_64
(gdb) bt
at /data/doris-1.x/be/src/vec/columns/column_string.cpp:110
at /data/doris-1.x/be/src/vec/common/cow.h:208
at /var/local/ldb-toolchain/include/c++/11/bits/hashtable_policy.h:337
at /var/local/ldb-toolchain/include/c++/11/bits/shared_ptr_base.h:1290
at /var/local/ldb-toolchain/include/c++/11/bits/shared_ptr_base.h:1290
*** SIGABRT unkown detail explain (@0x3e800074b4d) received by PID 478029 (TID 0x7f3dade10700) from PID 478029; stack trace: ***
0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /data/doris-1.x/be/src/common/signal_handler.h:420
1# 0x00007F3ED11FA400 in /lib64/libc.so.6
2# __GI_raise in /lib64/libc.so.6
3# abort in /lib64/libc.so.6
4# 0x0000562E133E21F9 in /usr/local/service/doris/lib/be/doris_be
5# 0x0000562E133D780D in /usr/local/service/doris/lib/be/doris_be
6# google::LogMessage::SendToLog() in /usr/local/service/doris/lib/be/doris_be
7# google::LogMessage::Flush() in /usr/local/service/doris/lib/be/doris_be
8# google::LogMessageFatal::~LogMessageFatal() in /usr/local/service/doris/lib/be/doris_be
9# doris::vectorized::ColumnVector<int>::filter(doris::vectorized::PODArray<unsigned char, 4096ul, Allocator<false, false>, 15ul, 16ul> const&, long) const at /data/doris-1.x/be/src/vec/columns/column_vector.cpp:389
10# doris::vectorized::ColumnNullable::filter(doris::vectorized::PODArray<unsigned char, 4096ul, Allocator<false, false>, 15ul, 16ul> const&, long) const at /data/doris-1.x/be/src/vec/columns/column_nullable.cpp:309
11# doris::vectorized::Block::filter_block_internal(doris::vectorized::Block*, std::vector<unsigned int, std::allocator<unsigned int> > const&, doris::vectorized::PODArray<unsigned char, 4096ul, Allocator<false, false>, 15ul, 16ul> const&) at /data/doris-1.x/be/src/vec/core/block.cpp:654
12# doris::vectorized::Block::filter_block(doris::vectorized::Block*, std::vector<unsigned int, std::allocator<unsigned int> > const&, int, int) at /data/doris-1.x/be/src/vec/core/block.cpp:714
13# doris::vectorized::Block::filter_block(doris::vectorized::Block*, int, int) at /data/doris-1.x/be/src/vec/core/block.cpp:739
14# doris::vectorized::VExprContext::filter_block(doris::vectorized::VExprContext*, doris::vectorized::Block*, int) at /data/doris-1.x/be/src/vec/exprs/vexpr_context.cpp:127
15# doris::vectorized::VScanner::get_block(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /data/doris-1.x/be/src/vec/exec/scan/vscanner.cpp:68
I ran into a similar error stack as well, on 1.2.6.
Your message has been received. I will look at it as soon as possible.
Search before asking
Version
Version 1.2.1, installed from the downloaded official package.
What's Wrong?
In a cluster with 3 FEs and 5 BEs, all 5 BE nodes crashed. After restarting the 5 BE nodes, 2 of them consistently fail to start. The be.out log is as follows:
start time: 2023年 01月 09日 星期一 22:21:51 CST
F0113 22:17:07.946166 50147 column_vector.cpp:389] Size of filter doesn't match size of column. data size: 0, filter size: 1
@ 0x7f26967789ea doris::vectorized::ColumnVector<>::filter()
@ 0x7f2696700c6b doris::vectorized::ColumnNullable::filter()
@ 0x7f26967a7cbb doris::vectorized::Block::filter_block_internal()
@ 0x7f26967ac85c doris::vectorized::Block::filter_block()
@ 0x7f26967ac902 doris::vectorized::Block::filter_block()
@ 0x7f2697522e06 doris::vectorized::VExprContext::filter_block()
@ 0x7f269912f011 doris::vectorized::VScanner::get_block()
@ 0x7f269912c39f doris::vectorized::ScannerScheduler::_scanner_scan()
@ 0x7f269597f805 doris::ThreadPool::dispatch_thread()
@ 0x7f2695975bef doris::Thread::supervise_thread()
@ 0x7f268f900dc5 start_thread
@ 0x7f268fc0c73d __clone
@ (nil) (unknown)
Check failure stack trace:
@ 0x7f26995e8e0d google::LogMessage::Fail()
@ 0x7f26995eb349 google::LogMessage::SendToLog()
@ 0x7f26995e8976 google::LogMessage::Flush()
@ 0x7f26995eb9b9 google::LogMessageFatal::~LogMessageFatal()
@ 0x7f2696778a21 doris::vectorized::ColumnVector<>::filter()
@ 0x7f2696700c6b doris::vectorized::ColumnNullable::filter()
@ 0x7f26967a7cbb doris::vectorized::Block::filter_block_internal()
@ 0x7f26967ac85c doris::vectorized::Block::filter_block()
@ 0x7f26967ac902 doris::vectorized::Block::filter_block()
@ 0x7f2697522e06 doris::vectorized::VExprContext::filter_block()
@ 0x7f269912f011 doris::vectorized::VScanner::get_block()
@ 0x7f269912c39f doris::vectorized::ScannerScheduler::_scanner_scan()
@ 0x7f269597f805 doris::ThreadPool::dispatch_thread()
@ 0x7f2695975bef doris::Thread::supervise_thread()
@ 0x7f268f900dc5 start_thread
@ 0x7f268fc0c73d clone
@ (nil) (unknown)
Query id: 4159b2473e6a4e06-a77d142ed2aad4b2
Aborted at 1673619428 (unix time) try "date -d @1673619428" if you are using GNU date
Current BE git commitID: Unknown
SIGABRT unkown detail explain (@0xc376) received by PID 50038 (TID 0x7f2637df8700) from PID 50038; stack trace:
0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t, void) at /root/doris/be/src/common/signal_handler.h:420
1# 0x00007F268FB4A250 in /lib64/libc.so.6
2# raise in /lib64/libc.so.6
3# GI_abort in /lib64/libc.so.6
4# 0x00007F26995F37F9 in /opt/apache-doris-be-1.2.1-bin-x86_64/lib/doris_be
5# 0x00007F26995E8E0D in /opt/apache-doris-be-1.2.1-bin-x86_64/lib/doris_be
6# google::LogMessage::SendToLog() in /opt/apache-doris-be-1.2.1-bin-x86_64/lib/doris_be
7# google::LogMessage::Flush() in /opt/apache-doris-be-1.2.1-bin-x86_64/lib/doris_be
8# google::LogMessageFatal::~LogMessageFatal() in /opt/apache-doris-be-1.2.1-bin-x86_64/lib/doris_be
9# doris::vectorized::ColumnVector::filter(doris::vectorized::PODArray<unsigned char, 4096ul, Allocator<false, false>, 15ul, 16ul> const&, long) const at /root/doris/be/src/vec/columns/column_vector.cpp:389
10# doris::vectorized::ColumnNullable::filter(doris::vectorized::PODArray<unsigned char, 4096ul, Allocator<false, false>, 15ul, 16ul> const&, long) const at /root/doris/be/src/vec/columns/column_nullable.cpp:309
11# doris::vectorized::Block::filter_block_internal(doris::vectorized::Block, std::vector<unsigned int, std::allocator > const&, doris::vectorized::PODArray<unsigned char, 4096ul, Allocator<false, false>, 15ul, 16ul> const&) at /root/doris/be/src/vec/core/block.cpp:641
12# doris::vectorized::Block::filter_block(doris::vectorized::Block , std::vector<unsigned int, std::allocator > const&, int, int) at /root/doris/be/src/vec/core/block.cpp:712
13# doris::vectorized::Block::filter_block(doris::vectorized::Block, int, int) at /root/doris/be/src/vec/core/block.cpp:726
14# doris::vectorized::VExprContext::filter_block(doris::vectorized::VExprContext, doris::vectorized::Block, int) at /root/doris/be/src/vec/exprs/vexpr_context.cpp:122
15# doris::vectorized::VScanner::get_block(doris::RuntimeState, doris::vectorized::Block, bool) at /root/doris/be/src/vec/exec/scan/vscanner.cpp:64
16# doris::vectorized::ScannerScheduler::_scanner_scan(doris::vectorized::ScannerScheduler, doris::vectorized::ScannerContext, doris::vectorized::VScanner) at /root/doris/be/src/vec/exec/scan/scanner_scheduler.cpp:234
17# doris::ThreadPool::dispatch_thread() at /root/doris/be/src/util/threadpool.cpp:542
18# doris::Thread::supervise_thread(void) at /root/doris/be/src/util/thread.cpp:455
19# start_thread in /lib64/libpthread.so.0
20# clone in /lib64/libc.so.6
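To make the fatal message in this log ("Size of filter doesn't match size of column. data size: 0, filter size: 1") easier to read, here is a minimal standalone sketch of the invariant it describes. It is not Doris code: `filter_column` is a hypothetical name, and `std::vector` stands in for both the column and the filter. The only assumption is the one stated by the log text, that a filter must have exactly one entry per row of the column it is applied to.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a column filter (not a Doris function): copies
// data[i] into `kept` wherever filter[i] is non-zero. Returns false without
// touching `kept` when the filter and the column differ in length.
bool filter_column(const std::vector<int>& data,
                   const std::vector<unsigned char>& filter,
                   std::vector<int>& kept) {
    if (data.size() != filter.size()) return false;
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (filter[i] != 0) kept.push_back(data[i]);
    }
    return true;
}
```

With an empty column and a one-entry filter, the sketch returns false, which matches the "data size: 0, filter size: 1" case above. In the BE the same mismatch is a fatal check, so the process aborts with SIGABRT.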
What You Expected?
I hope the development team can locate the problem as soon as possible.
How to Reproduce?
No response
Anything Else?
No response
Are you willing to submit PR?
Code of Conduct