apache / doris

Apache Doris is an easy-to-use, high performance and unified analytics database.
https://doris.apache.org
Apache License 2.0

/mnt/disk2/ygl/code/github/apache-doris/be/src/exec/partitioned_aggregation_node.cc:88 #17866

Open typuc opened 1 year ago

typuc commented 1 year ago

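Problem description:

The Doris BE process exited on its own while running; see the be.out log below.

Steps to reproduce:

None known; this is the first time it has occurred.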

Doris version:

1.1.5

Doris cluster information:

3 FE, 3 BE; FE configured with 24 GB of memory; 64 GB system memory, 48 CPU cores, SSD disks

Error information:

System messages (no OOM)

Mar 16 16:13:54 bj-ucloud-0-56 abrt-hook-ccpp: Process 223133 (doris_be) of user 1000 killed by SIGSEGV - dumping core
Mar 16 16:14:34 bj-ucloud-0-56 abrt-hook-ccpp: Failed to create core_backtrace: waitpid failed: No child processes
Mar 16 16:14:34 bj-ucloud-0-56 abrt-server: Executable '/data/service/doris/common-doris-be/lib/doris_be' doesn't belong to any package and ProcessUnpackaged is set to 'no'
Mar 16 16:14:34 bj-ucloud-0-56 abrt-server: 'post-create' on '/var/spool/abrt/ccpp-2023-03-16-16:13:54-223133' exited with 1
Mar 16 16:14:34 bj-ucloud-0-56 abrt-server: Deleting problem directory '/var/spool/abrt/ccpp-2023-03-16-16:13:54-223133'
Mar 16 16:14:35 bj-ucloud-0-56 systemd-logind: Removed session 209774.

be.out log

*** Query id: 0-0 ***
*** Aborted at 1678954433 (unix time) try "date -d @1678954433" if you are using GNU date ***
*** SIGSEGV address not mapped to object (@0x0) received by PID 223133 (TID 0x7f66f3840700) from PID 0; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /mnt/disk2/ygl/code/github/apache-doris/be/src/common/signal_handler.h:428
 1# 0x00007F680906C400 in /lib64/libc.so.6
 2# doris::PartitionedAggregationNode::Partition::Close(bool) [clone .part.0] [clone .constprop.0] at /mnt/disk2/ygl/code/github/apache-doris/be/src/exec/partitioned_aggregation_node.cc:881
 3# doris::PartitionedAggregationNode::ClosePartitions() at /mnt/disk2/ygl/code/github/apache-doris/be/src/exec/partitioned_aggregation_node.cc:1403
 4# doris::PartitionedAggregationNode::close(doris::RuntimeState*) at /mnt/disk2/ygl/code/github/apache-doris/be/src/exec/partitioned_aggregation_node.cc:694
 5# doris::PlanFragmentExecutor::close() at /mnt/disk2/ygl/code/github/apache-doris/be/src/runtime/plan_fragment_executor.cpp:677
 6# doris::FragmentExecState::execute() at /mnt/disk2/ygl/code/github/apache-doris/be/src/runtime/fragment_mgr.cpp:252
 7# doris::FragmentMgr::_exec_actual(std::shared_ptr<doris::FragmentExecState>, std::function<void (doris::PlanFragmentExecutor*)>) at /mnt/disk2/ygl/code/github/apache-doris/be/src/runtime/fragment_mgr.cpp:487
 8# std::_Function_handler<void (), std::_Bind_result<void, void (doris::FragmentMgr::*(doris::FragmentMgr*, std::shared_ptr<doris::FragmentExecState>, std::function<void (doris::PlanFragmentExecutor*)>))(std::shared_ptr<doris::FragmentExecState>, std::function<void (doris::PlanFragmentExecutor*)>)> >::_M_invoke(std::_Any_data const&) at /mnt/disk2/ygl/installs/ldbtools/include/c++/11/bits/std_function.h:291
 9# doris::ThreadPool::dispatch_thread() at /mnt/disk2/ygl/code/github/apache-doris/be/src/util/threadpool.cpp:578
10# doris::Thread::supervise_thread(void*) at /mnt/disk2/ygl/code/github/apache-doris/be/src/util/thread.cpp:407
11# start_thread in /lib64/libpthread.so.0
12# __clone in /lib64/libc.so.6
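
To line up the "Aborted at" unix time in be.out with the abrt entries in the system log, the timestamp can be converted to wall-clock time. The following is a minimal Python sketch, not part of the original report. The UTC+8 offset is an assumption (the report does not state the host time zone), and `describe` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

# Unix time copied from the "Aborted at" line in be.out.
ABORT_UNIX_TIME = 1678954433

# Assumption: the host clock runs at UTC+8. The report does not state the
# time zone; change this offset if the host is configured differently.
ASSUMED_HOST_TZ = timezone(timedelta(hours=8))


def describe(ts: int) -> tuple[str, str]:
    """Return the timestamp formatted in UTC and in the assumed host time zone."""
    utc = datetime.fromtimestamp(ts, tz=timezone.utc)
    local = utc.astimezone(ASSUMED_HOST_TZ)
    fmt = "%Y-%m-%d %H:%M:%S"
    return utc.strftime(fmt), local.strftime(fmt)


utc_str, local_str = describe(ABORT_UNIX_TIME)
print("UTC:          ", utc_str)
print("assumed UTC+8:", local_str)
```

Under that assumption the abort falls at 16:13:53, one second before the `abrt-hook-ccpp` SIGSEGV entry logged at 16:13:54, which suggests both logs describe the same crash.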

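Solution (replies from community engineers or other users)


Once the problem has been identified and resolved, the solution will be posted under your thread. It will also be archived in the Doris knowledge base so that new users who hit the same problem can find the fix quickly.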
miguelmin20 commented 7 months ago

Is there any follow-up on this? We ran into it on version 2.05 as well.