apache / doris

Apache Doris is an easy-to-use, high performance and unified analytics database.
https://doris.apache.org
Apache License 2.0

[Coredump] data sink node close causes coredump #5664

Open xinghuayu007 opened 3 years ago

xinghuayu007 commented 3 years ago

Describe the bug

Branch: Doris-0.13

Core dump backtrace:

0 free (this=, ptr=) at /builds/olap/doris/be/src/runtime/free_pool.hpp:94

1 doris_udf::FunctionContext::free (this=this@entry=0x64df63328, buffer=) at /builds/olap/doris/be/src/udf/udf.cpp:308

2 0x0000000000e06595 in doris::CaseExpr::close (this=0x19a01a6a80, state=0x1d5489f500, ctx=0x1beb4b41e0, scope=doris_udf::FunctionContext::FRAGMENT_LOCAL)

at /builds/olap/doris/be/src/exprs/case_expr.cpp:74

3 0x0000000000d76583 in doris::ExprContext::close (this=0x1beb4b41e0, state=state@entry=0x1d5489f500) at /builds/olap/doris/be/src/exprs/expr_context.cpp:89

4 0x0000000000d6dfd0 in doris::Expr::close (ctxs=…, state=state@entry=0x1d5489f500) at /builds/olap/doris/be/src/exprs/expr.cpp:578

5 0x00000000016c97bc in doris::DataStreamSender::close (this=0x1b2d9e5380, state=0x1d5489f500, exec_status=…) at /builds/olap/doris/be/src/runtime/data_stream_sender.cpp:673

6 0x000000000109906e in doris::PlanFragmentExecutor::close (this=this@entry=0xe72828670) at /builds/olap/doris/be/src/runtime/plan_fragment_executor.cpp:562

7 0x000000000101b7e1 in doris::FragmentExecState::execute (this=0xe72828600) at /builds/olap/doris/be/src/runtime/fragment_mgr.cpp:222

8 0x000000000101df36 in doris::FragmentMgr::exec_actual(std::shared_ptr, std::function<void (doris::PlanFragmentExecutor)>) (this=0x521c600, exec_state=…, cb=…)

at /builds/olap/doris/be/src/runtime/fragment_mgr.cpp:430

9 0x000000000102362c in __invoke_impl<void, void (doris::FragmentMgr::&)(std::shared_ptr, std::function<void(doris::PlanFragmentExecutor)>), doris::FragmentMgr&, std::shared_ptr&, std::function<void(doris::PlanFragmentExecutor)>&> (t=@0x1b5f0d4630: 0x521c600, f=

@0x1b5f0d45f0: (void (doris::FragmentMgr::)(doris::FragmentMgr * const, std::shared_ptr<doris::FragmentExecState>, std::function<void(doris::PlanFragmentExecutor)>)) 0x101df10 <doris::FragmentMgr::exec_actual(std::shared_ptr<doris::FragmentExecState>, std::function<void (doris::PlanFragmentExecutor)>)>) at /usr/include/c++/7.3.0/bits/invoke.h:73

10 invoke<void (doris::FragmentMgr::&)(std::shared_ptr, std::function<void(doris::PlanFragmentExecutor)>), doris::FragmentMgr&, std::shared_ptr&, std::function<void(doris::PlanFragmentExecutor)>&> (fn=

@0x1b5f0d45f0: (void (doris::FragmentMgr::*)(doris::FragmentMgr * const, std::shared_ptr<doris::FragmentExecState>, std::function<void(doris::PlanFragmentExecutor)>)) 0x101df10 <doris::FragmentMgr::exec_actual(std::shared_ptr<doris::FragmentExecState>, std::function<void (doris::PlanFragmentExecutor)>)>) at /usr/include/c++/7.3.0/bits/invoke.h:95

11 call<void, 0, 1, 2> (args=…, this=0x1b5f0d45f0) at /usr/include/c++/7.3.0/functional:632

12 operator()<> (this=0x1b5f0d45f0) at /usr/include/c++/7.3.0/functional:718

13 std::_Function_handler<void (), std::_Bind_result<void, void (doris::FragmentMgr::(doris::FragmentMgr, std::shared_ptr, std::function<void (doris::PlanFragmentExecutor)>))(std::shared_ptr, std::function<void (doris::PlanFragmentExecutor)>)> >::_M_invoke(std::_Any_data const&) (__functor=…) at /usr/include/c++/7.3.0/bits/std_function.h:316

14 0x00000000011aa902 in operator() (this=0x1c7912e698) at /usr/include/c++/7.3.0/bits/std_function.h:706

15 run (this=0x1c7912e690) at /builds/olap/doris/be/src/util/threadpool.cpp:42

16 doris::ThreadPool::dispatch_thread (this=0x85eb6b40) at /builds/olap/doris/be/src/util/threadpool.cpp:551

17 0x00000000011a2768 in operator() (this=0x19d4d968) at /usr/include/c++/7.3.0/bits/std_function.h:706

18 doris::Thread::supervise_thread (arg=0x19d4d950) at /builds/olap/doris/be/src/util/thread.cpp:385

19 0x00007f7f3ee8fdc5 in start_thread () from /lib64/libpthread.so.0

20 0x00007f7f3f19b73d in clone () from /lib64/libc.so.6

In stack frame 0, the content is:

(gdb) p node
$74 = {{next = 0x0, list = 0x0}}
(gdb) p list
$75 = (doris::FreePool::FreeListNode ) 0x0
(gdb)

That means list is null, so accessing list->next dereferences a null pointer.

I do not know how this happened.
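
One way a null list pointer can appear in a free-list pool of this shape is a double free: while a block is handed out, its header points at the owning list, but once the block is freed the same slot is reused as the next link, which is null when the block is the last one on the list. Whether that is what happened in this core dump is not confirmed. The following is only a minimal sketch of the mechanism, not the Doris FreePool source; TinyPool, release and blocks_ are hypothetical names, and the pool is simplified to a single size class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Header placed in front of every payload. The union mirrors the two names
// gdb printed for the same slot: {{next = 0x0, list = 0x0}}.
struct FreeListNode {
    union {
        FreeListNode* next;  // meaningful while the block sits on the free list
        FreeListNode* list;  // meaningful while the block is handed out
    };
};

// Hypothetical, simplified pool with a single free list (assumption: every
// allocation in this sketch uses the same size).
class TinyPool {
public:
    TinyPool() { head_.next = nullptr; }
    ~TinyPool() {
        for (void* block : blocks_) std::free(block);
    }
    TinyPool(const TinyPool&) = delete;
    TinyPool& operator=(const TinyPool&) = delete;

    // Returns a payload whose header points back at the list head.
    uint8_t* allocate(std::size_t size) {
        FreeListNode* node = head_.next;
        if (node != nullptr) {
            head_.next = node->next;  // reuse a previously released block
        } else {
            void* raw = std::malloc(sizeof(FreeListNode) + size);
            if (raw == nullptr) return nullptr;
            blocks_.push_back(raw);
            node = static_cast<FreeListNode*>(raw);
        }
        node->list = &head_;
        return reinterpret_cast<uint8_t*>(node) + sizeof(FreeListNode);
    }

    // Puts the block back on the free list. Returns false when the header's
    // list pointer is null, which is the state where unguarded code would
    // crash on list->next.
    bool release(uint8_t* ptr) {
        if (ptr == nullptr) return true;
        FreeListNode* node =
            reinterpret_cast<FreeListNode*>(ptr - sizeof(FreeListNode));
        FreeListNode* list = node->list;
        if (list == nullptr) return false;
        node->next = list->next;  // overwrites the slot that held `list`
        list->next = node;
        return true;
    }

private:
    FreeListNode head_;
    std::vector<void*> blocks_;  // raw allocations, released in the destructor
};
```

In this sketch, releasing the same pointer twice makes the second call read the null next link as the list pointer, so release reports failure instead of crashing. A header overwritten by an out-of-bounds write, or a pointer that never came from the pool, could produce the same null value.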

levy5307 commented 2 years ago

We encountered the same problem in our online cluster today :(