apache / doris

Apache Doris is an easy-to-use, high performance and unified analytics database.
https://doris.apache.org
Apache License 2.0

[Bug] Query causes all BE nodes to restart automatically #32356

Open Power098 opened 6 months ago

Power098 commented 6 months ago

Search before asking

Version

2.1.0

What's Wrong?

After running a query, all 3 BE nodes restart at the same time.

What You Expected?

I hope this problem can be fixed.

How to Reproduce?

No response

Anything Else?

be.out log:

start time: 2024年 03月 18日 星期一 00:52:28 CST
INFO: java_cmd /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.332.b09-1.el7_9.x86_64/bin/java
INFO: jdk_version 8
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/software/doris_be/lib/java_extensions/preload-extensions/preload-extensions-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/software/doris_be/lib/java_extensions/java-udf/java-udf-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/software/doris_be/lib/hadoop_hdfs/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
Query id: b8a84d2c2b7a4616-9ac9556c71940974
tablet id: 0
Aborted at 1710694904 (unix time) try "date -d @1710694904" if you are using GNU date
Current BE git commitID: 91efb6a43d
SIGSEGV unknown detail explain (@0x0) received by PID 43100 (TID 43863 OR 0x7fd168a8a700) from PID 0; stack trace:
0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t, void) at /home/zcp/repo_center/doris_release/doris/be/src/common/signal_handler.h:417
1# os::Linux::chained_handler(int, siginfo_t, void) in /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.332.b09-1.el7_9.x86_64/jre/lib/amd64/server/libjvm.so
2# JVM_handle_linux_signal in /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.332.b09-1.el7_9.x86_64/jre/lib/amd64/server/libjvm.so
3# signalHandler(int, siginfo_t, void) in /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.332.b09-1.el7_9.x86_64/jre/lib/amd64/server/libjvm.so
4# 0x00007FD2199DC400 in /lib64/libc.so.6
5# doris::vectorized::VExprContext::execute(doris::vectorized::Block, int) at /home/zcp/repo_center/doris_release/doris/be/src/vec/exprs/vexpr_context.cpp:50
6# doris::pipeline::JoinProbeLocalState<doris::pipeline::HashJoinSharedState, doris::pipeline::HashJoinProbeLocalState>::_build_output_block(doris::vectorized::Block, doris::vectorized::Block, bool) at /home/zcp/repo_center/doris_release/doris/be/src/pipeline/exec/join_probe_operator.cpp:127
7# doris::pipeline::HashJoinProbeLocalState::filter_data_and_build_output(doris::RuntimeState, doris::vectorized::Block, bool, doris::vectorized::Block, bool) at /home/zcp/repo_center/doris_release/doris/be/src/pipeline/exec/hashjoin_probe_operator.cpp:433
8# doris::pipeline::HashJoinProbeOperatorX::pull(doris::RuntimeState, doris::vectorized::Block, bool) const at /home/zcp/repo_center/doris_release/doris/be/src/pipeline/exec/hashjoin_probe_operator.cpp:364
9# doris::pipeline::StatefulOperatorX::get_block(doris::RuntimeState, doris::vectorized::Block, bool) at /home/zcp/repo_center/doris_release/doris/be/src/pipeline/pipeline_x/operator.cpp:459
10# doris::pipeline::OperatorXBase::get_block_after_projects(doris::RuntimeState, doris::vectorized::Block, bool) at /home/zcp/repo_center/doris_release/doris/be/src/pipeline/pipeline_x/operator.cpp:210
11# doris::pipeline::StatefulOperatorX::get_block(doris::RuntimeState, doris::vectorized::Block, bool) at /home/zcp/repo_center/doris_release/doris/be/src/pipeline/pipeline_x/operator.cpp:444
12# doris::pipeline::OperatorXBase::get_block_after_projects(doris::RuntimeState, doris::vectorized::Block, bool) at /home/zcp/repo_center/doris_release/doris/be/src/pipeline/pipeline_x/operator.cpp:210
13# doris::pipeline::PipelineXTask::execute(bool) at /home/zcp/repo_center/doris_release/doris/be/src/pipeline/pipeline_x/pipeline_x_task.cpp:274
14# doris::pipeline::TaskScheduler::_do_work(unsigned long) at /home/zcp/repo_center/doris_release/doris/be/src/pipeline/task_scheduler.cpp:334
15# doris::ThreadPool::dispatch_thread() in /opt/software/doris_be/lib/doris_be
16# doris::Thread::supervise_thread(void*) at /home/zcp/repo_center/doris_release/doris/be/src/util/thread.cpp:499
17# start_thread in /lib64/libpthread.so.0
18# clone in /lib64/libc.so.6
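
The be.out log suggests running `date -d @1710694904` to decode the abort time. For anyone without GNU date, here is a minimal Python sketch that does the same conversion. It assumes the "CST" in the log means China Standard Time (UTC+8); the helper name `abort_time` is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "CST" in be.out is China Standard Time (UTC+8).
CST = timezone(timedelta(hours=8), "CST")


def abort_time(unix_seconds: int) -> datetime:
    """Convert the 'Aborted at <unix time>' value into a CST datetime."""
    return datetime.fromtimestamp(unix_seconds, tz=CST)


crash = abort_time(1710694904)
print(crash.isoformat())  # 2024-03-18T01:01:44+08:00
```

Under that assumption the SIGSEGV was raised at 01:01:44 on 18 March 2024, roughly nine minutes after the `start time` line printed at 00:52:28.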

be.gc log:

OpenJDK 64-Bit Server VM (25.332-b09) for linux-amd64 JRE (1.8.0_332-b09), built on May 10 2022 14:30:58 by "mockbuild" with gcc 4.8.5 20150623 (Red Hat 4.8.5-44)
Memory: 4k page, physical 32892408k(13228836k free), swap 0k(0k free)
CommandLine flags: -XX:-CriticalJNINatives -XX:InitialHeapSize=526278528 -XX:MaxHeapSize=1073741824 -XX:+PrintGC -XX:+PrintGCTimeStamps -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseParallelGC
1.155: [GC (Allocation Failure) 129024K->7583K(493056K), 0.0159945 secs]
1.299: [GC (Allocation Failure) 136607K->3266K(622080K), 0.0053399 secs]
1.692: [GC (Allocation Failure) 261314K->6783K(622080K), 0.0081907 secs]
1.839: [GC (Metadata GC Threshold) 26950K->8543K(671232K), 0.0105617 secs]
1.849: [Full GC (Metadata GC Threshold) 8543K->7870K(501248K), 0.0257138 secs]
2.359: [GC (Allocation Failure) 315070K->10995K(483840K), 0.0053214 secs]
2.638: [GC (Allocation Failure) 318195K->8134K(490496K), 0.0038530 secs]
2.886: [GC (Allocation Failure) 314822K->8230K(491008K), 0.0023390 secs]
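
To make the be.gc entries easier to scan, the following rough Python sketch pulls heap sizes and pause times out of `-XX:+PrintGC` style lines like the ones above. The regular expression only covers the line shape seen in this log, and the names `GC_LINE` and `parse_gc` are hypothetical, not part of Doris or the JDK.

```python
import re

# Matches entries such as:
#   1.155: [GC (Allocation Failure) 129024K->7583K(493056K), 0.0159945 secs]
# Only the format seen in this issue is handled (an assumption).
GC_LINE = re.compile(
    r"(?P<uptime>\d+\.\d+): \[(?P<kind>Full GC|GC) \((?P<cause>[^)]*)\) "
    r"(?P<before>\d+)K->(?P<after>\d+)K\((?P<total>\d+)K\), "
    r"(?P<pause>\d+\.\d+) secs\]"
)


def parse_gc(log: str) -> list:
    """Return one dict per GC event found in the given log text."""
    return [
        {
            "uptime_s": float(m["uptime"]),
            "kind": m["kind"],
            "cause": m["cause"],
            "before_kb": int(m["before"]),
            "after_kb": int(m["after"]),
            "total_kb": int(m["total"]),
            "pause_s": float(m["pause"]),
        }
        for m in GC_LINE.finditer(log)
    ]


sample = (
    "1.155: [GC (Allocation Failure) 129024K->7583K(493056K), 0.0159945 secs] "
    "1.849: [Full GC (Metadata GC Threshold) 8543K->7870K(501248K), 0.0257138 secs]"
)
events = parse_gc(sample)
print(max(e["pause_s"] for e in events))
```

In the entries pasted here, the heap drops to roughly 3 to 11 MB after every collection and each pause is under 30 ms. The frame just below the signal handlers in the be.out stack trace is the native `doris::vectorized::VExprContext::execute`.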

Are you willing to submit PR?

Code of Conduct

tukuw commented 1 month ago

please fix