apache / doris

Apache Doris is an easy-to-use, high performance and unified analytics database.
https://doris.apache.org
Apache License 2.0

[Bug] be crash #30563

Open xingyingone opened 10 months ago

xingyingone commented 10 months ago


Version

doris-2.0.3-rc06-37d31a5

What's Wrong?

The BE (backend) process crashed with a SIGSEGV.

  1. be.out

     start time: Fri Dec 22 18:37:27 CST 2023
     INFO: java_cmd /home/olap/doris/java8/bin/java
     INFO: jdk_version 8
     SLF4J: Class path contains multiple SLF4J bindings.
     SLF4J: Found binding in [jar:file:/home/olap/doris/be/lib/java_extensions/preload-extensions/preload-extensions-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
     SLF4J: Found binding in [jar:file:/home/olap/doris/be/lib/java_extensions/java-udf/java-udf-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
     SLF4J: Found binding in [jar:file:/home/olap/doris/be/lib/hadoop_hdfs/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
     SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
     SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
     Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /home/olap/doris/be/lib/hadoop_hdfs/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
     It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
     start time: Fri Dec 22 19:24:57 CST 2023
     INFO: java_cmd /home/olap/doris/java8/bin/java
     INFO: jdk_version 8
     SLF4J: Class path contains multiple SLF4J bindings.
     SLF4J: Found binding in [jar:file:/home/olap/doris/be/lib/java_extensions/preload-extensions/preload-extensions-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
     SLF4J: Found binding in [jar:file:/home/olap/doris/be/lib/java_extensions/java-udf/java-udf-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
     SLF4J: Found binding in [jar:file:/home/olap/doris/be/lib/hadoop_hdfs/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
     SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
     SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
     Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /home/olap/doris/be/lib/hadoop_hdfs/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
     It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
     Query id: 30621429841a475a-836b6363f450c76c
     tablet id: 0
     Aborted at 1706596268 (unix time) try "date -d @1706596268" if you are using GNU date
     Current BE git commitID: 37d31a5
     SIGSEGV unknown detail explain (@0x0) received by PID 6514 (TID 6952 OR 0x7f24aab92700) from PID 0; stack trace:
     0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/src/doris-2.0/be/src/common/signal_handler.h:417
     1# os::Linux::chained_handler(int, siginfo*, void*) in /home/olap/doris/java8/jre/lib/amd64/server/libjvm.so
     2# JVM_handle_linux_signal in /home/olap/doris/java8/jre/lib/amd64/server/libjvm.so
     3# signalHandler(int, siginfo*, void*) in /home/olap/doris/java8/jre/lib/amd64/server/libjvm.so
     4# 0x00007F27D8BC3400 in /lib64/libc.so.6
     5# bvar::detail::AgentGroup<bvar::detail::AgentCombiner<long, long, bvar::detail::AddTo<long> >::Agent>::_destroy_tls_blocks() at /var/local/thirdparty/installed/include/bvar/detail/agent_group.h:166
     6# 0x000055ADFF4EA32E in /home/olap/doris/be/lib/doris_be
     7# __nptl_deallocate_tsd in /lib64/libpthread.so.0
     8# start_thread in /lib64/libpthread.so.0
     9# clone in /lib64/libc.so.6

  2. stack

     [Thread debugging using libthread_db enabled]
     Using host libthread_db library "/lib64/libthread_db.so.1".
     Missing separate debuginfo for /home/olap/doris/java8/jre/lib/amd64/server/libjvm.so
     Missing separate debuginfo for /home/olap/doris/java8/jre/lib/amd64/libverify.so
     Missing separate debuginfo for /home/olap/doris/java8/jre/lib/amd64/libmanagement.so
     Dwarf Error: wrong version in compilation unit header (is 5, should be 2, 3, or 4) [in module /home/olap/doris/be/lib/hadoop_hdfs/native/libhadoop.so.1.0.0]
     Core was generated by `/home/olap/doris/be/lib/doris_be'.
     Program terminated with signal 11, Segmentation fault.
     #0  0x000055adf731ae0a in bvar::detail::AgentGroup<bvar::detail::AgentCombiner<long, long, bvar::detail::AddTo<long> >::Agent>::_destroy_tls_blocks() ()
     Missing separate debuginfos, use: debuginfo-install glibc-2.17-317.el7.x86_64 zlib-1.2.7-18.el7.x86_64
     (gdb) bt
     #0  0x000055adf731ae0a in bvar::detail::AgentGroup<bvar::detail::AgentCombiner<long, long, bvar::detail::AddTo<long> >::Agent>::_destroy_tls_blocks() ()
     #1  0x000055adff4ea32e in ?? ()
     #2  0x00007f27d825cca2 in __nptl_deallocate_tsd () from /lib64/libpthread.so.0
     #3  0x00007f27d825ceb3 in start_thread () from /lib64/libpthread.so.0
     #4  0x00007f27d8c8b96d in clone () from /lib64/libc.so.6
     (gdb)

What You Expected?

The BE should not crash.

How to Reproduce?

The crash was triggered by the SQL below, but I cannot reproduce it.

    SELECT query_time AS time, path, (total_count / 60000) * 1000 AS qps
    FROM (
        SELECT (time DIV 60000) * 60000 AS query_time, path, count(*) total_count
        FROM envoy_log
        WHERE time >= 1706585467 * 1000
          AND time <= 1706596267 * 1000
          AND apiName = "sirius-auth-server"
        GROUP BY query_time, path
    ) temp
    ORDER BY time ASC;
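As a rough cross-check that this query lines up with the crash, the "Aborted at" timestamp from be.out can be compared with the time bounds in the WHERE clause. The sketch below is only an illustration: the constant and function names are made up here, and it assumes the "CST" in be.out means China Standard Time (UTC+8).

```python
from datetime import datetime, timedelta, timezone

# Values copied from the report above.
CRASH_EPOCH_S = 1706596268   # "Aborted at 1706596268 (unix time)" in be.out
QUERY_START_S = 1706585467   # lower time bound in the WHERE clause, in seconds
QUERY_END_S = 1706596267     # upper time bound in the WHERE clause, in seconds

# Assumption: "CST" in be.out is China Standard Time (UTC+8).
CST = timezone(timedelta(hours=8), "CST")


def to_cst(epoch_s: int) -> datetime:
    """Convert a Unix timestamp in seconds to a timezone-aware CST datetime."""
    return datetime.fromtimestamp(epoch_s, tz=CST)


crash_time = to_cst(CRASH_EPOCH_S)
window_s = QUERY_END_S - QUERY_START_S   # length of the queried time range
gap_s = CRASH_EPOCH_S - QUERY_END_S      # crash time relative to the range end

print("crash time:", crash_time.isoformat())
print("query window (s):", window_s)
print("crash minus window end (s):", gap_s)
```

The query covers a 3-hour window that ends one second before the crash timestamp, which is consistent with the query running at about the time the BE aborted. It does not prove the query caused the crash.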

Anything Else?

no

