apache / incubator-gluten

Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
https://gluten.apache.org/
Apache License 2.0

[VL] libgluten.so crash while building gluten velox #6088

Open Au-Miner opened 4 months ago

Au-Miner commented 4 months ago

Problem description

Backend Velox

Bug description: I built Gluten with the Velox backend following the instructions on the official website, but startup crashes in the frame 'C [libgluten.so+0x317353]'.

Reproduction: create the Docker container

docker pull ubuntu:22.04
docker run -itd --name ubuntu2204 ubuntu:22.04 /bin/bash
docker attach ubuntu2204

apt-get update
apt install software-properties-common
apt install maven build-essential cmake libssl-dev libre2-dev libcurl4-openssl-dev clang lldb lld libz-dev git ninja-build uuid-dev autoconf-archive curl zip unzip tar pkg-config bison libtool flex vim
apt install sudo
apt purge libjemalloc-dev libjemalloc2 librust-jemalloc-sys-dev
apt install -y openjdk-8-jdk
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH=$JAVA_HOME/bin:$PATH

<download spark/spark-3.2.0-bin-hadoop3.2.tgz>
<download gluten-velox-bundle-spark3.2_2.12-1.1.1.jar>

/spark/spark/bin/spark-shell --name run_gluten \
 --master local --deploy-mode client \
 --conf spark.plugins=io.glutenproject.GlutenPlugin \
 --conf spark.memory.offHeap.enabled=true \
 --conf spark.memory.offHeap.size=20g \
 --jars /spark/gluten-velox-bundle-spark3.2_2.12-1.1.1.jar \
 --conf spark.shuffle.manager=org.apache.spark.shuffle.sort.ColumnarShuffleManager

error message

root@8c55a15ec775:/spark/spark# /spark/spark/bin/spark-shell --name run_gluten \
 --master local --deploy-mode client \
 --conf spark.plugins=io.glutenproject.GlutenPlugin \
 --conf spark.memory.offHeap.enabled=true \
 --conf spark.memory.offHeap.size=20g \
 --jars /spark/gluten-velox-bundle-spark3.2_2.12-1.1.1.jar \
 --conf spark.shuffle.manager=org.apache.spark.shuffle.sort.ColumnarShuffleManager
24/06/14 01:39:04 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
OpenJDK 64-Bit Server VM warning: You have loaded library /tmp/gluten-7cca4e91-463a-4240-b9e4-36f67b147160/jni/11f04b97-1abd-4656-aeeb-0bedfe813394/gluten-984801035636929938/libvelox.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGILL (0x4) at pc=0x00007f2841964353, pid=42017, tid=0x00007f28da678640
#
# JRE version: OpenJDK Runtime Environment (8.0_412-b08) (build 1.8.0_412-8u412-ga-1~22.04.1-b08)
# Java VM: OpenJDK 64-Bit Server VM (25.412-b08 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [libgluten.so+0x317353]  gluten::Runtime::registerFactory(std::string const&, std::function<gluten::Runtime* (std::unordered_map<std::string, std::string, std::hash<std::string>, std::equal_to<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&)>)+0x23
#
# Core dump written. Default location: /spark/spark/core or core.42017
#
# An error report file with more information is saved as:
# /spark/spark/hs_err_pid42017.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
/spark/spark/bin/spark-shell: line 47: 42017 Aborted                 (core dumped) "${SPARK_HOME}"/bin/spark-submit --class org.apache.spark.repl.Main --name "Spark shell" "$@"

What's the problem?

System information

Server: CentOS 7; container: Ubuntu 22.04

CMake log

No response

weiting-chen commented 4 months ago

Since the source code has been transferred to Apache, the plugin class now lives in a different package. If you compile your jar from the latest source code, please use "--conf spark.plugins=org.apache.gluten.GlutenPlugin" instead of "--conf spark.plugins=io.glutenproject.GlutenPlugin".
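If it is unclear which package name a particular jar ships, one option is to look for GlutenPlugin.class in the jar listing instead of guessing. The following is only a minimal sketch: `gluten_plugin_class` is a hypothetical helper name (not part of Gluten), and it assumes `unzip` is installed, as in the apt package list above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Gluten): reads a jar listing on stdin,
# e.g. the output of `unzip -l`, and prints the fully qualified name of
# the first GlutenPlugin class it finds.
gluten_plugin_class() {
  grep -o '[A-Za-z0-9_/]*/GlutenPlugin\.class' \
    | head -n 1 \
    | sed -e 's/\.class$//' -e 's|/|.|g'
}

# Example usage, with the jar path taken from the report above:
#   JAR=/spark/gluten-velox-bundle-spark3.2_2.12-1.1.1.jar
#   PLUGIN=$(unzip -l "$JAR" | gluten_plugin_class)
#   echo "$PLUGIN"
```

The printed value can then be passed as `--conf spark.plugins="$PLUGIN"`, so the configured class always matches what the jar on `--jars` actually contains.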

ArnavBalyan commented 3 months ago

Hi @weiting-chen @PHILO-HE, this issue has been reported in multiple places: https://github.com/apache/incubator-gluten/issues/5327, https://github.com/apache/incubator-gluten/issues/6088. It occurs when using the released jar with spark.plugins=io.glutenproject.GlutenPlugin. Could you please take a look?

Au-Miner commented 3 months ago

thanks

my7ym commented 3 months ago

https://github.com/apache/incubator-gluten/issues/5327#issuecomment-2223383863