linkedin / openhouse

Open Control Plane for Tables in Data Lakehouse
https://www.openhousedb.org/
BSD 2-Clause "Simplified" License
311 stars 52 forks

[BUG] Local docker failed to run spark-shell on Mac M1 #115

Open thinh2 opened 5 months ago

thinh2 commented 5 months ago

Willingness to contribute

Yes. I can contribute a fix for this bug independently.

OpenHouse version

v0.5.62

System information

Describe the problem

While running the spark-shell commands from SETUP.md, the shell consistently crashes with a fatal error from the Java Runtime Environment.

After investigating, I found that this is a known Docker issue on Apple Silicon MacBooks, caused by a bug in Rosetta (the x86/amd64 emulation layer on Apple Silicon).

More details about this issue can be found at https://github.com/docker/for-mac/issues/7006

While waiting for a fix from Apple, there are several workarounds. For me, downgrading Docker to [version 4.27.2](https://docs.docker.com/desktop/release-notes/#4272) worked. Other workarounds are mentioned in https://github.com/docker/for-mac/issues/7006#issuecomment-2122869966.
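As a quick sanity check, you can confirm the architecture mismatch that forces emulation in the first place; a minimal sketch, assuming a macOS Apple Silicon host with Docker Desktop (the container name is illustrative, not from this repo):

```shell
# On an Apple Silicon host this prints arm64, while the crash log below shows
# the container's JVM is "linux-amd64", i.e. running under x86_64 emulation
# (Rosetta or QEMU, depending on Docker Desktop settings).
host_arch="$(uname -m)"
echo "host architecture: ${host_arch}"

# Inside the Spark container (container name is an assumption; adjust to your setup):
# docker exec <spark-container> uname -m   # x86_64 when emulated
```

Disabling "Use Rosetta for x86_64/amd64 emulation" in Docker Desktop's settings (falling back to QEMU) is one of the workarounds discussed in the linked Docker issue.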

Stacktrace, metrics and logs

A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007ffffe0b8e1e, pid=692, tid=0x00007fffe86e6700
#
# JRE version: OpenJDK Runtime Environment (8.0_232-b09) (build 1.8.0_232-8u232-b09-1~deb9u1-b09)
# Java VM: OpenJDK 64-Bit Server VM (25.232-b09 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V  [libjvm.so+0x628e1e]
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /opt/spark/hs_err_pid692.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp

Code to reproduce bug

bin/spark-shell --packages org.apache.iceberg:iceberg-spark-runtime-3.1_2.12:1.2.0   \
  --jars openhouse-spark-runtime_2.12-*-all.jar  \
  --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions,com.linkedin.openhouse.spark.extensions.OpenhouseSparkSessionExtensions   \
  --conf spark.sql.catalog.openhouse=org.apache.iceberg.spark.SparkCatalog   \
  --conf spark.sql.catalog.openhouse.catalog-impl=com.linkedin.openhouse.spark.OpenHouseCatalog     \
  --conf spark.sql.catalog.openhouse.metrics-reporter-impl=com.linkedin.openhouse.javaclient.OpenHouseMetricsReporter    \
  --conf spark.sql.catalog.openhouse.uri=http://openhouse-tables:8080   \
  --conf spark.sql.catalog.openhouse.auth-token=$(cat /var/config/$(whoami).token) \
  --conf spark.sql.catalog.openhouse.cluster=LocalHadoopCluster

What component does this bug affect?

kmcclenn commented 5 months ago

I ran into the same issue; downgrading Docker as suggested also fixed it for me.

ctrezzo commented 4 months ago

There is now an internal build of Docker 4.32 that resolves this issue; it worked for me: https://github.com/docker/for-mac/issues/7006#issuecomment-2163112416

Hopefully there will be an official release soon!