Open ltregan opened 1 year ago
Hi @ltregan, thanks for opening an issue. I'm looking into this today.
@ltregan I'm unable to reproduce. Can you check if you are still running into issues?
Still the same issue, even after clearing the cache. The exact sequence is:
$ docker system prune -a -f
$ git clone sparkmagic-dev
$ cd sparkmagic-dev
$ docker compose up
I am on a Mac M1. Something else that looks fishy: CPU usage starts at 20% (visible in the screenshots at the bottom), then climbs to 40% after a couple of minutes and stays there.
Full log then screenshots below.
sh-5.1# ../bin/pyspark
Python 3.7.11 (default, Jul 27 2021, 14:32:16)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
23/03/15 18:45:08 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Exception in thread "Thread-4" java.lang.ExceptionInInitializerError
at org.apache.hadoop.hive.conf.HiveConf.<clinit>(
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(
at py4j.reflection.CurrentThreadClassLoadingStrategy.classForName(
at py4j.reflection.ReflectionUtil.classForName(
at py4j.reflection.TypeUtil.forName(
at py4j.commands.ReflectionCommand.getUnknownMember(
at py4j.commands.ReflectionCommand.execute(
Caused by: java.lang.IllegalArgumentException: Unrecognized Hadoop major version number: 3.1.0
at org.apache.hadoop.hive.shims.ShimLoader.getMajorVersion(
at org.apache.hadoop.hive.shims.ShimLoader.loadShims(
at org.apache.hadoop.hive.shims.ShimLoader.getHadoopShims(
at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(
... 10 more
ERROR:root:Exception while sending command.
Traceback (most recent call last):
File "/opt/spark/python/lib/", line 1159, in send_command
raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/spark/python/lib/", line 985, in send_command
response = connection.send_command(command)
File "/opt/spark/python/lib/", line 1164, in send_command
"Error while receiving", e, proto.ERROR_ON_RECEIVE)
py4j.protocol.Py4JNetworkError: Error while receiving
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/__ / .__/\_,_/_/ /_/\_\ version 2.4.7
Using Python version 3.7.11 (default, Jul 27 2021 14:32:16)
SparkSession available as 'spark'.
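The "Caused by" line in the trace above suggests that the Hive shims bundled in the image do not recognize the Hadoop version they are running against (3.1.0 here). For triage across several captured logs, a minimal sketch for pulling that version out of the text could look like the following. The helper name `rejected_hadoop_version` is hypothetical and not part of sparkmagic or Spark; it only matches the exact message shown in this log.

```python
import re

# Hypothetical helper, not part of sparkmagic or Spark: it scans captured
# pyspark output for the Hive ShimLoader failure seen in this issue and
# returns the Hadoop version string that was rejected.
SHIM_ERROR = re.compile(
    r"Unrecognized Hadoop major version number: (\d+(?:\.\d+)*)"
)


def rejected_hadoop_version(log_text):
    """Return the Hadoop version from the ShimLoader error, or None if absent."""
    match = SHIM_ERROR.search(log_text)
    return match.group(1) if match else None


# The sample line is copied from the log in this issue.
sample = (
    "Caused by: java.lang.IllegalArgumentException: "
    "Unrecognized Hadoop major version number: 3.1.0\n"
)
print(rejected_hadoop_version(sample))
```

If the function returns None for a log, the failure is likely something other than the version mismatch reported here.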
@ltregan Thanks for the screenshots. I'm able to reproduce
Describe the bug: I believe datamechanics pushed a new version of the image (5 days ago?), and the sparkmagic Docker image no longer works. If you log in to the spark-1 container and run ../bin/pyspark, you get this error:
To Reproduce:
$ git clone sparkmagic-dev
$ cd sparkmagic-dev
$ docker compose up
Then create a new PySpark notebook. Even a simple command does not work, e.g. %data = [(1, 'John', 'Doe')]
Expected behavior: The PySpark kernel should work.
Additional context: I believe there was a new push of the image by datamechanics (5 days ago?).