awslabs / aws-glue-libs

AWS Glue Libraries are additions and enhancements to Spark for ETL operations.

spark-shell fails in the latest glue_libs_4.0.0 image #209

Open olexanderos opened 4 months ago

olexanderos commented 4 months ago

When I run docker run -it amazon/aws-glue-libs:glue_libs_4.0.0_image_01 spark-shell, the Spark context starts but the shell then crashes with this exception:

Generating self-signed SSL/TLS certificate at /home/glue_user/.certs/container_certs/localhost.jks
Self-signed certificates successfully generated.
starting org.apache.spark.deploy.history.HistoryServer, logging to /home/glue_user/spark/logs/spark-glue_user-org.apache.spark.deploy.history.HistoryServer-1-8d74b5077087.out
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/glue_user/spark/jars/log4j-slf4j-impl-2.17.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/glue_user/spark/jars/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/glue_user/aws-glue-libs/jars/log4j-slf4j-impl-2.17.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/glue_user/aws-glue-libs/jars/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
24/04/24 17:24:46 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
Spark context Web UI available at http://8d74b5077087:4041
Spark context available as 'sc' (master = local[*], app id = local-1713979486653).
Spark session available as 'spark'.
Exception in thread "main" java.lang.NoSuchMethodError: jline.console.completer.CandidateListCompletionHandler.setPrintSpaceAfterFullCompletion(Z)V
        at scala.tools.nsc.interpreter.jline.JLineConsoleReader.initCompletion(JLineReader.scala:143)
        at scala.tools.nsc.interpreter.jline.InteractiveReader.postInit(JLineReader.scala:58)
        at org.apache.spark.repl.SparkILoop.$anonfun$process$3(SparkILoop.scala:144)
        at org.apache.spark.repl.SparkILoop.$anonfun$process$3$adapted(SparkILoop.scala:142)
        at scala.tools.nsc.interpreter.SplashReader.postInit(InteractiveReader.scala:142)
        at org.apache.spark.repl.SparkILoop.$anonfun$process$4(SparkILoop.scala:168)
        at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
        at scala.tools.nsc.interpreter.ILoop.$anonfun$mumly$1(ILoop.scala:166)
        at scala.tools.nsc.interpreter.IMain.beQuietDuring(IMain.scala:206)
        at scala.tools.nsc.interpreter.ILoop.mumly(ILoop.scala:163)
        at org.apache.spark.repl.SparkILoop.loopPostInit$1(SparkILoop.scala:153)
        at org.apache.spark.repl.SparkILoop.$anonfun$process$10(SparkILoop.scala:221)
        at org.apache.spark.repl.SparkILoop.withSuppressedSettings$1(SparkILoop.scala:189)
        at org.apache.spark.repl.SparkILoop.startup$1(SparkILoop.scala:201)
        at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:236)
        at org.apache.spark.repl.Main$.doMain(Main.scala:78)
        at org.apache.spark.repl.Main$.main(Main.scala:58)
        at org.apache.spark.repl.Main.main(Main.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1006)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1095)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1104)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

For comparison, the same command against the v3 image, docker run -it amazon/aws-glue-libs:glue_libs_3.0.0_image_01 spark-shell, works as expected and reaches the scala> prompt:

...
Digest: sha256:6d255d793c232ed6ecb1622cc399c589f26d2d3b0b6164dccc1f691921344508
Status: Downloaded newer image for amazon/aws-glue-libs:glue_libs_3.0.0_image_01
Generating self-signed SSL/TLS certificate at /home/glue_user/.certs/localhost.jks
Self-signed certificates successfully generated.
MAC verified OK
MAC verified OK
MAC verified OK
starting org.apache.spark.deploy.history.HistoryServer, logging to /home/glue_user/spark/logs/spark-glue_user-org.apache.spark.deploy.history.HistoryServer-1-bd71929bb550.out
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/glue_user/spark/jars/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/glue_user/aws-glue-libs/jars/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
24/04/24 17:16:35 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
Spark context Web UI available at http://bd71929bb550:4041
Spark context available as 'sc' (master = local[*], app id = local-1713978995491).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.1.1-amzn-0
      /_/

Using Scala version 2.12.10 (OpenJDK 64-Bit Server VM, Java 1.8.0_352)
Type in expressions to have them evaluated.
Type :help for more information.

scala>
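
A NoSuchMethodError like the one in the 4.0.0 trace usually means the class was loaded from a different library version than the one the caller was compiled against. I have not confirmed that this is the cause here, so treat the following as a rough diagnostic sketch rather than a fix. It lists every jar in the two jar directories shown in the SLF4J lines that contains the jline class named in the exception. The helper name jars_containing is my own, and it assumes Python is available wherever the jar directories can be read.

```python
import zipfile
from pathlib import Path

# Class named in the NoSuchMethodError, written as a jar entry path.
CLASS_ENTRY = "jline/console/completer/CandidateListCompletionHandler.class"

# Jar directories taken from the SLF4J lines in the 4.0.0 log.
JAR_DIRS = ["/home/glue_user/spark/jars", "/home/glue_user/aws-glue-libs/jars"]


def jars_containing(class_entry, jar_dirs):
    """Return paths of jars under jar_dirs that contain class_entry.

    Hypothetical helper for illustration; directories are visited in the
    order given, and jars within each directory in sorted order.
    """
    hits = []
    for jar_dir in jar_dirs:
        directory = Path(jar_dir)
        if not directory.is_dir():
            continue  # skip directories that do not exist on this machine
        for jar in sorted(directory.glob("*.jar")):
            try:
                with zipfile.ZipFile(jar) as zf:
                    if class_entry in zf.namelist():
                        hits.append(str(jar))
            except (zipfile.BadZipFile, OSError):
                continue  # unreadable or corrupt jar, ignore it
    return hits


if __name__ == "__main__":
    for path in jars_containing(CLASS_ENTRY, JAR_DIRS):
        print(path)
```

If this prints more than one jar, or a jline version different from the one the Scala REPL expects, that would point to a classpath conflict in the image. An empty result would mean the class comes from somewhere outside these two directories.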