jupyter-incubator / sparkmagic

Jupyter magics and kernels for working with remote Spark clusters

Create official Docker images (was: docker-compose build failure) #551

Open devender-yadav opened 5 years ago

devender-yadav commented 5 years ago
docker-compose build 

Not able to build Spark Core.

Error logs:

[error] /apps/build/spark/core/src/test/java/test/org/apache/spark/JavaAPISuite.java:36: warning: [deprecation] Accumulator in org.apache.spark has been deprecated
[error] import org.apache.spark.Accumulator;
[error]                         ^
[error] /apps/build/spark/core/src/test/java/test/org/apache/spark/JavaAPISuite.java:37: warning: [deprecation] AccumulatorParam in org.apache.spark has been deprecated
[error] import org.apache.spark.AccumulatorParam;
[error]                         ^
[error] Compile failed at Jul 4, 2019 8:47:28 AM [1:25.010s]
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Spark Project Parent POM ........................... SUCCESS [05:03 min]
[INFO] Spark Project Tags ................................. SUCCESS [ 35.560 s]
[INFO] Spark Project Sketch ............................... SUCCESS [  4.462 s]
[INFO] Spark Project Local DB ............................. SUCCESS [ 20.445 s]
[INFO] Spark Project Networking ........................... SUCCESS [01:03 min]
[INFO] Spark Project Shuffle Streaming Service ............ SUCCESS [  5.563 s]
[INFO] Spark Project Unsafe ............................... SUCCESS [ 24.045 s]
[INFO] Spark Project Launcher ............................. SUCCESS [01:39 min]
[INFO] Spark Project Core ................................. FAILURE [05:06 min]
[INFO] Spark Project ML Local Library ..................... SUCCESS [ 38.587 s]
[INFO] Spark Project GraphX ............................... SKIPPED
[INFO] Spark Project Streaming ............................ SKIPPED
[INFO] Spark Project Catalyst ............................. SKIPPED
[INFO] Spark Project SQL .................................. SKIPPED
[INFO] Spark Project ML Library ........................... SKIPPED
[INFO] Spark Project Tools ................................ SUCCESS [ 34.015 s]
[INFO] Spark Project Hive ................................. SKIPPED
[INFO] Spark Project REPL ................................. SKIPPED
[INFO] Spark Project YARN Shuffle Service ................. SUCCESS [ 31.793 s]
[INFO] Spark Project YARN ................................. SKIPPED
[INFO] Spark Project Hive Thrift Server ................... SKIPPED
[INFO] Spark Project Assembly ............................. SKIPPED
[INFO] Spark Integration for Kafka 0.10 ................... SKIPPED
[INFO] Kafka 0.10 Source for Structured Streaming ......... SKIPPED
[INFO] Spark Project Examples ............................. SKIPPED
[INFO] Spark Integration for Kafka 0.10 Assembly .......... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 12:26 min (Wall Clock)
[INFO] Finished at: 2019-07-04T08:47:28+00:00
[INFO] Final Memory: 57M/764M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal net.alchim31.maven:scala-maven-plugin:3.2.2:testCompile (scala-test-compile-first) on project spark-core_2.11: Execution scala-test-compile-first of goal net.alchim31.maven:scala-maven-plugin:3.2.2:testCompile failed. CompileFailed -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginExecutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn -rf :spark-core_2.11
ERROR: Service 'spark' failed to build: The command '/bin/sh -c mkdir -p /apps/build && cd /apps/build && git clone https://github.com/apache/spark.git spark && cd $SPARK_BUILD_PATH && git checkout v$SPARK_BUILD_VERSION && dev/make-distribution.sh --name spark-$SPARK_BUILD_VERSION -Phive -Phive-thriftserver -Pyarn && cp -r /apps/build/spark/dist $SPARK_HOME && rm -rf $SPARK_BUILD_PATH' returned a non-zero code: 1

apetresc commented 5 years ago

I cannot seem to reproduce this. Are you building from the latest master?

devender-yadav commented 5 years ago

@apetresc Yes, from master. Did you try building the image again?

apetresc commented 5 years ago

Yes, I just tried again with docker-compose build --no-cache spark to make sure I didn't have some cached layer containing a different Java version, and it still worked fine:

Successfully built 9a115fae44fc 
Successfully tagged jupyter/sparkmagic-livy:latest

Can you try with the cache disabled, just to rule that out?
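If clearing the cache doesn't change anything, my next guess would still be the Java version, so it's worth checking what JDK the build stage actually has. A rough sketch using the classic builder's intermediate layers (take the ---> <image-id> printed for the step just before the failing RUN; the placeholder is yours to fill in):

$ docker run --rm <image-id> java -version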

devender-yadav commented 5 years ago

@apetresc I'll try tomorrow. I'm wondering if I can pull it from Docker Hub directly. I'm suggesting that AWS update the sparkmagic version for EMR notebooks, and I just want to try the latest version on my side first.
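Something like this is what I have in mind (the image name is a guess based on the local tag above; I don't know whether anything is actually published under it):

$ docker pull jupyter/sparkmagic-livy:latest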

devender-yadav commented 5 years ago

@itamarst can we push the sparkmagic image to Docker Hub? I don't know why, but I'm still getting errors compiling Spark during the docker-compose build step.

itamarst commented 5 years ago

@devender-yadav which image are you referring to, exactly? What name? (I'm new to the project, so I don't know what images there are or how they're built.)

itamarst commented 5 years ago

This builds fine for me too when I do --no-cache on master.

devender-yadav commented 5 years ago

@itamarst It's failing while building Dockerfile.spark, at the step below (I've put an equivalent host-side sketch after it):

RUN mkdir -p /apps/build && \
  cd /apps/build && \
  git clone https://github.com/apache/spark.git spark && \
  cd $SPARK_BUILD_PATH && \
  git checkout v$SPARK_BUILD_VERSION && \
  dev/make-distribution.sh --name spark-$SPARK_BUILD_VERSION -Phive -Phive-thriftserver -Pyarn && \
  cp -r /apps/build/spark/dist $SPARK_HOME && \
  rm -rf $SPARK_BUILD_PATH
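For anyone trying to reproduce outside Docker, the step boils down to roughly this on the host (a sketch; SPARK_BUILD_PATH and SPARK_BUILD_VERSION are defined earlier in Dockerfile.spark, so substitute the values from your checkout):

$ git clone https://github.com/apache/spark.git spark
$ cd spark                                 # i.e. $SPARK_BUILD_PATH
$ git checkout "v$SPARK_BUILD_VERSION"     # the Spark tag the Dockerfile pins
$ dev/make-distribution.sh --name "spark-$SPARK_BUILD_VERSION" -Phive -Phive-thriftserver -Pyarn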
ragelo commented 5 years ago

Same issue when running:

$ docker build -t sparkmagic-livy -f ./Dockerfile.spark .

[ERROR] Failed to execute goal net.alchim31.maven:scala-maven-plugin:3.2.2:compile (scala-compile-first) on project spark-graphx_2.11: Execution scala-compile-first of goal net.alchim31.maven:scala-maven-plugin:3.2.2:compile failed. CompileFailed

itamarst commented 5 years ago

Fascinating.

@ragelo

  1. Is this the first time you've ever built this image, or have you done so before on this machine?
  2. Can you include the output of git log -1?
ragelo commented 5 years ago

@itamarst

  1. Yes, it's the first time for this image (I also retried 3-4 times afterwards).
  2. commit ee092134588287436607d6c7197ba61b4e3856f3 (HEAD -> master, origin/master)
    Merge: b771129 767c581
    Author: Itamar Turner-Trauring <itamar@itamarst.org>
    Date:   Fri Jul 19 14:33:14 2019 -0400
    
    Merge pull request #555 from jupyter-incubator/0.12.9
    
    Prepare for 0.12.9.
itamarst commented 5 years ago

Oh, I wonder if I couldn't reproduce it because I did --no-cache but I didn't also do --pull. Will investigate later this week.

itamarst commented 5 years ago

So I just did:

$ sudo docker build --no-cache --pull -t jupyter/sparkmagic-livy -f Dockerfile.spark .

And it worked fine. So I have no idea what's up.

@devender-yadav you mentioned Docker Hub; do you know of an official Docker image? If not, I can upload one.
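If I do, it would be roughly this (a sketch only; the jupyter/ namespace and the 0.12.9 tag from the latest release are assumptions, nothing is decided):

$ docker build --no-cache --pull -t jupyter/sparkmagic-livy:0.12.9 -f Dockerfile.spark .
$ docker push jupyter/sparkmagic-livy:0.12.9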

devender-yadav commented 5 years ago

@itamarst I don't see any official image on Docker Hub. Searching for sparkmagic, I found https://hub.docker.com/r/heliumdatacommons/sparkmagic, but it was last updated a year ago and I'm not sure it's reliable.

It would be good to keep the latest image on Docker Hub so we can pull it directly from there.

ibiris commented 4 years ago

I am having this issue too, but it seems to fail at a different step. This is the first time I've built this image on this machine.

I am attaching the log from that build command.

dockerbuild-spark.log

devstein commented 2 years ago

Relates to #756

insider89 commented 1 year ago

Is there a plan to push the sparkmagic image to a Docker registry?