amplab / spark-ec2

Scripts used to set up a Spark cluster on EC2
Apache License 2.0

Getting InvalidClassException when running SparkPi example locally but pointing to master on AWS #92

Open matthewadams opened 7 years ago

matthewadams commented 7 years ago

I'm getting the error below when running `bin/run-example --master spark://ec2-000-000-000-000.compute-1.amazonaws.com:7077 SparkPi 10` from my local machine (IPs changed to protect the innocent). This was run with a custom build/distribution of Spark 2.0.2 using Scala 2.10 and Hadoop 2.4 (built via `./dev/change-scala-version.sh 2.10 && ./dev/make-distribution.sh --name custom-spark --tgz -Phadoop-2.4 -Dscala-2.10 -DskipTests`).

Note that this also happens with the vanilla binary distributions of Spark 2.0.2 and 2.1.0 (as mentioned in https://github.com/amplab/spark-ec2/issues/91#issue-212624665).
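
For reference, the invocation above is equivalent to submitting a minimal Scala driver like the sketch below (the class name is hypothetical and the master URL is the same placeholder); any driver whose Spark/Scala build differs from the cluster's should hit the same failure:

```scala
import org.apache.spark.sql.SparkSession

// Sketch of a driver equivalent to `bin/run-example ... SparkPi 10`,
// assuming the Spark 2.x API. The master URL is a placeholder.
object RemoteSparkPi {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("RemoteSparkPi")
      .master("spark://ec2-000-000-000-000.compute-1.amazonaws.com:7077")
      .getOrCreate()
    val n = 1000000
    // Monte Carlo estimate of pi, as in the bundled SparkPi example.
    val inside = spark.sparkContext.parallelize(1 to n, 10).filter { _ =>
      val x = math.random * 2 - 1
      val y = math.random * 2 - 1
      x * x + y * y <= 1
    }.count()
    println(s"Pi is roughly ${4.0 * inside / n}")
    spark.stop()
  }
}
```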

$ bin/run-example --master spark://ec2-000-000-000-000.compute-1.amazonaws.com:7077 SparkPi 10
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/matthew/Documents/spark-2.0.2-bin-custom-spark/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/matthew/Documents/spark-2.0.2/hadoop-2.4.1/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
17/03/08 10:16:49 INFO spark.SparkContext: Running Spark version 2.0.2
17/03/08 10:16:50 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/03/08 10:16:50 INFO spark.SecurityManager: Changing view acls to: matthew
17/03/08 10:16:50 INFO spark.SecurityManager: Changing modify acls to: matthew
17/03/08 10:16:50 INFO spark.SecurityManager: Changing view acls groups to:
17/03/08 10:16:50 INFO spark.SecurityManager: Changing modify acls groups to:
17/03/08 10:16:50 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(matthew); groups with view permissions: Set(); users  with modify permissions: Set(matthew); groups with modify permissions: Set()
17/03/08 10:16:51 INFO util.Utils: Successfully started service 'sparkDriver' on port 56731.
17/03/08 10:16:51 INFO spark.SparkEnv: Registering MapOutputTracker
17/03/08 10:16:51 INFO spark.SparkEnv: Registering BlockManagerMaster
17/03/08 10:16:51 INFO storage.DiskBlockManager: Created local directory at /private/var/folders/8c/4kr7cmf109b4778xj0sxct8w0000gn/T/blockmgr-85536df3-360a-4a21-9ad3-eadaa5efec32
17/03/08 10:16:51 INFO memory.MemoryStore: MemoryStore started with capacity 366.3 MB
17/03/08 10:16:51 INFO spark.SparkEnv: Registering OutputCommitCoordinator
17/03/08 10:16:51 INFO util.log: Logging initialized @2615ms
17/03/08 10:16:51 INFO server.Server: jetty-9.2.z-SNAPSHOT
17/03/08 10:16:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@27953a83{/jobs,null,AVAILABLE}
17/03/08 10:16:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@556d0826{/jobs/json,null,AVAILABLE}
17/03/08 10:16:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@66ce957f{/jobs/job,null,AVAILABLE}
17/03/08 10:16:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@55b5f5d2{/jobs/job/json,null,AVAILABLE}
17/03/08 10:16:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5bfa8cc5{/stages,null,AVAILABLE}
17/03/08 10:16:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@666b83a4{/stages/json,null,AVAILABLE}
17/03/08 10:16:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@749c877b{/stages/stage,null,AVAILABLE}
17/03/08 10:16:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@efde75f{/stages/stage/json,null,AVAILABLE}
17/03/08 10:16:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@16ecee1{/stages/pool,null,AVAILABLE}
17/03/08 10:16:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3b220bcb{/stages/pool/json,null,AVAILABLE}
17/03/08 10:16:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2b95e48b{/storage,null,AVAILABLE}
17/03/08 10:16:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4a3329b9{/storage/json,null,AVAILABLE}
17/03/08 10:16:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3dddefd8{/storage/rdd,null,AVAILABLE}
17/03/08 10:16:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@160ac7fb{/storage/rdd/json,null,AVAILABLE}
17/03/08 10:16:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@12bfd80d{/environment,null,AVAILABLE}
17/03/08 10:16:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@41925502{/environment/json,null,AVAILABLE}
17/03/08 10:16:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@13e3c1c7{/executors,null,AVAILABLE}
17/03/08 10:16:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5316e95f{/executors/json,null,AVAILABLE}
17/03/08 10:16:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3f053c80{/executors/threadDump,null,AVAILABLE}
17/03/08 10:16:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6c6c5427{/executors/threadDump/json,null,AVAILABLE}
17/03/08 10:16:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@618c5d94{/static,null,AVAILABLE}
17/03/08 10:16:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5b40ceb{/,null,AVAILABLE}
17/03/08 10:16:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@13c3c1e1{/api,null,AVAILABLE}
17/03/08 10:16:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1d8062d2{/stages/stage/kill,null,AVAILABLE}
17/03/08 10:16:51 INFO server.ServerConnector: Started ServerConnector@3e34ace1{HTTP/1.1}{0.0.0.0:4040}
17/03/08 10:16:51 INFO server.Server: Started @2796ms
17/03/08 10:16:51 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
17/03/08 10:16:51 INFO ui.SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.86.165:4040
17/03/08 10:16:51 INFO spark.SparkContext: Added JAR file:/Users/matthew/Documents/spark-2.0.2-bin-custom-spark/./examples/jars/scopt_2.10-3.3.0.jar at spark://192.168.86.165:56731/jars/scopt_2.10-3.3.0.jar with timestamp 1488989811622
17/03/08 10:16:51 INFO spark.SparkContext: Added JAR file:/Users/matthew/Documents/spark-2.0.2-bin-custom-spark/./examples/jars/spark-examples_2.10-2.0.2.jar at spark://192.168.86.165:56731/jars/spark-examples_2.10-2.0.2.jar with timestamp 1488989811623
17/03/08 10:16:51 INFO client.StandaloneAppClient$ClientEndpoint: Connecting to master spark://ec2-000-000-000-000.compute-1.amazonaws.com:7077...
17/03/08 10:16:51 INFO client.TransportClientFactory: Successfully created connection to ec2-000-000-000-000.compute-1.amazonaws.com/54.242.200.97:7077 after 89 ms (0 ms spent in bootstraps)
17/03/08 10:16:52 WARN client.StandaloneAppClient$ClientEndpoint: Failed to connect to master ec2-000-000-000-000.compute-1.amazonaws.com:7077
org.apache.spark.SparkException: Exception thrown in awaitResult
    at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
    at scala.PartialFunction$OrElse.apply(PartialFunction.scala:162)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
    at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:88)
    at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:96)
    at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1$$anon$1.run(StandaloneAppClient.scala:106)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.io.InvalidClassException: org.apache.spark.rpc.netty.RequestMessage; local class incompatible: stream classdesc serialVersionUID = -2221986757032131007, local class serialVersionUID = -5447855329526097695
    at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:612)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1827)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1711)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1982)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1533)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:420)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:108)
    at org.apache.spark.rpc.netty.NettyRpcEnv$$anonfun$deserialize$1$$anonfun$apply$1.apply(NettyRpcEnv.scala:259)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
    at org.apache.spark.rpc.netty.NettyRpcEnv.deserialize(NettyRpcEnv.scala:308)
    at org.apache.spark.rpc.netty.NettyRpcEnv$$anonfun$deserialize$1.apply(NettyRpcEnv.scala:258)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
    at org.apache.spark.rpc.netty.NettyRpcEnv.deserialize(NettyRpcEnv.scala:257)
    at org.apache.spark.rpc.netty.NettyRpcHandler.internalReceive(NettyRpcEnv.scala:578)
    at org.apache.spark.rpc.netty.NettyRpcHandler.receive(NettyRpcEnv.scala:563)
    at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:158)
    at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:106)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:119)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    at java.lang.Thread.run(Thread.java:745)

    at org.apache.spark.network.client.TransportResponseHandler.handle(TransportResponseHandler.java:189)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:121)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    ... 1 more
^C17/03/08 10:17:07 INFO storage.DiskBlockManager: Shutdown hook called
17/03/08 10:17:07 INFO util.ShutdownHookManager: Shutdown hook called
17/03/08 10:17:07 INFO util.ShutdownHookManager: Deleting directory /private/var/folders/8c/4kr7cmf109b4778xj0sxct8w0000gn/T/spark-3ee19a3f-06d8-49d0-bfeb-107440f39b86/userFiles-8766d2fb-8a8f-4743-9fe2-c0a8194d5d4e
17/03/08 10:17:07 INFO util.ShutdownHookManager: Deleting directory /private/var/folders/8c/4kr7cmf109b4778xj0sxct8w0000gn/T/spark-3ee19a3f-06d8-49d0-bfeb-107440f39b86
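
The `Caused by` clause is standard Java serialization versioning: the master deserializes the driver's `org.apache.spark.rpc.netty.RequestMessage` with `ObjectInputStream`, and because the two Spark builds produced that class with different `serialVersionUID`s, `ObjectStreamClass.initNonProxy` rejects the stream. A self-contained sketch of the mechanism (the message class is hypothetical; corrupting the UID bytes in the stream stands in for a class compiled by a different build):

```scala
import java.io._

// Hypothetical message class standing in for Spark's RequestMessage.
@SerialVersionUID(1L)
case class Msg(payload: Int)

object SerialVersionMismatchDemo {
  def main(args: Array[String]): Unit = {
    val bos = new ByteArrayOutputStream()
    val oos = new ObjectOutputStream(bos)
    oos.writeObject(Msg(42))
    oos.close()
    val bytes = bos.toByteArray

    // In the stream, the 8-byte serialVersionUID follows the class name:
    // magic(2) + version(2) + TC_OBJECT(1) + TC_CLASSDESC(1) + UTF len(2) + name.
    val nameLen = ((bytes(6) & 0xff) << 8) | (bytes(7) & 0xff)
    val uidOffset = 8 + nameLen
    // Flip one bit of the UID, simulating a class from a mismatched build.
    bytes(uidOffset + 7) = (bytes(uidOffset + 7) ^ 0x01).toByte

    try new ObjectInputStream(new ByteArrayInputStream(bytes)).readObject()
    catch {
      case e: InvalidClassException =>
        println(e) // ...local class incompatible: stream classdesc serialVersionUID = ...
    }
  }
}
```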

It is worth noting that if I ssh into the master and run the example from there, everything works fine:

$ ssh -i ~/.ssh/spark-test.pem root@ec2-000-000-000-000.compute-1.amazonaws.com
The authenticity of host 'ec2-000-000-000-000.compute-1.amazonaws.com (000.000.000.000)' can't be established.
ECDSA key fingerprint is SHA256:oJSRdeS1dzKORChh/lIQPTEEbR1y7zOxsH6q8f2beqY.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ec2-000-000-000-000.compute-1.amazonaws.com,000.000.000.000' (ECDSA) to the list of known hosts.
Last login: Wed Mar  8 16:07:52 2017 from ip-172-31-35-30.ec2.internal

       __|  __|_  )
       _|  (     /   Amazon Linux AMI
      ___|\___|___|

https://aws.amazon.com/amazon-linux-ami/2013.03-release-notes/
Amazon Linux version 2016.09 is available.
[root@ip-172-31-35-30 ~]$ cd spark
[root@ip-172-31-35-30 spark]$ ls -al
total 116
drwxr-xr-x 13 hadoop ec2-user  4096 Mar  8 16:11 .
drwxr-xr-x 18 root   root      4096 Mar  8 16:11 ..
drwxr-xr-x  2 hadoop ec2-user  4096 Jul 19  2016 bin
drwxr-xr-x  2 hadoop ec2-user  4096 Mar  8 16:11 conf
drwxr-xr-x  5 hadoop ec2-user  4096 Jul 19  2016 data
drwxr-xr-x  4 hadoop ec2-user  4096 Jul 19  2016 examples
drwxr-xr-x  2 hadoop ec2-user 12288 Jul 19  2016 jars
-rw-r--r--  1 hadoop ec2-user 17811 Jul 19  2016 LICENSE
drwxr-xr-x  2 hadoop ec2-user  4096 Jul 19  2016 licenses
drwxr-xr-x  2 root   root      4096 Mar  8 16:11 logs
-rw-r--r--  1 hadoop ec2-user 24749 Jul 19  2016 NOTICE
drwxr-xr-x  6 hadoop ec2-user  4096 Jul 19  2016 python
drwxr-xr-x  3 hadoop ec2-user  4096 Jul 19  2016 R
-rw-r--r--  1 hadoop ec2-user  3828 Jul 19  2016 README.md
-rw-r--r--  1 hadoop ec2-user   120 Jul 19  2016 RELEASE
drwxr-xr-x  2 hadoop ec2-user  4096 Jul 19  2016 sbin
drwxr-xr-x  2 hadoop ec2-user  4096 Jul 19  2016 yarn
[root@ip-172-31-35-30 spark]$ echo $PATH
/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/scala/bin
[root@ip-172-31-35-30 spark]$ bin/run-example SparkPi 10
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
17/03/08 16:22:04 INFO SparkContext: Running Spark version 2.0.0
17/03/08 16:22:05 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/03/08 16:22:05 INFO SecurityManager: Changing view acls to: root
17/03/08 16:22:05 INFO SecurityManager: Changing modify acls to: root
17/03/08 16:22:05 INFO SecurityManager: Changing view acls groups to:
17/03/08 16:22:05 INFO SecurityManager: Changing modify acls groups to:
17/03/08 16:22:05 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
17/03/08 16:22:06 INFO Utils: Successfully started service 'sparkDriver' on port 51608.
17/03/08 16:22:06 INFO SparkEnv: Registering MapOutputTracker
17/03/08 16:22:06 INFO SparkEnv: Registering BlockManagerMaster
17/03/08 16:22:06 INFO DiskBlockManager: Created local directory at /mnt/spark/blockmgr-dc6fefba-3b3c-4f16-8383-b09687429150
17/03/08 16:22:06 INFO DiskBlockManager: Created local directory at /mnt2/spark/blockmgr-718052d3-6022-4f55-8d30-0673994c1f25
17/03/08 16:22:06 INFO MemoryStore: MemoryStore started with capacity 366.3 MB
17/03/08 16:22:06 INFO SparkEnv: Registering OutputCommitCoordinator
17/03/08 16:22:07 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/03/08 16:22:07 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://ec2-54-242-200-97.compute-1.amazonaws.com:4040
17/03/08 16:22:07 INFO SparkContext: Added JAR file:/root/spark/examples/jars/spark-examples_2.11-2.0.0.jar at spark://172.31.35.30:51608/jars/spark-examples_2.11-2.0.0.jar with timestamp 1488990127275
17/03/08 16:22:07 INFO SparkContext: Added JAR file:/root/spark/examples/jars/scopt_2.11-3.3.0.jar at spark://172.31.35.30:51608/jars/scopt_2.11-3.3.0.jar with timestamp 1488990127276
17/03/08 16:22:07 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://ec2-54-242-200-97.compute-1.amazonaws.com:7077...
17/03/08 16:22:07 INFO TransportClientFactory: Successfully created connection to ec2-54-242-200-97.compute-1.amazonaws.com/172.31.35.30:7077 after 67 ms (0 ms spent in bootstraps)
17/03/08 16:22:07 INFO StandaloneSchedulerBackend: Connected to Spark cluster with app ID app-20170308162207-0000
17/03/08 16:22:07 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 47935.
17/03/08 16:22:07 INFO NettyBlockTransferService: Server created on 172.31.35.30:47935
17/03/08 16:22:07 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 172.31.35.30, 47935)
17/03/08 16:22:07 INFO BlockManagerMasterEndpoint: Registering block manager 172.31.35.30:47935 with 366.3 MB RAM, BlockManagerId(driver, 172.31.35.30, 47935)
17/03/08 16:22:07 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 172.31.35.30, 47935)
17/03/08 16:22:07 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20170308162207-0000/0 on worker-20170308161225-172.31.41.253-58836 (172.31.41.253:58836) with 2 cores
17/03/08 16:22:07 INFO StandaloneSchedulerBackend: Granted executor ID app-20170308162207-0000/0 on hostPort 172.31.41.253:58836 with 2 cores, 6.0 GB RAM
17/03/08 16:22:08 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20170308162207-0000/0 is now RUNNING
17/03/08 16:22:08 INFO StandaloneSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
17/03/08 16:22:08 WARN SparkContext: Use an existing SparkContext, some configuration may not take effect.
17/03/08 16:22:08 INFO SharedState: Warehouse path is 'file:/root/spark/spark-warehouse'.
17/03/08 16:22:08 INFO SparkContext: Starting job: reduce at SparkPi.scala:38
17/03/08 16:22:08 INFO DAGScheduler: Got job 0 (reduce at SparkPi.scala:38) with 10 output partitions
17/03/08 16:22:08 INFO DAGScheduler: Final stage: ResultStage 0 (reduce at SparkPi.scala:38)
17/03/08 16:22:08 INFO DAGScheduler: Parents of final stage: List()
17/03/08 16:22:08 INFO DAGScheduler: Missing parents: List()
17/03/08 16:22:08 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34), which has no missing parents
17/03/08 16:22:09 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1832.0 B, free 366.3 MB)
17/03/08 16:22:09 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1169.0 B, free 366.3 MB)
17/03/08 16:22:09 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 172.31.35.30:47935 (size: 1169.0 B, free: 366.3 MB)
17/03/08 16:22:09 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1012
17/03/08 16:22:09 INFO DAGScheduler: Submitting 10 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34)
17/03/08 16:22:09 INFO TaskSchedulerImpl: Adding task set 0.0 with 10 tasks
17/03/08 16:22:11 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(null) (172.31.41.253:59470) with ID 0
17/03/08 16:22:11 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, 172.31.41.253, partition 0, PROCESS_LOCAL, 5476 bytes)
17/03/08 16:22:11 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, 172.31.41.253, partition 1, PROCESS_LOCAL, 5476 bytes)
17/03/08 16:22:11 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Launching task 0 on executor id: 0 hostname: 172.31.41.253.
17/03/08 16:22:11 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Launching task 1 on executor id: 0 hostname: 172.31.41.253.
17/03/08 16:22:11 INFO BlockManagerMasterEndpoint: Registering block manager 172.31.41.253:48508 with 3.0 GB RAM, BlockManagerId(0, 172.31.41.253, 48508)
17/03/08 16:22:12 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 172.31.41.253:48508 (size: 1169.0 B, free: 3.0 GB)
17/03/08 16:22:13 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, 172.31.41.253, partition 2, PROCESS_LOCAL, 5476 bytes)
17/03/08 16:22:13 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Launching task 2 on executor id: 0 hostname: 172.31.41.253.
17/03/08 16:22:13 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, 172.31.41.253, partition 3, PROCESS_LOCAL, 5476 bytes)
17/03/08 16:22:13 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Launching task 3 on executor id: 0 hostname: 172.31.41.253.
17/03/08 16:22:13 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 1567 ms on 172.31.41.253 (1/10)
17/03/08 16:22:13 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 1527 ms on 172.31.41.253 (2/10)
17/03/08 16:22:13 INFO TaskSetManager: Starting task 4.0 in stage 0.0 (TID 4, 172.31.41.253, partition 4, PROCESS_LOCAL, 5476 bytes)
17/03/08 16:22:13 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Launching task 4 on executor id: 0 hostname: 172.31.41.253.
17/03/08 16:22:13 INFO TaskSetManager: Starting task 5.0 in stage 0.0 (TID 5, 172.31.41.253, partition 5, PROCESS_LOCAL, 5476 bytes)
17/03/08 16:22:13 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Launching task 5 on executor id: 0 hostname: 172.31.41.253.
17/03/08 16:22:13 INFO TaskSetManager: Finished task 2.0 in stage 0.0 (TID 2) in 140 ms on 172.31.41.253 (3/10)
17/03/08 16:22:13 INFO TaskSetManager: Finished task 3.0 in stage 0.0 (TID 3) in 138 ms on 172.31.41.253 (4/10)
17/03/08 16:22:13 INFO TaskSetManager: Starting task 6.0 in stage 0.0 (TID 6, 172.31.41.253, partition 6, PROCESS_LOCAL, 5476 bytes)
17/03/08 16:22:13 INFO TaskSetManager: Finished task 4.0 in stage 0.0 (TID 4) in 91 ms on 172.31.41.253 (5/10)
17/03/08 16:22:13 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Launching task 6 on executor id: 0 hostname: 172.31.41.253.
17/03/08 16:22:13 INFO TaskSetManager: Starting task 7.0 in stage 0.0 (TID 7, 172.31.41.253, partition 7, PROCESS_LOCAL, 5476 bytes)
17/03/08 16:22:13 INFO TaskSetManager: Finished task 5.0 in stage 0.0 (TID 5) in 91 ms on 172.31.41.253 (6/10)
17/03/08 16:22:13 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Launching task 7 on executor id: 0 hostname: 172.31.41.253.
17/03/08 16:22:13 INFO TaskSetManager: Starting task 8.0 in stage 0.0 (TID 8, 172.31.41.253, partition 8, PROCESS_LOCAL, 5476 bytes)
17/03/08 16:22:13 INFO TaskSetManager: Finished task 6.0 in stage 0.0 (TID 6) in 97 ms on 172.31.41.253 (7/10)
17/03/08 16:22:13 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Launching task 8 on executor id: 0 hostname: 172.31.41.253.
17/03/08 16:22:13 INFO TaskSetManager: Starting task 9.0 in stage 0.0 (TID 9, 172.31.41.253, partition 9, PROCESS_LOCAL, 5476 bytes)
17/03/08 16:22:13 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Launching task 9 on executor id: 0 hostname: 172.31.41.253.
17/03/08 16:22:13 INFO TaskSetManager: Finished task 7.0 in stage 0.0 (TID 7) in 95 ms on 172.31.41.253 (8/10)
17/03/08 16:22:13 INFO TaskSetManager: Finished task 8.0 in stage 0.0 (TID 8) in 95 ms on 172.31.41.253 (9/10)
17/03/08 16:22:13 INFO TaskSetManager: Finished task 9.0 in stage 0.0 (TID 9) in 87 ms on 172.31.41.253 (10/10)
17/03/08 16:22:13 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
17/03/08 16:22:13 INFO DAGScheduler: ResultStage 0 (reduce at SparkPi.scala:38) finished in 4.653 s
17/03/08 16:22:13 INFO DAGScheduler: Job 0 finished: reduce at SparkPi.scala:38, took 4.983803 s
Pi is roughly 3.1415351415351416
17/03/08 16:22:13 INFO SparkUI: Stopped Spark web UI at http://ec2-54-242-200-97.compute-1.amazonaws.com:4040
17/03/08 16:22:13 INFO StandaloneSchedulerBackend: Shutting down all executors
17/03/08 16:22:13 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asking each executor to shut down
17/03/08 16:22:13 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/03/08 16:22:13 INFO MemoryStore: MemoryStore cleared
17/03/08 16:22:13 INFO BlockManager: BlockManager stopped
17/03/08 16:22:13 INFO BlockManagerMaster: BlockManagerMaster stopped
17/03/08 16:22:13 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/03/08 16:22:13 INFO SparkContext: Successfully stopped SparkContext
17/03/08 16:22:13 INFO ShutdownHookManager: Shutdown hook called
17/03/08 16:22:13 INFO ShutdownHookManager: Deleting directory /mnt/spark/spark-f7efb8fc-a042-4892-8f47-f6ff0a1c4ecc
17/03/08 16:22:13 INFO ShutdownHookManager: Deleting directory /mnt2/spark/spark-781f6c72-86b1-4b54-b8d8-d705c7796bf0
[root@ip-172-31-35-30 spark]$
shivaram commented 7 years ago

I don't think spark-ec2 is designed to run with the client on your machine and a cluster on EC2. It's designed to run with the client on the master machine and the cluster on EC2. FWIW, I think the version of Spark running on the cluster needs to exactly match the version on the client for things to work -- but that is in any case a question for the user mailing list / Stack Overflow.
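
One way to check for such a mismatch is to print the build constants in `spark-shell` on both the local machine and the master and compare (note the driver log above reports Spark 2.0.2 with `_2.10` example jars, while the on-master run reports Spark 2.0.0 with `_2.11` jars):

```scala
// Run in spark-shell on both machines and compare the output.
// SPARK_VERSION is the version this distribution was built as; the Scala
// runtime version matters too, since 2.10 and 2.11 builds are incompatible.
println(s"Spark ${org.apache.spark.SPARK_VERSION}")
println(s"Scala ${scala.util.Properties.versionNumberString}")
```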

matthewadams commented 7 years ago

@shivaram said:

the version of Spark running on the cluster needs to exactly match the version on the client for things to work

@shivaram, that is exactly what I did: I ensured that my local Scala, Spark & Hadoop versions exactly matched what was on the server.

Since I've burned so much time on spark-ec2, I'm now trying to use Amazon EMR and having good success. Pointers to comparisons between spark-ec2 and Spark on Amazon EMR would be appreciated.