gettyimages / docker-spark

Docker build for Apache Spark
MIT License

spark-submit doesn't work with Python because of different Python versions on driver/worker #58

Closed — fernandezpablo85 closed this issue 5 years ago

fernandezpablo85 commented 5 years ago

How to reproduce:

1) Clone the repo and run docker-compose up.

2) Download spark-2.4.1-bin-hadoop2.7 from the Spark releases page.

3) Submit the Python pi.py example:

./bin/spark-submit --master spark://localhost:7077 ./examples/src/main/python/pi.py 10

4) It crashes (full log below; the key error and a quick way to confirm it follow the log):

➜ spark-2.4.1-bin-hadoop2.7 ./bin/spark-submit --master spark://localhost:7077 ./examples/src/main/python/pi.py 10 19/04/25 19:19:55 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 19/04/25 19:19:55 INFO SparkContext: Running Spark version 2.4.1 19/04/25 19:19:55 INFO SparkContext: Submitted application: PythonPi 19/04/25 19:19:56 INFO SecurityManager: Changing view acls to: pablo.fernandez 19/04/25 19:19:56 INFO SecurityManager: Changing modify acls to: pablo.fernandez 19/04/25 19:19:56 INFO SecurityManager: Changing view acls groups to: 19/04/25 19:19:56 INFO SecurityManager: Changing modify acls groups to: 19/04/25 19:19:56 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(pablo.fernandez); groups with view permissions: Set(); users with modify permissions: Set(pablo.fernandez); groups with modify permissions: Set() 19/04/25 19:19:56 INFO Utils: Successfully started service 'sparkDriver' on port 61681. 19/04/25 19:19:56 INFO SparkEnv: Registering MapOutputTracker 19/04/25 19:19:56 INFO SparkEnv: Registering BlockManagerMaster 19/04/25 19:19:56 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information 19/04/25 19:19:56 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up 19/04/25 19:19:56 INFO DiskBlockManager: Created local directory at /private/var/folders/bf/62kzgm4n4sx1x1wk6vjpwm6ndrr_bx/T/blockmgr-16cee74e-3fd6-4685-9c73-ae4060f63f27 19/04/25 19:19:56 INFO MemoryStore: MemoryStore started with capacity 366.3 MB 19/04/25 19:19:56 INFO SparkEnv: Registering OutputCommitCoordinator 19/04/25 19:19:56 INFO Utils: Successfully started service 'SparkUI' on port 4040. 19/04/25 19:19:56 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://10.254.17.9:4040 19/04/25 19:19:56 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://localhost:7077... 19/04/25 19:19:56 INFO TransportClientFactory: Successfully created connection to localhost/127.0.0.1:7077 after 40 ms (0 ms spent in bootstraps) 19/04/25 19:19:56 INFO StandaloneSchedulerBackend: Connected to Spark cluster with app ID app-20190425221956-0004 19/04/25 19:19:56 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20190425221956-0004/0 on worker-20190425221434-172.18.0.3-8881 (172.18.0.3:8881) with 2 core(s) 19/04/25 19:19:56 INFO StandaloneSchedulerBackend: Granted executor ID app-20190425221956-0004/0 on hostPort 172.18.0.3:8881 with 2 core(s), 1024.0 MB RAM 19/04/25 19:19:56 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 61683. 
19/04/25 19:19:56 INFO NettyBlockTransferService: Server created on 10.254.17.9:61683 19/04/25 19:19:56 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy 19/04/25 19:19:56 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20190425221956-0004/0 is now RUNNING 19/04/25 19:19:56 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 10.254.17.9, 61683, None) 19/04/25 19:19:56 INFO BlockManagerMasterEndpoint: Registering block manager 10.254.17.9:61683 with 366.3 MB RAM, BlockManagerId(driver, 10.254.17.9, 61683, None) 19/04/25 19:19:56 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 10.254.17.9, 61683, None) 19/04/25 19:19:56 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 10.254.17.9, 61683, None) 19/04/25 19:19:57 INFO StandaloneSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0 19/04/25 19:19:57 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/private/tmp/spark/spark-2.4.1-bin-hadoop2.7/spark-warehouse'). 19/04/25 19:19:57 INFO SharedState: Warehouse path is 'file:/private/tmp/spark/spark-2.4.1-bin-hadoop2.7/spark-warehouse'. 19/04/25 19:19:57 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint 19/04/25 19:19:58 INFO SparkContext: Starting job: reduce at /private/tmp/spark/spark-2.4.1-bin-hadoop2.7/./examples/src/main/python/pi.py:44 19/04/25 19:19:58 INFO DAGScheduler: Got job 0 (reduce at /private/tmp/spark/spark-2.4.1-bin-hadoop2.7/./examples/src/main/python/pi.py:44) with 10 output partitions 19/04/25 19:19:58 INFO DAGScheduler: Final stage: ResultStage 0 (reduce at /private/tmp/spark/spark-2.4.1-bin-hadoop2.7/./examples/src/main/python/pi.py:44) 19/04/25 19:19:58 INFO DAGScheduler: Parents of final stage: List() 19/04/25 19:19:58 INFO DAGScheduler: Missing parents: List() 19/04/25 19:19:58 INFO DAGScheduler: Submitting ResultStage 0 (PythonRDD[1] at reduce at /private/tmp/spark/spark-2.4.1-bin-hadoop2.7/./examples/src/main/python/pi.py:44), which has no missing parents 19/04/25 19:19:58 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 6.2 KB, free 366.3 MB) 19/04/25 19:19:58 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 4.2 KB, free 366.3 MB) 19/04/25 19:19:58 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.254.17.9:61683 (size: 4.2 KB, free: 366.3 MB) 19/04/25 19:19:58 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1161 19/04/25 19:19:58 INFO DAGScheduler: Submitting 10 missing tasks from ResultStage 0 (PythonRDD[1] at reduce at /private/tmp/spark/spark-2.4.1-bin-hadoop2.7/./examples/src/main/python/pi.py:44) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)) 19/04/25 19:19:58 INFO TaskSchedulerImpl: Adding task set 0.0 with 10 tasks 19/04/25 19:20:00 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.254.17.9:61686) with ID 0 19/04/25 19:20:00 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, 172.18.0.3, executor 0, partition 0, PROCESS_LOCAL, 7856 bytes) 19/04/25 19:20:00 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, 172.18.0.3, executor 0, partition 1, PROCESS_LOCAL, 7856 bytes) 19/04/25 19:20:00 INFO BlockManagerMasterEndpoint: Registering block manager 172.18.0.3:40993 with 
434.4 MB RAM, BlockManagerId(0, 172.18.0.3, 40993, None) 19/04/25 19:20:01 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 172.18.0.3:40993 (size: 4.2 KB, free: 434.4 MB) 19/04/25 19:20:01 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, 172.18.0.3, executor 0, partition 2, PROCESS_LOCAL, 7856 bytes) 19/04/25 19:20:01 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, 172.18.0.3, executor 0, partition 3, PROCESS_LOCAL, 7856 bytes) 19/04/25 19:20:02 WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1, 172.18.0.3, executor 0): org.apache.spark.api.python.PythonException: Traceback (most recent call last): File "/usr/spark-2.4.1/python/lib/pyspark.zip/pyspark/worker.py", line 267, in main ("%d.%d" % sys.version_info[:2], version)) Exception: Python in worker has different version 3.5 than that in driver 3.7, PySpark cannot run with different minor versions.Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set. at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:452) at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRunner.scala:588) at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRunner.scala:571) at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:406) at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37) at scala.collection.Iterator$class.foreach(Iterator.scala:891) at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28) at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59) at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104) at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48) at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310) at org.apache.spark.InterruptibleIterator.to(InterruptibleIterator.scala:28) at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:302) at org.apache.spark.InterruptibleIterator.toBuffer(InterruptibleIterator.scala:28) at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:289) at org.apache.spark.InterruptibleIterator.toArray(InterruptibleIterator.scala:28) at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13.apply(RDD.scala:945) at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13.apply(RDD.scala:945) at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101) at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.run(Task.scala:121) at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:403) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:409) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:835) 19/04/25 19:20:02 INFO TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0) on 172.18.0.3, executor 0: org.apache.spark.api.python.PythonException (Traceback (most recent call last): File "/usr/spark-2.4.1/python/lib/pyspark.zip/pyspark/worker.py", line 267, in main ("%d.%d" % sys.version_info[:2], version)) Exception: Python in worker has different 
version 3.5 than that in driver 3.7, PySpark cannot run with different minor versions.Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set. ) [duplicate 1] 19/04/25 19:20:02 INFO TaskSetManager: Starting task 0.1 in stage 0.0 (TID 4, 172.18.0.3, executor 0, partition 0, PROCESS_LOCAL, 7856 bytes) 19/04/25 19:20:02 INFO TaskSetManager: Lost task 2.0 in stage 0.0 (TID 2) on 172.18.0.3, executor 0: org.apache.spark.api.python.PythonException (Traceback (most recent call last): File "/usr/spark-2.4.1/python/lib/pyspark.zip/pyspark/worker.py", line 267, in main ("%d.%d" % sys.version_info[:2], version)) Exception: Python in worker has different version 3.5 than that in driver 3.7, PySpark cannot run with different minor versions.Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set. ) [duplicate 2] 19/04/25 19:20:02 INFO TaskSetManager: Starting task 2.1 in stage 0.0 (TID 5, 172.18.0.3, executor 0, partition 2, PROCESS_LOCAL, 7856 bytes) 19/04/25 19:20:02 INFO TaskSetManager: Lost task 3.0 in stage 0.0 (TID 3) on 172.18.0.3, executor 0: org.apache.spark.api.python.PythonException (Traceback (most recent call last): File "/usr/spark-2.4.1/python/lib/pyspark.zip/pyspark/worker.py", line 267, in main ("%d.%d" % sys.version_info[:2], version)) Exception: Python in worker has different version 3.5 than that in driver 3.7, PySpark cannot run with different minor versions.Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set. ) [duplicate 3] 19/04/25 19:20:02 INFO TaskSetManager: Starting task 3.1 in stage 0.0 (TID 6, 172.18.0.3, executor 0, partition 3, PROCESS_LOCAL, 7856 bytes) 19/04/25 19:20:02 INFO TaskSetManager: Lost task 0.1 in stage 0.0 (TID 4) on 172.18.0.3, executor 0: org.apache.spark.api.python.PythonException (Traceback (most recent call last): File "/usr/spark-2.4.1/python/lib/pyspark.zip/pyspark/worker.py", line 267, in main ("%d.%d" % sys.version_info[:2], version)) Exception: Python in worker has different version 3.5 than that in driver 3.7, PySpark cannot run with different minor versions.Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set. ) [duplicate 4] 19/04/25 19:20:02 INFO TaskSetManager: Starting task 0.2 in stage 0.0 (TID 7, 172.18.0.3, executor 0, partition 0, PROCESS_LOCAL, 7856 bytes) 19/04/25 19:20:02 INFO TaskSetManager: Lost task 2.1 in stage 0.0 (TID 5) on 172.18.0.3, executor 0: org.apache.spark.api.python.PythonException (Traceback (most recent call last): File "/usr/spark-2.4.1/python/lib/pyspark.zip/pyspark/worker.py", line 267, in main ("%d.%d" % sys.version_info[:2], version)) Exception: Python in worker has different version 3.5 than that in driver 3.7, PySpark cannot run with different minor versions.Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set. 
) [duplicate 5] 19/04/25 19:20:02 INFO TaskSetManager: Starting task 2.2 in stage 0.0 (TID 8, 172.18.0.3, executor 0, partition 2, PROCESS_LOCAL, 7856 bytes) 19/04/25 19:20:02 INFO TaskSetManager: Lost task 0.2 in stage 0.0 (TID 7) on 172.18.0.3, executor 0: org.apache.spark.api.python.PythonException (Traceback (most recent call last): File "/usr/spark-2.4.1/python/lib/pyspark.zip/pyspark/worker.py", line 267, in main ("%d.%d" % sys.version_info[:2], version)) Exception: Python in worker has different version 3.5 than that in driver 3.7, PySpark cannot run with different minor versions.Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set. ) [duplicate 6] 19/04/25 19:20:02 INFO TaskSetManager: Starting task 0.3 in stage 0.0 (TID 9, 172.18.0.3, executor 0, partition 0, PROCESS_LOCAL, 7856 bytes) 19/04/25 19:20:02 INFO TaskSetManager: Lost task 3.1 in stage 0.0 (TID 6) on 172.18.0.3, executor 0: org.apache.spark.api.python.PythonException (Traceback (most recent call last): File "/usr/spark-2.4.1/python/lib/pyspark.zip/pyspark/worker.py", line 267, in main ("%d.%d" % sys.version_info[:2], version)) Exception: Python in worker has different version 3.5 than that in driver 3.7, PySpark cannot run with different minor versions.Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set. ) [duplicate 7] 19/04/25 19:20:02 INFO TaskSetManager: Starting task 3.2 in stage 0.0 (TID 10, 172.18.0.3, executor 0, partition 3, PROCESS_LOCAL, 7856 bytes) 19/04/25 19:20:02 INFO TaskSetManager: Lost task 0.3 in stage 0.0 (TID 9) on 172.18.0.3, executor 0: org.apache.spark.api.python.PythonException (Traceback (most recent call last): File "/usr/spark-2.4.1/python/lib/pyspark.zip/pyspark/worker.py", line 267, in main ("%d.%d" % sys.version_info[:2], version)) Exception: Python in worker has different version 3.5 than that in driver 3.7, PySpark cannot run with different minor versions.Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set. ) [duplicate 8] 19/04/25 19:20:02 ERROR TaskSetManager: Task 0 in stage 0.0 failed 4 times; aborting job 19/04/25 19:20:02 INFO TaskSetManager: Lost task 2.2 in stage 0.0 (TID 8) on 172.18.0.3, executor 0: org.apache.spark.api.python.PythonException (Traceback (most recent call last): File "/usr/spark-2.4.1/python/lib/pyspark.zip/pyspark/worker.py", line 267, in main ("%d.%d" % sys.version_info[:2], version)) Exception: Python in worker has different version 3.5 than that in driver 3.7, PySpark cannot run with different minor versions.Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set. 
) [duplicate 9] 19/04/25 19:20:02 INFO TaskSchedulerImpl: Cancelling stage 0 19/04/25 19:20:02 INFO TaskSchedulerImpl: Killing all running tasks in stage 0: Stage cancelled 19/04/25 19:20:02 INFO TaskSchedulerImpl: Stage 0 was cancelled 19/04/25 19:20:02 INFO DAGScheduler: ResultStage 0 (reduce at /private/tmp/spark/spark-2.4.1-bin-hadoop2.7/./examples/src/main/python/pi.py:44) failed in 3.724 s due to Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 9, 172.18.0.3, executor 0): org.apache.spark.api.python.PythonException: Traceback (most recent call last): File "/usr/spark-2.4.1/python/lib/pyspark.zip/pyspark/worker.py", line 267, in main ("%d.%d" % sys.version_info[:2], version)) Exception: Python in worker has different version 3.5 than that in driver 3.7, PySpark cannot run with different minor versions.Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set. at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:452) at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRunner.scala:588) at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRunner.scala:571) at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:406) at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37) at scala.collection.Iterator$class.foreach(Iterator.scala:891) at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28) at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59) at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104) at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48) at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310) at org.apache.spark.InterruptibleIterator.to(InterruptibleIterator.scala:28) at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:302) at org.apache.spark.InterruptibleIterator.toBuffer(InterruptibleIterator.scala:28) at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:289) at org.apache.spark.InterruptibleIterator.toArray(InterruptibleIterator.scala:28) at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13.apply(RDD.scala:945) at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13.apply(RDD.scala:945) at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101) at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.run(Task.scala:121) at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:403) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:409) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:835) Driver stacktrace: 19/04/25 19:20:02 INFO DAGScheduler: Job 0 failed: reduce at /private/tmp/spark/spark-2.4.1-bin-hadoop2.7/./examples/src/main/python/pi.py:44, took 3.786224 s Traceback (most recent call last): File "/private/tmp/spark/spark-2.4.1-bin-hadoop2.7/./examples/src/main/python/pi.py", line 44, in count = spark.sparkContext.parallelize(range(1, n 
+ 1), partitions).map(f).reduce(add) File "/tmp/spark/spark-2.4.1-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/rdd.py", line 844, in reduce File "/tmp/spark/spark-2.4.1-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/rdd.py", line 816, in collect File "/tmp/spark/spark-2.4.1-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__ File "/tmp/spark/spark-2.4.1-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco File "/tmp/spark/spark-2.4.1-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe. : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 9, 172.18.0.3, executor 0): org.apache.spark.api.python.PythonException: Traceback (most recent call last): File "/usr/spark-2.4.1/python/lib/pyspark.zip/pyspark/worker.py", line 267, in main ("%d.%d" % sys.version_info[:2], version)) Exception: Python in worker has different version 3.5 than that in driver 3.7, PySpark cannot run with different minor versions.Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set. at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:452) at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRunner.scala:588) at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRunner.scala:571) at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:406) at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37) at scala.collection.Iterator$class.foreach(Iterator.scala:891) at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28) at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59) at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104) at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48) at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310) at org.apache.spark.InterruptibleIterator.to(InterruptibleIterator.scala:28) at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:302) at org.apache.spark.InterruptibleIterator.toBuffer(InterruptibleIterator.scala:28) at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:289) at org.apache.spark.InterruptibleIterator.toArray(InterruptibleIterator.scala:28) at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13.apply(RDD.scala:945) at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13.apply(RDD.scala:945) at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101) at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.run(Task.scala:121) at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:403) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:409) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:835) Driver stacktrace: at 
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1889) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1877) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1876) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1876) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:926) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:926) at scala.Option.foreach(Option.scala:257) at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:926) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2110) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2059) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2048) at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49) at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:737) at org.apache.spark.SparkContext.runJob(SparkContext.scala:2061) at org.apache.spark.SparkContext.runJob(SparkContext.scala:2082) at org.apache.spark.SparkContext.runJob(SparkContext.scala:2101) at org.apache.spark.SparkContext.runJob(SparkContext.scala:2126) at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:945) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) at org.apache.spark.rdd.RDD.withScope(RDD.scala:363) at org.apache.spark.rdd.RDD.collect(RDD.scala:944) at org.apache.spark.api.python.PythonRDD$.collectAndServe(PythonRDD.scala:166) at org.apache.spark.api.python.PythonRDD.collectAndServe(PythonRDD.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) at py4j.Gateway.invoke(Gateway.java:282) at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) at py4j.commands.CallCommand.execute(CallCommand.java:79) at py4j.GatewayConnection.run(GatewayConnection.java:238) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.spark.api.python.PythonException: Traceback (most recent call last): File "/usr/spark-2.4.1/python/lib/pyspark.zip/pyspark/worker.py", line 267, in main ("%d.%d" % sys.version_info[:2], version)) Exception: Python in worker has different version 3.5 than that in driver 3.7, PySpark cannot run with different minor versions.Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set. 
at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:452) at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRunner.scala:588) at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRunner.scala:571) at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:406) at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37) at scala.collection.Iterator$class.foreach(Iterator.scala:891) at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28) at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59) at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104) at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48) at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310) at org.apache.spark.InterruptibleIterator.to(InterruptibleIterator.scala:28) at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:302) at org.apache.spark.InterruptibleIterator.toBuffer(InterruptibleIterator.scala:28) at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:289) at org.apache.spark.InterruptibleIterator.toArray(InterruptibleIterator.scala:28) at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13.apply(RDD.scala:945) at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13.apply(RDD.scala:945) at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101) at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.run(Task.scala:121) at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:403) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:409) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.lang.Thread.run(Thread.java:835) 19/04/25 19:20:02 WARN TaskSetManager: Lost task 3.2 in stage 0.0 (TID 10, 172.18.0.3, executor 0): TaskKilled (Stage cancelled) 19/04/25 19:20:02 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 19/04/25 19:20:02 INFO SparkContext: Invoking stop() from shutdown hook 19/04/25 19:20:02 INFO SparkUI: Stopped Spark web UI at http://10.254.17.9:4040 19/04/25 19:20:02 INFO StandaloneSchedulerBackend: Shutting down all executors 19/04/25 19:20:02 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asking each executor to shut down 19/04/25 19:20:02 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped! 19/04/25 19:20:02 INFO MemoryStore: MemoryStore cleared 19/04/25 19:20:02 INFO BlockManager: BlockManager stopped 19/04/25 19:20:02 INFO BlockManagerMaster: BlockManagerMaster stopped 19/04/25 19:20:02 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped! 
19/04/25 19:20:02 INFO SparkContext: Successfully stopped SparkContext 19/04/25 19:20:02 INFO ShutdownHookManager: Shutdown hook called 19/04/25 19:20:02 INFO ShutdownHookManager: Deleting directory /private/var/folders/bf/62kzgm4n4sx1x1wk6vjpwm6ndrr_bx/T/spark-26864dbb-9c33-4058-9885-d8b28997591c 19/04/25 19:20:02 INFO ShutdownHookManager: Deleting directory /private/var/folders/bf/62kzgm4n4sx1x1wk6vjpwm6ndrr_bx/T/spark-873066ef-4a89-4dea-916b-a7f1e16d6721 19/04/25 19:20:02 INFO ShutdownHookManager: Deleting directory /private/var/folders/bf/62kzgm4n4sx1x1wk6vjpwm6ndrr_bx/T/spark-26864dbb-9c33-4058-9885-d8b28997591c/pyspark-e3d85999-c979-4504-9d1b-dcd5a17f3b42 ➜ spark-2.4.1-bin-hadoop2.7
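The failure buried in the log above is the PythonException: "Python in worker has different version 3.5 than that in driver 3.7". A quick way to confirm which interpreters the two sides are using is to compare the Python that spark-submit picks up on the host (PYSPARK_PYTHON, or whatever python is on the PATH) with the one inside a worker container. This is only a rough sketch and assumes the compose service is named worker, as in this repo's docker-compose.yml:

python3 --version                              # interpreter on the host, used by the driver
docker-compose exec worker python3 --version   # interpreter inside the worker container, used by executors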

Possible cause: Python version mismatch between the driver and the workers.
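One common workaround is to point both sides at the same minor version through the environment variables named in the error message. This is only a sketch: it assumes a python3.5 interpreter (matching the one in the worker image) is installed and on the PATH on the machine running the driver.

export PYSPARK_PYTHON=python3.5          # interpreter used to launch the Python workers on the executors
export PYSPARK_DRIVER_PYTHON=python3.5   # interpreter used by the driver
./bin/spark-submit --master spark://localhost:7077 ./examples/src/main/python/pi.py 10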

fernandezpablo85 commented 5 years ago

The mismatch was between my own Python version (running the driver) and the one inside the master/worker containers. Sorry for the confusion.
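Another way to avoid the mismatch entirely is to run spark-submit from inside the master container, so the driver and the executors share the image's interpreter. A hedged sketch, assuming the compose service is named master, the in-network master URL is spark://master:7077, and Spark is installed under /usr/spark-2.4.1 as the log paths suggest:

docker-compose exec master /usr/spark-2.4.1/bin/spark-submit \
  --master spark://master:7077 \
  /usr/spark-2.4.1/examples/src/main/python/pi.py 10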