big-data-europe / docker-hadoop-spark-workbench

[EXPERIMENTAL] This repo includes deployment instructions for running HDFS/Spark inside Docker containers. It also includes spark-notebook and the HDFS FileBrowser.

docker-compose scale spark-worker=3 & spark-submit #38

Closed. Vzzarr closed this issue 6 years ago.

Vzzarr commented 6 years ago

Following the tutorial linked from this repository's main README.md, I first ran `docker-compose up -d` as the README instructs, and it worked. But when I then try `docker-compose scale spark-worker=3`, I get the following error (and the same one for worker 3):

```
ERROR: for dockerhadoopsparkworkbench_spark-worker_2  driver failed programming external connectivity on endpoint dockerhadoopsparkworkbench_spark-worker_2 (079d21c97e12d288aea5246c5eb575f245161c330639fcea35c899056a2e8af2): Bind for 0.0.0.0:8081 failed: port is already allocated
```

This happens because port 8081 is already taken by the first worker started with `docker-compose up -d`. Is this an issue with the compose file, or am I doing something wrong? (I'm new to Docker, sorry.)
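For context, my understanding is that the conflict comes from a fixed host-port mapping on the worker service. The excerpt below is a hypothetical sketch, assuming a typical bde2020-style worker definition (the image tag, environment variable, and ports are my assumptions, not copied from this repo's docker-compose.yml):

```yaml
# Hypothetical docker-compose.yml excerpt; the fixed host port is what
# breaks scaling, because every replica tries to bind host port 8081.
spark-worker:
  image: bde2020/spark-worker:2.2.0-hadoop2.7   # assumed tag
  environment:
    - SPARK_MASTER=spark://spark-master:7077
  ports:
    - "8081:8081"   # fixed host port: only one replica can hold it

# One way to let `docker-compose scale spark-worker=3` succeed is to
# expose only the container port and let Docker pick the host ports:
#   ports:
#     - "8081"
```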

Also, when I use spark-submit like this, it works:

```
/usr/local/spark-2.2.0-bin-hadoop2.7/bin/spark-submit --class uk.ac.ncl.NGS_SparkGATK.Pipeline --master local[*] NGS-SparkGATK.jar HelloWorld
```

but if I instead pass `spark://spark-master:7077` as the master, as the tutorial in the README.md suggests, I get `Failed to connect to master spark-master:7077`. Which address should I use to submit the Spark job?
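My guess at the cause (an assumption, since I am new to Docker): the hostname `spark-master` only resolves inside the Compose network, so a spark-submit run directly on the host cannot reach it by that name. Below is a sketch of submitting from a container attached to the same network; the network name follows Compose's default `<project>_default` convention, and the image tag and `/spark` path are assumptions about the bde2020 images, so adjust them for your setup:

```sh
# Run spark-submit inside the Compose network so the service name
# `spark-master` resolves; mount the application jar into the container.
docker run --rm \
  --network dockerhadoopsparkworkbench_default \
  -v "$(pwd)/NGS-SparkGATK.jar:/app/NGS-SparkGATK.jar" \
  bde2020/spark-base:2.2.0-hadoop2.7 \
  /spark/bin/spark-submit \
    --class uk.ac.ncl.NGS_SparkGATK.Pipeline \
    --master spark://spark-master:7077 \
    /app/NGS-SparkGATK.jar HelloWorld
```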

I hope I have explained my problem clearly; I look forward to your kind response.

earthquakesan commented 6 years ago

Hi @Vzzarr!

Thanks for pointing it out. I need to update the docs and provide an example setup for running a Spark application with the workbench. I will update you when I am done.

earthquakesan commented 6 years ago

@Vzzarr, the new distributed setup is available here. If you have any questions, feel free to open an issue.