Open geekyouth opened 3 years ago
yaml: https://raw.githubusercontent.com/big-data-europe/docker-spark/2.4.3-hadoop2.7/docker-compose.yml
spark-master:
  image: bde2020/spark-master:2.4.3-hadoop2.7
  container_name: spark-master
  ports:
    - "8080:8080"
    - "7077:7077"
  environment:
    - INIT_DAEMON_STEP=setup_spark
    - "constraint:node==<yourmasternode>"
spark-worker-1:
  image: bde2020/spark-worker:2.4.3-hadoop2.7
  container_name: spark-worker-1
  depends_on:
    - spark-master
  ports:
    - "8081:8081"
  environment:
    - "SPARK_MASTER=spark://spark-master:7077"
    - "constraint:node==<yourworkernode>"
spark-worker-2:
  image: bde2020/spark-worker:2.4.3-hadoop2.7
  container_name: spark-worker-2
  depends_on:
    - spark-master
  ports:
    - "8081:8081"
  environment:
    - "SPARK_MASTER=spark://spark-master:7077"
    - "constraint:node==<yourworkernode>"
I am having the exact same problem on deploy, have any tip?
For me, I ran into this problem when I was putting these spark images on kubernetes. This article explains what the problem is: https://medium.com/@varunreddydaaram/kubernetes-did-not-work-with-apache-spark-de923ae7ab5c.
TLDR: Kubernetes automatically generates environment variables for each service, and some of them trample variables that Spark itself reads, which confuses Spark. On Kubernetes, never name your Spark master spark-master. Name it something else (for instance, to work around this I temporarily named mine death-star and things started working).
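To make the name clash concrete, here is a rough Python sketch of how Kubernetes derives the Docker-links-style variables from a service name (upper-cased, dashes turned into underscores). The helper names `service_link_env` and `collisions` are made up for illustration, the cluster IP is a placeholder, and the list of Spark variables is a small, non-exhaustive sample from the standalone-mode settings.

```python
from typing import Dict, List

# A few variables read by Spark's standalone master/worker (not exhaustive).
SPARK_STANDALONE_VARS = {
    "SPARK_MASTER_HOST",
    "SPARK_MASTER_PORT",
    "SPARK_MASTER_WEBUI_PORT",
    "SPARK_WORKER_PORT",
    "SPARK_WORKER_WEBUI_PORT",
}


def service_link_env(service_name: str, cluster_ip: str, port: int) -> Dict[str, str]:
    """Approximate the variables Kubernetes injects for one TCP service port."""
    prefix = service_name.upper().replace("-", "_")
    url = f"tcp://{cluster_ip}:{port}"
    return {
        f"{prefix}_SERVICE_HOST": cluster_ip,
        f"{prefix}_SERVICE_PORT": str(port),
        f"{prefix}_PORT": url,  # this is the one that clashes
        f"{prefix}_PORT_{port}_TCP": url,
        f"{prefix}_PORT_{port}_TCP_PROTO": "tcp",
        f"{prefix}_PORT_{port}_TCP_PORT": str(port),
        f"{prefix}_PORT_{port}_TCP_ADDR": cluster_ip,
    }


def collisions(service_name: str, cluster_ip: str = "10.0.0.1", port: int = 7077) -> List[str]:
    """Return injected variable names that Spark also reads."""
    injected = service_link_env(service_name, cluster_ip, port)
    return sorted(SPARK_STANDALONE_VARS.intersection(injected))


if __name__ == "__main__":
    for name in ("spark-master", "spark-worker", "death-star"):
        print(name, collisions(name))
```

With these assumptions, a service called spark-master or spark-worker produces a clash, while a name like death-star does not.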
Minor unrelated comment: why do you have two workers listening on 8081? In my setups I put worker 1 on 8081 and worker 2 on 8082 etc. Maybe my approach is wrong but it's been working well for me.
UnF****ing believable, this fixed it. Cheers! I can go to bed now. Note that the same applies to naming workers spark-worker.
I started getting the same errors on my workers, and they were resolved when I moved to spark-workerbee.
For the next poor chap, this was the output of my spark-master error:
23/07/22 05:05:56 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[main,5,main]
java.lang.NumberFormatException: For input string: "tcp://10.105.207.146:7077"
    at java.base/java.lang.NumberFormatException.forInputString(Unknown Source)
    at java.base/java.lang.Integer.parseInt(Unknown Source)
    at java.base/java.lang.Integer.parseInt(Unknown Source)
    at scala.collection.immutable.StringLike.toInt(StringLike.scala:304)
    at scala.collection.immutable.StringLike.toInt$(StringLike.scala:304)
    at scala.collection.immutable.StringOps.toInt(StringOps.scala:33)
    at org.apache.spark.deploy.master.MasterArguments.<init>(MasterArguments.scala:46)
    at org.apache.spark.deploy.master.Master$.main(Master.scala:1228)
    at org.apache.spark.deploy.master.Master.main(Master.scala)
23/07/22 05:05:56 INFO ShutdownHookManager: Shutdown hook called
From the blog linked by @bdezonia (https://medium.com/@varunreddydaaram/kubernetes-did-not-work-with-apache-spark-de923ae7ab5c):
So, for our service spark-master, kubernetes would generate an env variable called SPARK_MASTER_PORT=tcp://100.68.168.187:8080, but in turn SPARK_MASTER_PORT was an internal variable for APACHE SPARK! It worked for service spark-hdfs, because kubernetes would generate an env variable called SPARK_HDFS_PORT=tcp://100.68.168.187:8080, which,… is not referenced by APACHE SPARK, so it worked!
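The stack trace above fails inside an integer parse of that variable. As a loose Python analogue (not Spark's actual code; `parse_master_port` is a made-up name, and 7077 is Spark's usual default master port), the failure looks like this:

```python
from typing import Mapping


def parse_master_port(env: Mapping[str, str], default: int = 7077) -> int:
    """Read SPARK_MASTER_PORT the way a plain integer parse would."""
    raw = env.get("SPARK_MASTER_PORT")
    if raw is None:
        return default
    # int() rejects anything that is not a bare number, much like the
    # toInt call in the stack trace rejects "tcp://10.105.207.146:7077".
    return int(raw)


if __name__ == "__main__":
    print(parse_master_port({}))
    print(parse_master_port({"SPARK_MASTER_PORT": "7077"}))
    try:
        parse_master_port({"SPARK_MASTER_PORT": "tcp://10.105.207.146:7077"})
    except ValueError as exc:
        print("parse failed:", exc)
```

The master expects a bare port number in that variable, so a URL-shaped value injected by Kubernetes stops it at startup.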
Man, I don't know if I should be mad or happy lol, but changing the application's name worked, thanks. In my case, I changed the naming from 'spark-master' to a StatefulSet with the name 'spark-controller-statefulset'.
It really helped, thanks. But now I got this error, even with my service using port 8080:
24/03/22 19:32:01 INFO Utils: Successfully started service 'sparkMaster' on port 7077.
24/03/22 19:32:01 INFO Master: Starting Spark master at spark://spark-master-deploy-688cbd4575-qjtlg:7077
24/03/22 19:32:01 INFO Master: Running Spark version 3.5.1
24/03/22 19:32:01 INFO JettyUtils: Start Jetty spark-master-deploy:8080 for MasterUI
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8080. Attempting port 8081.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8081. Attempting port 8082.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8082. Attempting port 8083.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8083. Attempting port 8084.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8084. Attempting port 8085.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8085. Attempting port 8086.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8086. Attempting port 8087.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8087. Attempting port 8088.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8088. Attempting port 8089.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8089. Attempting port 8090.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8090. Attempting port 8091.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8091. Attempting port 8092.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8092. Attempting port 8093.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8093. Attempting port 8094.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8094. Attempting port 8095.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8095. Attempting port 8096.
24/03/22 19:32:01 ERROR MasterWebUI: Failed to bind MasterWebUI
TLDR of the TLDR's: name your spark master as gru and your spark workers as minion1 / minion2 / etc.
OK, but the Spark master is up. The problem is with the MasterUI, which could not bind to port 8080, even with a service created and this port exposed and open. Do you think the problem is also related to the name? My deployment names are spark-master-deploy and sparke-worker-deploy.
Another question: when you say "name your master as SOMETHING ELSE", is that the name of the container in Kubernetes or the name of the master in the Spark config?
Natan, I have not worked in kubernetes since shortly after my original reply about 2 years ago. I truly don't remember. Why don't you experiment with both sides. Try one side (spark) and see how it goes and then try the other side (kubernetes) and observe as well. And change names of master and all workers. Name them nonsense so you know no environmental variables should be getting tromped.
Just noting: from the example above, it looks like modifying the name in docker-compose.yml is the right thing to do.
Yeah, it is definitely not working. I tried everything. My problem is not with docker-compose, but with Kubernetes. Using compose works fine, but when I try the very same deployment with Kubernetes I get the error when Spark tries to bind MasterWebUI to port 8080. Even after changing the name of the service, the name of the container, the deployment, the environment variables. Everything! Nothing works!
If you want to rule out this ticket's issue as the cause, start logging into your nodes remotely, inspect all the defined environment variables, and see if anything looks problematic.
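One way to do that inspection is a small Python check run inside the container. This is only a sketch: `suspicious_spark_vars` is a made-up helper, and it assumes the problematic values are the `tcp://` or `udp://` URLs that Kubernetes injects for services. It deliberately ignores values such as `spark://spark-master:7077`, which the images here set on purpose.

```python
import os
from typing import Dict, Mapping


def suspicious_spark_vars(env: Mapping[str, str]) -> Dict[str, str]:
    """Return SPARK_* variables whose value looks like a Kubernetes service URL."""
    return {
        name: value
        for name, value in env.items()
        if name.startswith("SPARK_") and value.startswith(("tcp://", "udp://"))
    }


if __name__ == "__main__":
    # Inspect the environment of the current process (for example, inside the pod).
    for name, value in sorted(suspicious_spark_vars(os.environ).items()):
        print(f"{name}={value}")
```

Any variable it prints is worth comparing against the settings Spark reads, such as SPARK_MASTER_PORT.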
OK, after struggling a lot, I fixed the issue. In fact, for a Kubernetes deployment, if you name the service spark-master and/or spark-worker, it will never work. I was also setting an environment variable named SPARK_LOCAL_IP, which was messing with everything. After removing that variable and naming my container and service spark-main and spark-node, it works!
I will put a link to my git repo as soon as I finish the tests. Thanks everyone!