big-data-europe / docker-spark

Apache Spark docker image

java.lang.NumberFormatException: For input string: "tcp://10.153.36.170:8080" #128

Open geekyouth opened 3 years ago

geekyouth commented 3 years ago


2021-06-26 00:57:39.41 UTC spark-master-fdb5b47df-p85lh spark-master Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
2021-06-26 00:57:39.48 UTC spark-master-fdb5b47df-p85lh spark-master 21/06/26 00:57:39 INFO Master: Started daemon with process name: 9@spark-master-fdb5b47df-p85lh
2021-06-26 00:57:39.58 UTC spark-master-fdb5b47df-p85lh spark-master 21/06/26 00:57:39 INFO SignalUtils: Registering signal handler for TERM
2021-06-26 00:57:39.58 UTC spark-master-fdb5b47df-p85lh spark-master 21/06/26 00:57:39 INFO SignalUtils: Registering signal handler for HUP
2021-06-26 00:57:39.58 UTC spark-master-fdb5b47df-p85lh spark-master 21/06/26 00:57:39 INFO SignalUtils: Registering signal handler for INT
2021-06-26 00:57:40.03 UTC spark-master-fdb5b47df-p85lh spark-master 21/06/26 00:57:40 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[main,5,main]
2021-06-26 00:57:40.03 UTC spark-master-fdb5b47df-p85lh spark-master java.lang.NumberFormatException: For input string: "tcp://10.153.36.170:8080"
2021-06-26 00:57:40.03 UTC spark-master-fdb5b47df-p85lh spark-master     at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
2021-06-26 00:57:40.03 UTC spark-master-fdb5b47df-p85lh spark-master     at java.lang.Integer.parseInt(Integer.java:580)
2021-06-26 00:57:40.03 UTC spark-master-fdb5b47df-p85lh spark-master     at java.lang.Integer.parseInt(Integer.java:615)
2021-06-26 00:57:40.03 UTC spark-master-fdb5b47df-p85lh spark-master     at scala.collection.immutable.StringLike.toInt(StringLike.scala:304)
2021-06-26 00:57:40.03 UTC spark-master-fdb5b47df-p85lh spark-master     at scala.collection.immutable.StringLike.toInt$(StringLike.scala:304)
2021-06-26 00:57:40.03 UTC spark-master-fdb5b47df-p85lh spark-master     at scala.collection.immutable.StringOps.toInt(StringOps.scala:33)
2021-06-26 00:57:40.03 UTC spark-master-fdb5b47df-p85lh spark-master     at org.apache.spark.deploy.master.MasterArguments.<init>(MasterArguments.scala:46)
2021-06-26 00:57:40.03 UTC spark-master-fdb5b47df-p85lh spark-master     at org.apache.spark.deploy.master.Master$.main(Master.scala:1208)
2021-06-26 00:57:40.03 UTC spark-master-fdb5b47df-p85lh spark-master     at org.apache.spark.deploy.master.Master.main(Master.scala)
2021-06-26 00:57:40.08 UTC spark-master-fdb5b47df-p85lh spark-master 21/06/26 00:57:40 INFO ShutdownHookManager: Shutdown hook called

geekyouth commented 3 years ago

yaml: https://raw.githubusercontent.com/big-data-europe/docker-spark/2.4.3-hadoop2.7/docker-compose.yml


spark-master:
  image: bde2020/spark-master:2.4.3-hadoop2.7
  container_name: spark-master
  ports:
    - "8080:8080"
    - "7077:7077"
  environment:
    - INIT_DAEMON_STEP=setup_spark
    - "constraint:node==<yourmasternode>"
spark-worker-1:
  image: bde2020/spark-worker:2.4.3-hadoop2.7
  container_name: spark-worker-1
  depends_on:
    - spark-master
  ports:
    - "8081:8081"
  environment:
    - "SPARK_MASTER=spark://spark-master:7077"
    - "constraint:node==<yourworkernode>"
spark-worker-2:
  image: bde2020/spark-worker:2.4.3-hadoop2.7
  container_name: spark-worker-2
  depends_on:
    - spark-master
  ports:
    - "8081:8081"
  environment:
    - "SPARK_MASTER=spark://spark-master:7077"
    - "constraint:node==<yourworkernode>"  
Eduardo-Barbieri commented 2 years ago

I am having the exact same problem on deploy. Does anyone have any tips?

bdezonia commented 2 years ago

For me, I ran into this problem when I was putting these Spark images on Kubernetes. This article explains what the problem is: https://medium.com/@varunreddydaaram/kubernetes-did-not-work-with-apache-spark-de923ae7ab5c

TLDR: Kubernetes auto-generates environment variables that tromp on Spark's values, and this confuses Spark. On Kubernetes, never name your Spark master spark-master. Name it something else (for instance, to work around this I temporarily named mine death-star and things started working).

bdezonia commented 2 years ago

Minor unrelated comment: why do you have two workers listening on 8081? In my setups I put worker 1 on 8081 and worker 2 on 8082 etc. Maybe my approach is wrong but it's been working well for me.
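The port question above can be checked mechanically. This is only a sketch: `duplicate_host_ports` is a hypothetical helper, it assumes the short `"HOST:CONTAINER"` mapping form used in the compose file, and a shared host port is only a real conflict if both containers land on the same host (the `constraint:node` entries suggest they may not).

```python
from collections import defaultdict


def duplicate_host_ports(services):
    """Map each host port claimed by more than one service to those services."""
    claimed = defaultdict(list)
    for name, mappings in services.items():
        for mapping in mappings:
            # Assumes "HOST:CONTAINER"; the host port is the part before ":".
            host_port = mapping.split(":")[0]
            claimed[host_port].append(name)
    return {port: names for port, names in claimed.items() if len(names) > 1}


# Port mappings copied from the docker-compose.yml posted above.
compose_ports = {
    "spark-master": ["8080:8080", "7077:7077"],
    "spark-worker-1": ["8081:8081"],
    "spark-worker-2": ["8081:8081"],
}

print(duplicate_host_ports(compose_ports))
```

For the file above this reports host port 8081 as claimed by both workers, which matches the observation in the comment.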

devYonz commented 1 year ago

> For me, I ran into this problem when I was putting these Spark images on Kubernetes. This article explains what the problem is: https://medium.com/@varunreddydaaram/kubernetes-did-not-work-with-apache-spark-de923ae7ab5c
>
> TLDR: Kubernetes auto-generates environment variables that tromp on Spark's values, and this confuses Spark. On Kubernetes, never name your Spark master spark-master. Name it something else (for instance, to work around this I temporarily named mine death-star and things started working).

UnF****ing believable, this fixed it. Cheers! I can go to bed now. Note that the same applies to naming workers spark-worker: I started getting the same errors on my workers, and they went away when I moved to spark-workerbee.

For the next poor chap, this was the output of my spark-master error:

23/07/22 05:05:56 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[main,5,main]
java.lang.NumberFormatException: For input string: "tcp://10.105.207.146:7077"
    at java.base/java.lang.NumberFormatException.forInputString(Unknown Source)
    at java.base/java.lang.Integer.parseInt(Unknown Source)
    at java.base/java.lang.Integer.parseInt(Unknown Source)
    at scala.collection.immutable.StringLike.toInt(StringLike.scala:304)
    at scala.collection.immutable.StringLike.toInt$(StringLike.scala:304)
    at scala.collection.immutable.StringOps.toInt(StringOps.scala:33)
    at org.apache.spark.deploy.master.MasterArguments.<init>(MasterArguments.scala:46)
    at org.apache.spark.deploy.master.Master$.main(Master.scala:1228)
    at org.apache.spark.deploy.master.Master.main(Master.scala)
23/07/22 05:05:56 INFO ShutdownHookManager: Shutdown hook called

From the blog post linked by @bdezonia (https://medium.com/@varunreddydaaram/kubernetes-did-not-work-with-apache-spark-de923ae7ab5c):

So, for our service spark-master, Kubernetes would generate an env variable called SPARK_MASTER_PORT=tcp://100.68.168.187:8080, but in turn SPARK_MASTER_PORT was an internal variable for Apache Spark! It worked for service spark-hdfs, because Kubernetes would generate an env variable called SPARK_HDFS_PORT=tcp://100.68.168.187:8080, which,… is not referenced by Apache Spark, so it worked!
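To make the mechanism in that quote concrete, here is a minimal Python sketch of how a Service name turns into Docker-link-style variables and why the result cannot be parsed as a port. `service_link_env` and `collisions` are hypothetical helpers, only a subset of the injected variables is modelled, and `SPARK_RESERVED` is a partial list of variables Spark's standalone daemons read, not an exhaustive one.

```python
# Partial, illustrative list of variables Spark's standalone daemons read.
SPARK_RESERVED = {
    "SPARK_MASTER_HOST",
    "SPARK_MASTER_PORT",
    "SPARK_MASTER_WEBUI_PORT",
    "SPARK_WORKER_PORT",
    "SPARK_WORKER_WEBUI_PORT",
}


def service_link_env(service_name, cluster_ip, port):
    """Approximate a subset of the variables Kubernetes injects for a Service."""
    prefix = service_name.upper().replace("-", "_")
    return {
        f"{prefix}_SERVICE_HOST": cluster_ip,
        f"{prefix}_SERVICE_PORT": str(port),
        f"{prefix}_PORT": f"tcp://{cluster_ip}:{port}",
    }


def collisions(service_name):
    """Return the Spark variables that a Service with this name would overwrite."""
    env = service_link_env(service_name, "10.0.0.1", 8080)  # placeholder IP and port
    return sorted(SPARK_RESERVED.intersection(env))


env = service_link_env("spark-master", "10.153.36.170", 8080)
try:
    int(env["SPARK_MASTER_PORT"])  # Python stand-in for the toInt call in the stack trace
except ValueError as exc:
    print("not a port number:", exc)

for name in ("spark-master", "spark-worker", "spark-main", "spark-node"):
    print(name, collisions(name))
```

Under these assumptions only `spark-master` and `spark-worker` collide, which is consistent with the renames reported in this thread. Kubernetes also has a pod-level `enableServiceLinks` field that turns this injection off; nobody in the thread reports trying it.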

pedro-canedo commented 9 months ago

Man, I don't know if I should be mad or happy lol, but this worked well when changing the application's name, thanks. In my case, I changed the naming from 'spark-master' to a StatefulSet with the name 'spark-controller-statefulset'.

natan-dias commented 7 months ago

It really helped, thanks. But now I got this error, even with my service using port 8080:

24/03/22 19:32:01 INFO Utils: Successfully started service 'sparkMaster' on port 7077.
24/03/22 19:32:01 INFO Master: Starting Spark master at spark://spark-master-deploy-688cbd4575-qjtlg:7077
24/03/22 19:32:01 INFO Master: Running Spark version 3.5.1
24/03/22 19:32:01 INFO JettyUtils: Start Jetty spark-master-deploy:8080 for MasterUI
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8080. Attempting port 8081.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8081. Attempting port 8082.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8082. Attempting port 8083.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8083. Attempting port 8084.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8084. Attempting port 8085.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8085. Attempting port 8086.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8086. Attempting port 8087.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8087. Attempting port 8088.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8088. Attempting port 8089.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8089. Attempting port 8090.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8090. Attempting port 8091.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8091. Attempting port 8092.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8092. Attempting port 8093.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8093. Attempting port 8094.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8094. Attempting port 8095.
24/03/22 19:32:01 WARN Utils: Service 'MasterUI' could not bind on port 8095. Attempting port 8096.
24/03/22 19:32:01 ERROR MasterWebUI: Failed to bind MasterWebUI

bdezonia commented 7 months ago

TLDR of the TLDR's: name your spark master as gru and your spark workers as minion1 / minion2 / etc.

natan-dias commented 7 months ago

Ok, but the Spark master is up. The problem is with the MasterUI, which could not bind to port 8080, even with a service created and this port exposed and open. Do you think the problem is also related to the name? My deployment names are spark-master-deploy and sparke-worker-deploy.

natan-dias commented 7 months ago

Another question: when you say "name your master as SOMETHING ELSE", is that the name of the container in Kubernetes or the name of the master in the Spark config?

bdezonia commented 7 months ago

Natan, I have not worked in Kubernetes since shortly after my original reply about 2 years ago. I truly don't remember. Why don't you experiment with both sides? Try one side (Spark) and see how it goes, then try the other side (Kubernetes) and observe as well. And change the names of the master and all workers. Name them nonsense so you know no environment variables should be getting tromped.

bdezonia commented 7 months ago

Just noting: from the example above, it looks like modifying the name in docker-compose.yml is the right thing to do.

natan-dias commented 7 months ago

Yeah, it's definitely not working. I tried everything. My problem is not with docker-compose, but with Kubernetes. Using compose works fine, but when I try the very same deployment with Kubernetes I get the error when Spark tries to bind MasterWebUI to port 8080, even after changing the name of the service, the name of the container, the deployment, and the environment variables. Everything! Nothing works!

bdezonia commented 7 months ago

If you want to rule out this ticket's issue as the cause, start logging into your nodes remotely, inspect all the defined environment variables, and see whether anything looks problematic.
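The inspection suggested above could be scripted roughly as follows. This is a sketch, not a complete check: `non_numeric_ports` is a hypothetical helper, and `NUMERIC_SPARK_VARS` lists only a few port variables that Spark's standalone daemons expect to be plain integers.

```python
import os

# Assumption: a partial list of port variables Spark parses as integers.
NUMERIC_SPARK_VARS = (
    "SPARK_MASTER_PORT",
    "SPARK_MASTER_WEBUI_PORT",
    "SPARK_WORKER_PORT",
    "SPARK_WORKER_WEBUI_PORT",
)


def non_numeric_ports(environ):
    """Return the listed variables that are set to something other than digits."""
    return {
        name: environ[name]
        for name in NUMERIC_SPARK_VARS
        if name in environ and not environ[name].isdigit()
    }


if __name__ == "__main__":
    # Run inside the affected container so os.environ is the pod's environment.
    for name, value in sorted(non_numeric_ports(os.environ).items()):
        print(f"{name}={value}")
```

Any line it prints, such as a value starting with `tcp://`, points at a variable that was probably injected by Kubernetes rather than set for Spark.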

natan-dias commented 7 months ago

Ok, after struggling a lot, I fixed the issue. In fact, for a Kubernetes deployment, if you name the service spark-master and/or spark-worker, it will never work. Also, I was using an env variable named SPARK_LOCAL_IP, which was messing with everything. After removing this variable and naming my containers and services spark-main and spark-node, it works!

I will put a link to my git repo as soon as I finish the tests. Thanks everyone!
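One possible reading of the MasterUI failure, not confirmed in the thread: the log shows every port from 8080 to 8095 failing in the same second, which is what you would expect if the bind address itself (taken from SPARK_LOCAL_IP or the `spark-master-deploy` hostname) is not assigned to the pod, rather than the ports being busy. The sketch below uses a hypothetical `can_bind` helper to test whether an address can be bound at all.

```python
import socket


def can_bind(host, port=0):
    """Return True if a TCP socket can bind to host:port on this machine.

    Port 0 asks the OS for any free port, so a failure points at the address
    rather than at a port that is already in use.
    """
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.bind((host, port))
        return True
    except OSError:  # includes name-resolution errors (socket.gaierror)
        return False


print("loopback:", can_bind("127.0.0.1"))
```

Running `can_bind` inside the pod with the value of SPARK_LOCAL_IP would show whether that address is bindable there; if it is not, removing the variable, as was done here, is a reasonable fix.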