big-data-europe / docker-spark

Apache Spark docker image

Master Rest API #60

Open alexfdezsauco opened 5 years ago

alexfdezsauco commented 5 years ago

Is there a way to enable the REST API via an environment variable? The master image exposes port 6066, but nothing is listening on it. It looks like the REST API is disabled by default.

Any idea?

bpiper commented 5 years ago

+1 for this. It would be great if we could add arbitrary Java options to the command in master.sh via an env variable.
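[Editorial note: stock Spark already reads SPARK_MASTER_OPTS from conf/spark-env.sh to pass extra JVM options to the master daemon. Whether this image's master.sh sources that file is an assumption to verify, but a minimal sketch would be:]

```shell
# conf/spark-env.sh (sketch, assuming master.sh sources this file):
# SPARK_MASTER_OPTS is standard Spark configuration for extra JVM options
# passed to the master daemon; -Dspark.master.rest.enabled=true turns on
# the standalone REST submission server.
export SPARK_MASTER_OPTS="-Dspark.master.rest.enabled=true -Dspark.master.rest.port=6066"
```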

atomobianco commented 5 years ago

+1 for this.

Indeed, nothing is listening on port 6066 in a 2.4.0 container, while a 2.3.2 container does listen on it.

From a Spark 2.3.2 container:

```
root@01a1f568c397:/# netstat -apn | grep LISTEN
tcp        0      0 172.18.0.4:7077         0.0.0.0:*               LISTEN      100/java
tcp        0      0 127.0.0.11:35371        0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN      100/java
tcp        0      0 172.18.0.4:6066         0.0.0.0:*               LISTEN      100/java
```

From a Spark 2.4.0 container:

```
root@39c835645153:/# netstat -apn | grep LISTEN
tcp        0      0 172.18.0.4:7077         0.0.0.0:*               LISTEN      100/java
tcp        0      0 127.0.0.11:42479        0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN      100/java
```
spicoflorin commented 5 years ago

+1 for this. I modified /spark/conf/spark-defaults.conf in the master container by adding the line spark.master.rest.enabled true to enable the REST API server. I restarted the container, but the configuration does not seem to be picked up by master.sh, meaning the REST server is not started.

In docker-compose, I attached a volume to the master container containing the above configuration file and restarted the container, with no effect. It is strange to me that the Spark master does not take /spark/conf/spark-defaults.conf into account as it usually does.

spicoflorin commented 5 years ago

Solved how to start the REST API server: add the SPARK_CONF_DIR=/spark/conf environment variable, set spark.master.rest.enabled true in spark-defaults.conf, and create the volume ./spark/conf:/spark/conf.
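[Editorial note: for reference, the mounted spark-defaults.conf then only needs the single line named above:]

```
# ./spark/conf/spark-defaults.conf
spark.master.rest.enabled true
```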

```yaml
spark-master:
  image: bde2020/spark-master:2.4.1-hadoop2.7
  container_name: spark-master
  ports:
    - "12000:8080"
    - "7077:7077"
    - "6066:6066"
  environment:
    - INIT_DAEMON_STEP=setup_spark
    - SPARK_CONF_DIR=/spark/conf
  volumes:
    - ./spark/conf:/spark/conf
```
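[Editorial note: with the compose file above, the REST server can be checked from the docker host. This is a sketch; it requires the running cluster, and the driver ID below is a placeholder. /v1/submissions/status is part of Spark's standalone REST API.]

```shell
# Query the REST submission server through the published 6066 port.
# Any JSON response (even an error about an unknown driver) indicates the
# server is up; "connection refused" means it is still disabled.
curl -s http://localhost:6066/v1/submissions/status/driver-0
```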