Hydrospheredata / mist

Serverless proxy for Spark cluster
http://hydrosphere.io/mist/
Apache License 2.0

Docker mode networking problem #415

Closed dos65 closed 6 years ago

dos65 commented 6 years ago

By default, the Docker distribution uses docker mode for spawning workers. However, workers spawned this way cannot connect to a remote Spark cluster (it only works if the Spark master is on the same host as Docker). It looks like we should use host networking for the containers where we run workers, or at least mention this problem in the getting started guide.
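One way to narrow down this kind of problem is to check plain TCP reachability of the Spark master from inside a container on the same Docker network as the workers. The sketch below is only a diagnostic aid, not part of Mist: `can_reach` is a hypothetical helper name. It only tests the outbound direction; Spark also needs the cluster to connect back to the driver, which is usually the part that breaks with bridge networking.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False
```

For example, calling `can_reach("spark-master", 7077)` from inside a worker container would show whether the master's port is reachable at all; the host name and port here are taken from the compose file in this thread and should be replaced with your own.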

austinnichols101 commented 6 years ago

An additional problem is that the documentation on https://hub.docker.com/r/hydrosphere/mist/ specifies port 2003 but the docker image is using 2004 internally for the UI.

I have been attempting to get the docker image to work with mist and the spark-master both running as containers on a single physical host. So far I have not been able to get mist to show any clusters under the clusters tab (for example, using the https://github.com/big-data-europe/docker-spark image).

dos65 commented 6 years ago

@austinnichols101 thanks for pointing out the Docker Hub readme, we completely forgot about it.

About docker-spark: it looks like adding the mist image to docker-compose.yml should be enough. Did you try it that way?

austinnichols101 commented 6 years ago

@dos65 - I've tried several ways:

In all three cases I am able to start mist but I never see anything listed under clusters. I apologize as I'm obviously hijacking the thread and will be glad to open a separate issue (please let me know if there is specific info you need).

```yaml
version: "2.2"
services:
  spark-master:
    image: bde2020/spark-master:2.2.0-hadoop2.7
    container_name: spark-master
    ports:
      - 8080:8080
      - 7077:7077
    environment:
      - ENABLE_INIT_DAEMON=false
  spark-worker-1:
    image: bde2020/spark-worker:2.2.0-hadoop2.7
    container_name: spark-worker-1
    depends_on:
      - spark-master
    ports:
      - 8081:8081
    environment:
      - SPARK_MASTER=spark://spark-master:7077
  mist:
    image: hydrosphere/mist:1.0.0-RC9-2.2.0
    container_name: mist
    depends_on:
      - spark-master
    ports:
      - 2004:2004
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    command: ["mist"]
```

blvp commented 6 years ago

The Clusters tab in Mist shows Mist Workers. A Mist Worker is a Spark driver application that actually invokes a function (Mist Master spawns its Workers automatically). So after you run your function, you'll see a worker named after the function's defaultContext.
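To check this outside the UI, the list of workers can also be fetched over HTTP after a function has been run. This is a rough sketch only: it assumes Mist exposes its v2 HTTP API at `/v2/api/workers` on the same port as the UI (2004 in the image discussed here), and `workers_url` and `list_workers` are hypothetical helper names.

```python
import json
import urllib.request


def workers_url(host: str = "localhost", port: int = 2004) -> str:
    # Assumption: the HTTP API is served under /v2/api on the UI port.
    return f"http://{host}:{port}/v2/api/workers"


def list_workers(host: str = "localhost", port: int = 2004, timeout: float = 5.0):
    """Fetch and decode the JSON document returned by the workers endpoint."""
    with urllib.request.urlopen(workers_url(host, port), timeout=timeout) as resp:
        return json.load(resp)
```

If the assumption about the endpoint holds, an empty result before any function has run would match the behaviour described above.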

dos65 commented 6 years ago

Also, docker-entrypoint.sh doesn't expect that mist can be run with --net=host:

Could not find or load main class 172.17.0.1