NBISweden / LocalEGA

Please go to https://github.com/EGA-archive/LocalEGA instead
Apache License 2.0

Docker Swarm deployment #326

Closed dtitov closed 6 years ago

dtitov commented 6 years ago

Here we go. I'm going to record a video for the whole process tomorrow, but there are already instructions available in README.md, so the review can be started.

dtitov commented 6 years ago

@silverdaz, yes, it worked. Done.

blankdots commented 6 years ago

To start a swarm cluster (using VirtualBox), one can use the following (don't consider it the best option):

#!/usr/bin/env bash
# Usage: ./script.sh start|clean <number-of-nodes>

case "$1" in
  start)
    begin=$(date +%s)
    # Create one VirtualBox VM per node.
    for ((i=1; i<=$2; i+=1)); do
      docker-machine create -d virtualbox --virtualbox-memory "4096" "lega-swarm-$i"
    done

    # Point the Docker client at the first VM and make it the manager.
    eval "$(docker-machine env lega-swarm-1)"

    docker swarm init \
      --advertise-addr "$(docker-machine ip lega-swarm-1)"

    TOKEN=$(docker swarm join-token -q worker)

    # Join the remaining VMs as workers.
    if [ "$2" -gt 1 ]
    then
      for ((i=2; i<=$2; i+=1)); do
        eval "$(docker-machine env "lega-swarm-$i")"

        docker swarm join \
          --token "$TOKEN" \
          --advertise-addr "$(docker-machine ip "lega-swarm-$i")" \
          "$(docker-machine ip lega-swarm-1):2377"
      done
    fi
    end=$(date +%s)

    echo ">> The Docker Machine Swarm is up and running"

    tottime=$((end - begin))
    echo ">> It took $tottime seconds to start the LEGA Swarm."
    ;;
  clean)
    # Remove every VM created by "start".
    for ((i=1; i<=$2; i+=1)); do
      docker-machine rm "lega-swarm-$i" -y
    done
    ;;
  *)
    echo "Usage: $0 start|clean <number-of-nodes>" >&2
    exit 1
    ;;
esac

Run it like ./script.sh start 2 or ./script.sh clean 2. To point the terminal at the swarm manager, run eval $(docker-machine env lega-swarm-1), after which the gradle commands can be run. To reset the terminal, run eval $(docker-machine env --unset).

blankdots commented 6 years ago

I was able to start the stack; however, the ingest task fails. It seems that it cannot connect:

╰─$ gradle ingest
> Task :upload FAILED

FAILURE: Build failed with an exception.

* Where:
Script '/home/stenegru/csc/dev/LocalEGA/deployments/swarm/utils.gradle' line: 422

* What went wrong:
Execution failed for task ':upload'.
> java.net.ConnectException: Connection refused (Connection refused)

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.

* Get more help at https://help.gradle.org

BUILD FAILED in 0s
3 actionable tasks: 3 executed

After changing localhost to the IP returned by docker-machine ip lega-swarm-1 (I am starting with one node in this setup) on lines 422, 432 and 455, the ingest works. Please don't hardcode localhost; instead, detect the IP where each service resides.
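One way to avoid the hardcoded localhost, as a rough sketch rather than what utils.gradle actually does, is to resolve the target host once in the shell and hand it to the build. The function name resolve_lega_host and the LEGA_HOST variable are assumptions made up for illustration. DOCKER_HOST is the variable that docker-machine env exports, in the form tcp://<ip>:<port>. IPv6 addresses are not handled.

```shell
# Hypothetical helper: print the host the ingest tasks should connect to.
# LEGA_HOST is an illustrative override, not something the project defines.
resolve_lega_host() {
  # 1. An explicit override always wins.
  if [ -n "${LEGA_HOST:-}" ]; then
    printf '%s\n' "$LEGA_HOST"
    return 0
  fi
  case "${DOCKER_HOST:-}" in
    tcp://*)
      # 2. Reuse the IP from DOCKER_HOST (set by docker-machine env).
      _host=${DOCKER_HOST#tcp://}       # drop the scheme
      printf '%s\n' "${_host%%[:/]*}"   # drop the port and any path
      ;;
    *)
      # 3. Otherwise assume the manager runs on this machine.
      printf 'localhost\n'
      ;;
  esac
}
```

After eval $(docker-machine env lega-swarm-1), resolve_lega_host prints that machine's IP; in a plain shell with neither variable set it prints localhost. The Gradle scripts would still have to read the value somehow (for example from a project property), which this thread does not show them doing.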

When running with 3 nodes and with the proper IPs of the services, the ingest still fails:

╰─$ gradle ingest
> Task :ingest FAILED

FAILURE: Build failed with an exception.

* Where:
Build file '/home/stenegru/csc/dev/LocalEGA/deployments/swarm/build.gradle' line: 116

* What went wrong:
Execution failed for task ':ingest'.
> File was not ingested!

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.

* Get more help at https://help.gradle.org

BUILD FAILED in 9s
4 actionable tasks: 4 executed

dtitov commented 6 years ago

@blankdots, you are going a little bit beyond the scope of this PR 😄

  1. The current version is not ready for Docker Machine. I was going to add support for Docker Machine in the next PR, along with functionality to deploy Swarm to OpenStack (or locally via VMs, just as you did).
  2. To test it locally you don't have to use Docker Machine: just run docker swarm init on your host machine and you're ready to go. It will create a cluster consisting of one manager node.
  3. Hardcoding localhost means that the tool is meant to run on the machine where the manager node resides. I'm also not sure it's legitimate to detect the IP via Docker Machine, because someone's Swarm may have been created manually rather than with Docker Machine, for instance. Again: at the moment this tool does not manage the cluster itself. It just assumes that it is running on the machine with the manager node.
  4. I'm not sure what happened in the last scenario with three nodes; it will need investigation. But I would prefer to postpone it until the next PR.
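Point 2 above can be sketched as a small guard that initialises a single-node swarm only when the host is not already part of one. The function name ensure_swarm and its messages are illustrative, not part of the PR. It relies on docker info --format '{{.Swarm.LocalNodeState}}', which reports values such as active or inactive. On hosts with several network interfaces, docker swarm init may additionally need --advertise-addr.

```shell
# Illustrative helper: make sure this host is a swarm node before deploying.
ensure_swarm() {
  # Ask the local Docker daemon for its swarm state.
  state=$(docker info --format '{{.Swarm.LocalNodeState}}') || return 1
  if [ "$state" = "active" ]; then
    echo "swarm already active"
  else
    # Creates a one-node cluster with this host as the manager.
    docker swarm init >/dev/null && echo "swarm initialised"
  fi
}
```

Calling ensure_swarm before the Gradle tasks keeps the "one needs to have it beforehand" requirement explicit without involving Docker Machine.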

blankdots commented 6 years ago

@dtitov Ok, understandable. That is how I set up a swarm cluster for testing, and given that:

Also this tool doesn't (at least yet) create a Swarm cluster for you, so one needs to have it beforehand.

I did just that, created my swarm cluster 😄 Anyway, with one node it seems to work. So just make the changes for the custom task and/or custom plugin, and we are good to go.

dtitov commented 6 years ago

Now it also works in a multi-node cluster created by Docker Machine.

The distribution of services across nodes is random though, as I didn't introduce any deployment constraints yet.
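For reference, a deployment constraint could look roughly like the sketch below: label a node, then restrict a service to nodes carrying that label. The function name pin_service, the label key tier, and the service and node names in the usage line are all hypothetical. Only the two Docker commands (docker node update --label-add and docker service update --constraint-add) are standard CLI.

```shell
# Illustrative helper: pin a service to a labelled node.
# $1 = service name, $2 = node name, $3 = value for the (assumed) "tier" label
pin_service() {
  # Attach the label to the node, then constrain the service to that label.
  docker node update --label-add "tier=$3" "$2" &&
    docker service update --constraint-add "node.labels.tier==$3" "$1"
}
```

For example, pin_service lega_db lega-swarm-2 db would keep a service called lega_db on lega-swarm-2, assuming a service of that name exists in the stack. The same constraint can also be declared in the stack file instead of being applied afterwards.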