flaviostutz / docker-swarm-cluster

Combines some general tooling for creating a good Docker Swarm Cluster (Swarm Dashboard, Traefik, Portainer, Prometheus, Grafana)
MIT License

docker-swarm-cluster

Combines some tooling for creating a good Docker Swarm Cluster.

Overview

HTTP(S) Ingress

Cluster Management

Metrics Monitoring

Installation

Ingress

  caddy-server:
    deploy:
      labels:
        - caddy.auto_https=off
        - caddy_controlled_server=
    ...
yourservice:
  ...
  deploy:
    placement:
      constraints:
        - node.role != manager
echo '{"log-driver": "journald", "metrics-addr" : "172.18.0.1:9323", "experimental" : true, "default-ulimits": { "memlock": { "Name": "memlock", "Hard": -1, "Soft": -1 }, "stack": { "Name": "stack", "Hard": -1, "Soft": -1 }} }' > /etc/docker/daemon.json
service docker restart
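
The one-line JSON above is hard to review by eye. As a minimal sketch, the same settings can be rendered from a small helper so the file can be inspected before it replaces the live one. The function name render_daemon_json and its optional address argument are assumptions made for illustration and are not part of this repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository): prints the daemon.json
# shown above, with the metrics address taken from the first argument.
render_daemon_json() {
  # Falls back to the address used in the original command.
  local metrics_addr="${1:-172.18.0.1:9323}"
  cat <<EOF
{
  "log-driver": "journald",
  "metrics-addr": "${metrics_addr}",
  "experimental": true,
  "default-ulimits": {
    "memlock": { "Name": "memlock", "Hard": -1, "Soft": -1 },
    "stack": { "Name": "stack", "Hard": -1, "Soft": -1 }
  }
}
EOF
}

# Example (run as root on each node, after reviewing the output):
#   render_daemon_json > /etc/docker/daemon.json && service docker restart
```

Defining the helper changes nothing on the host; the file is only written when the output is redirected explicitly.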

Security

Optimal elastic topology

If you need elasticity (growing or shrinking server capacity depending on app traffic), a good topology is to define two cluster "sizes": an "idle" configuration with minimal sizing for when few users are online, and a "hot" configuration for when traffic is high.

For the "idle" state, we use:

For the "hot" state, we use:

HA practices

...
      placement:
        preferences:
          - spread: node.role
...

Service URLs

Services will be accessible at the following URLs:

  http://portainer.mycluster.org
  http://dashboard.mycluster.org
  http://grafana.mycluster.org
  http://unsee.mycluster.org
  http://alertmanager.mycluster.org
  http://prometheus.mycluster.org
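
After a deployment it can help to check each of these URLs in one pass. The sketch below only builds the URL list for a given base domain. The function name service_urls is an assumption for illustration, and the subdomains are the ones listed in this section.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints one URL per admin service for the given
# base domain (defaults to the example domain used in this README).
service_urls() {
  local domain="${1:-mycluster.org}"
  local name
  for name in portainer dashboard grafana unsee alertmanager prometheus; do
    echo "http://${name}.${domain}"
  done
}

# Example: print the HTTP status code returned by each URL.
#   for url in $(service_urls mycluster.org); do
#     printf '%s %s\n' "$url" "$(curl -s -o /dev/null -w '%{http_code}' "$url")"
#   done
```

A 401 status from a service behind Caddy's basic auth still shows that the ingress route is working.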

Services that don't have built-in user authentication are protected by Caddy's basic auth. Change the password accordingly; it defaults to admin/admin123admin123.

The following services have published ports on the hosts, so you can use the Swarm routing mesh to reach an admin service directly when Caddy is not accessible

Point your browser to the public IP of any member VM, on the service's published port, to access the service.

Common Operations

Force service rebalancing among nodes

# docker service ls -q > dkr_svcs && for i in `cat dkr_svcs`; do docker service update "$i" --detach=false --force ; done
for service in $(docker service ls -q); do docker service update --force "$service"; done

WARNING: This disrupts user-facing services, because some containers are stopped while each service is being updated.
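
Because of this disruption, it can be useful to preview which services would be touched before forcing the update. The following is a sketch only: the function name rebalance_services and the DRY_RUN variable are assumptions for illustration, not part of this repository. It reads service IDs from standard input and either prints or runs the same docker service update command used above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads one service ID per line from stdin.
# With DRY_RUN=1 it only prints the commands; otherwise it runs them.
rebalance_services() {
  local id
  while IFS= read -r id; do
    # Skip empty lines.
    [ -n "$id" ] || continue
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "docker service update --detach=false --force $id"
    else
      # stdin is redirected so docker cannot consume the remaining IDs.
      docker service update --detach=false --force "$id" </dev/null
    fi
  done
}

# Example:
#   docker service ls -q | DRY_RUN=1 rebalance_services   # preview only
#   docker service ls -q | rebalance_services             # actually rebalance
```

Updating one service at a time with --detach=false waits for each update to converge before the next one starts.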

Add a new VM to the cluster

Production tips

Optimal Topology

PLACE IMAGE HERE

OOM

Tricks

Customizations

  1. Edit the relevant compose file for your specific cluster configuration
  2. Run create.sh to update the modified services

docker-compose files

TODO

Volume management

Logs aggregation

Metrics Monitoring

Cloud provider tips

Digital Ocean

apt-get install letsencrypt
certbot certonly --manual --preferred-challenges=dns --email=me@me.com --server https://acme-v02.api.letsencrypt.org/directory --agree-tos -d '*.poc.me.com'

VMs