mother-of-all-self-hosting / mash-playbook

🐋 Ansible playbook which helps you host various FOSS services as Docker containers on your own server
GNU Affero General Public License v3.0

Consider using smaller container networks to avoid exhausting the networks pool #139

Open spantaleev opened 10 months ago

spantaleev commented 10 months ago

Ref: https://straz.to/2021-09-08-docker-address-pools/

This is a nice post. It's a bit intrusive for the playbook to be changing the default address pool.

Although.. I guess that if the playbook is managing your Docker installation (which is optional and some may decide to turn it off), it may as well do some reconfiguration as it sees fit.

Then the question is.. how do we do this nicely? It seems like the ansible-role-docker role we're currently using has some docker_daemon_options variable, which influences /etc/docker/daemon.json.

So.. we may be able to set some options.

I suppose networks that had already been created will not be affected.. they'd need to be recreated to become small.

Or worse yet.. the new Docker pool definition (with the tiny networks) may be in conflict with the existing large ones.. and we'd need to force-delete them first and recreate them.

There are many things to figure out, but.. it seems like it's a possibility we should research.

QEDeD commented 10 months ago

I'm definitely a fan of making the networks and scopes more elegant and configurable.

QEDeD commented 1 month ago

As I ran out of subnets, I got around to looking at how this can be handled. By default, Docker seems to use the private ranges 192.168.0.0/16 and 172.16.0.0/12, but not 10.0.0.0/8. They are subnetted into /16 networks for the 172.16 range and /20 networks for 192.168, resulting in 16 possible subnets for each range and a total of 32, which is not particularly impressive.
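The subnet count above can be reproduced with a short sketch. It assumes the pool layout exactly as described in this comment (172.16.0.0/12 cut into /16 blocks, 192.168.0.0/16 cut into /20 blocks); it does not read anything from Docker, and the built-in defaults of a particular Docker version may differ slightly. The helper name `list_subnets` is made up for illustration.

```python
import ipaddress

# Pool layout as described in the comment above (an assumption, not queried from Docker).
described_pools = [("172.16.0.0/12", 16), ("192.168.0.0/16", 20)]

def list_subnets(base, size):
    """Split the base range into consecutive networks with the given prefix length."""
    return list(ipaddress.ip_network(base).subnets(new_prefix=size))

total = 0
for base, size in described_pools:
    blocks = list_subnets(base, size)
    total += len(blocks)
    print(f"{base}: {len(blocks)} networks, first {blocks[0]}, last {blocks[-1]}")

print("total:", total)  # 16 + 16 = 32
```

The /20 blocks it lists for 192.168.0.0/16 (192.168.0.0/20, 192.168.16.0/20, and so on) are the same ones that show up in the utilization listing further down.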

The total number of available subnets/networks can easily be increased by reducing the size of each created subnet. Each network only needs a few IPs. In the context of MASH, the traefik network is probably the one with the highest requirement, as it's connected to every network. But even with all of the current 74 MASH services, a couple of Matrix services and a few duplicates, I believe a /25 (128 IPs, 125 available for services including traefik) should be more than sufficient.
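As a quick check of the /25 figure, here is a minimal sketch. It assumes that three addresses per network are not usable by containers (the network address, the broadcast address, and one gateway address), which is what the "125 available" number implies. The function name `usable_for_containers` is hypothetical.

```python
import ipaddress

def usable_for_containers(subnet):
    """Addresses left for containers, assuming network, broadcast and gateway are reserved."""
    net = ipaddress.ip_network(subnet)
    return net.num_addresses - 3

print(usable_for_containers("172.16.0.0/25"))   # 128 - 3 = 125
print(usable_for_containers("192.168.0.0/20"))  # 4096 - 3 = 4093
```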

For a concrete example of utilization - calculated by ChatGPT based on output from the following commands:

docker network ls -q | xargs -n 1 docker network inspect -f '{{.Name}}: {{range .IPAM.Config}}{{.Subnet}}{{end}}'
for network in $(docker network ls -q); do
  network_name=$(docker network inspect -f '{{.Name}}' "$network")
  container_count=$(docker network inspect -f '{{len .Containers}}' "$network")
  echo "Network: $network_name, Containers: $container_count"
done

Utilization

Network: authentik-keydb, Subnet: 192.168.240.0/20, Used IPs: 1, Total IPs: 4096, Utilization: 0.02%
Network: bridge, Subnet: 172.17.0.0/16, Used IPs: 1, Total IPs: 65536, Utilization: 0.00%
Network: mash-adguard-home, Subnet: 192.168.80.0/20, Used IPs: 1, Total IPs: 4096, Utilization: 0.02%
Network: mash-collabora-online, Subnet: 192.168.0.0/20, Used IPs: 1, Total IPs: 4096, Utilization: 0.02%
Network: mash-firezone, Subnet: 192.168.32.0/20, Used IPs: 1, Total IPs: 4096, Utilization: 0.02%
Network: mash-freshrss, Subnet: 192.168.96.0/20, Used IPs: 1, Total IPs: 4096, Utilization: 0.02%
Network: mash-gitea, Subnet: 192.168.48.0/20, Used IPs: 1, Total IPs: 4096, Utilization: 0.02%
Network: mash-hubsite, Subnet: 192.168.112.0/20, Used IPs: 1, Total IPs: 4096, Utilization: 0.02%
Network: mash-miniflux, Subnet: 172.31.0.0/16, Used IPs: 0, Total IPs: 65536, Utilization: 0.00%
Network: mash-nextcloud, Subnet: 192.168.16.0/20, Used IPs: 1, Total IPs: 4096, Utilization: 0.02%
Network: mash-postgres, Subnet: 172.30.0.0/16, Used IPs: 5, Total IPs: 65536, Utilization: 0.01%
Network: mash-stirling-pdf, Subnet: 192.168.208.0/20, Used IPs: 1, Total IPs: 4096, Utilization: 0.02%
Network: matrix, Subnet: 172.18.0.0/16, Used IPs: 0, Total IPs: 65536, Utilization: 0.00%
Network: matrix-addons, Subnet: 192.168.160.0/20, Used IPs: 5, Total IPs: 4096, Utilization: 0.12%
Network: matrix-client-element, Subnet: 172.24.0.0/16, Used IPs: 0, Total IPs: 65536, Utilization: 0.00%
Network: matrix-container-socket-proxy, Subnet: 172.25.0.0/16, Used IPs: 2, Total IPs: 65536, Utilization: 0.00%
Network: matrix-coturn, Subnet: 172.19.0.0/16, Used IPs: 1, Total IPs: 65536, Utilization: 0.00%
Network: matrix-exim-relay, Subnet: 192.168.128.0/20, Used IPs: 2, Total IPs: 4096, Utilization: 0.05%
Network: matrix-grafana, Subnet: 172.23.0.0/16, Used IPs: 0, Total IPs: 65536, Utilization: 0.00%
Network: matrix-homeserver, Subnet: 192.168.176.0/20, Used IPs: 3, Total IPs: 4096, Utilization: 0.07%
Network: matrix-monitoring, Subnet: 192.168.192.0/20, Used IPs: 4, Total IPs: 4096, Utilization: 0.10%
Network: matrix-postgres, Subnet: 192.168.144.0/20, Used IPs: 6, Total IPs: 4096, Utilization: 0.15%
Network: matrix-prometheus, Subnet: 172.29.0.0/16, Used IPs: 0, Total IPs: 65536, Utilization: 0.00%
Network: matrix-redis, Subnet: 172.27.0.0/16, Used IPs: 0, Total IPs: 65536, Utilization: 0.00%
Network: matrix-sliding-sync, Subnet: 192.168.64.0/20, Used IPs: 0, Total IPs: 4096, Utilization: 0.00%
Network: matrix-synapse-admin, Subnet: 172.22.0.0/16, Used IPs: 0, Total IPs: 65536, Utilization: 0.00%
Network: traefik, Subnet: 172.26.0.0/16, Used IPs: 20, Total IPs: 65536, Utilization: 0.03%
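The percentages in this listing were produced by ChatGPT, but they are simple to recompute. The following is a rough sketch, not the script that generated the listing: it takes a subnet and a container count (both copied by hand from two of the rows above) and derives the total and the utilization. The function name `utilization` is made up for illustration.

```python
import ipaddress

def utilization(subnet, used_ips):
    """Return (total addresses in subnet, percentage used rounded to 2 decimals)."""
    total = ipaddress.ip_network(subnet).num_addresses
    return total, round(100 * used_ips / total, 2)

# Two rows copied from the listing above.
rows = [
    ("matrix-postgres", "192.168.144.0/20", 6),
    ("traefik", "172.26.0.0/16", 20),
]

for name, subnet, used in rows:
    total, pct = utilization(subnet, used)
    print(f"Network: {name}, Subnet: {subnet}, Used IPs: {used}, "
          f"Total IPs: {total}, Utilization: {pct:.2f}%")
```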

Anyway, by including the following configuration in the Docker section of the MASH vars.yml file, I was able to resize the networks to /25 and thereby have more subnets/Docker networks available. Be aware that this will restart the Docker daemon and therefore all Docker containers.

mash_playbook_docker_installation_daemon_options_custom: {
  "default-address-pools": [
    {
      "base": "172.16.0.0/12",
      "size": 25
    },
    {
      "base": "192.168.0.0/16",
      "size": 25
    }
  ]
}
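To get a feel for how many networks this configuration allows, here is a small sketch that mirrors the `default-address-pools` structure above as a Python list and counts the /25 blocks per pool. It is plain arithmetic on the prefix lengths and assumes every block in both pools is free to allocate. The function name `networks_per_pool` is hypothetical.

```python
import ipaddress

# Same structure as the "default-address-pools" value in the config above.
pools = [
    {"base": "172.16.0.0/12", "size": 25},
    {"base": "192.168.0.0/16", "size": 25},
]

def networks_per_pool(pool):
    """Number of networks of the requested size that fit into the pool's base range."""
    base = ipaddress.ip_network(pool["base"])
    # Each additional prefix bit doubles the number of blocks.
    return 2 ** (pool["size"] - base.prefixlen)

counts = {p["base"]: networks_per_pool(p) for p in pools}
print(counts)                # {'172.16.0.0/12': 8192, '192.168.0.0/16': 512}
print(sum(counts.values()))  # 8704
```

Compared with the 32 networks available with the layout described earlier in this thread, that is far more than a single-server setup is likely to need.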