openzim / zimfarm

Farm operated by bots to grow and harvest new zim files
https://farm.openzim.org
GNU General Public License v3.0

Proposed Docker Compose setup for workers (with support for relaying traffic through a static IP w/ Wireguard) #488

Open pirate opened 4 years ago

pirate commented 4 years ago

I've been working on a way for my home-based Zimfarm host (which has a dynamic IP) to route its traffic through a VPS with a static IP. It uses WireGuard to tunnel a container's traffic through a WireGuard VPN host on the remote server.

This is the Docker setup on the "client" (zimfarm worker):

docker-compose.yml:

version: '3'

services:
  wireguard:
    image: linuxserver/wireguard
    cap_add:
      - NET_ADMIN
      - SYS_MODULE
    volumes:
      - /lib/modules:/lib/modules
      - ./wg0.conf:/config/wg0.conf:ro

  zimfarm:
    image: ghcr.io/openzim/zimfarm
    network_mode: 'service:wireguard'
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - ./data/zimfarm:/data

wg0.conf:

[Interface]
# Name = myWorkerName.wg.openzim.org
Address = 10.17.17.2/32
PrivateKey = YCW76edD4W7nZrPbWZxPZhcs32CsBLIi1sEhsV/sgk8=
DNS = 1.1.1.1,8.8.8.8

[Peer]
# Name = relay.wg.openzim.org
Endpoint = relay.wg.openzim.org:51820
PublicKey = zJNKewtL3gcHdG62V3GaBkErFtapJWsAx+2um0c0B1s=
AllowedIPs = 10.17.17.1/24,0.0.0.0/0
PersistentKeepalive = 21

(The VPN server side is a very simple, bog-standard WireGuard server, so I'll omit its config here. This issue only concerns the zimfarm client/worker setup.)
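As a quick sanity check on the client config above, here is a minimal Python sketch that parses a wg0.conf-style file and reports whether AllowedIPs contains 0.0.0.0/0, i.e. whether all IPv4 traffic is sent through the tunnel. It is not part of zimfarm or WireGuard; the function name and the placeholder keys are made up for illustration, and it assumes a single [Peer] section.

```python
import configparser
import ipaddress

# Sample config mirroring the one above; the keys are placeholders.
SAMPLE_CONF = """
[Interface]
Address = 10.17.17.2/32
PrivateKey = <client-private-key>
DNS = 1.1.1.1,8.8.8.8

[Peer]
Endpoint = relay.wg.openzim.org:51820
PublicKey = <relay-public-key>
AllowedIPs = 10.17.17.1/24,0.0.0.0/0
PersistentKeepalive = 21
"""


def tunnels_all_ipv4(conf_text: str) -> bool:
    """Return True if the [Peer] AllowedIPs include the IPv4 default route."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep WireGuard's CamelCase key names
    parser.read_string(conf_text)
    entries = parser["Peer"]["AllowedIPs"].split(",")
    # strict=False tolerates entries with host bits set, e.g. 10.17.17.1/24
    networks = [ipaddress.ip_network(e.strip(), strict=False) for e in entries]
    return ipaddress.ip_network("0.0.0.0/0") in networks


if __name__ == "__main__":
    print(tunnels_all_ipv4(SAMPLE_CONF))
```

If AllowedIPs listed only the 10.17.17.0/24 subnet, the check would return False, meaning only traffic to the VPN subnet itself would go through the relay.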

This is the easiest way to run a container's internet traffic through WireGuard. There are other, more difficult ways that involve modifying iptables on the host, which I'd rather not do on my machine. They are also harder here because the containers are spawned dynamically, so DHCP + wg-dynamic or some other solution must be used to give each container an IP. See https://github.com/pirate/wireguard-docs/blob/master/README.md#Containerization

The issue is that zimfarm doesn't work as a single container. Instead, it takes control of Docker on the host machine to spawn multiple other containers, which makes it more difficult to route all of those containers through WireGuard.
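To illustrate the gap: `network_mode: 'service:wireguard'` in compose only applies to the one service it is set on. A dynamically spawned container would need the equivalent `docker run --network container:<name>` option to share the WireGuard container's network namespace. The Python sketch below only builds such a command line. It is a hypothetical helper, not how zimfarm spawns containers today, and the image name, command, and container name `wireguard` are assumptions (compose usually prefixes container names with the project name).

```python
import shlex


def docker_run_via_wireguard(image, command, wg_container="wireguard"):
    """Build a `docker run` argv that joins wg_container's network namespace."""
    return [
        "docker", "run", "--rm",
        f"--network=container:{wg_container}",  # same effect as network_mode: service:...
        image,
        *command,
    ]


# Hypothetical image and command, for illustration only.
argv = docker_run_via_wireguard("example/scraper", ["scrape", "--help"])
print(shlex.join(argv))
```

Whether the worker can be told to pass such an option to the containers it spawns is exactly the open question here.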

Possible solutions:

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.

pirate commented 3 years ago

How much transfer per month is typical for a zimfarm worker? Would it exceed 1TB/mo?

I ask because I'm going to set up a VPN bounce machine to try and get my zimfarm rack server up again, and am wondering how much to budget for bandwidth.

kelson42 commented 3 years ago

@pirate Good news, but the question is hard, if not impossible, to answer. It depends... but if it ever goes over 1TB, it should not be by much.

rgaudin commented 3 years ago

Looked at the AWS-hosted worker we had and the last bill says:

| Item | Value |
| --- | --- |
| data transfer in per month | 1,019.211 GB |
| first 1 GB of data transferred out per month | 0.949 GB |
| regional data transfer - in/out/between EC2 AZs or using elastic IPs or ELB | 0.116 GB |
| first 10 TB / month data transfer out beyond the global free tier | 269.521 GB |

rgaudin commented 3 years ago

This dates back to August.

pirate commented 3 years ago

OK, I was thinking of using DigitalOcean, which includes 1TB of transfer for free, with $0.02 USD/GB after that. I'm hoping usage is not much higher than that, because then it goes from a $5/mo project to a $20+/mo project. I will keep looking at alternative hosting providers; maybe I can find one with free bandwidth up to 2TB.
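For budgeting, the arithmetic can be sketched in a few lines of Python. This is a rough estimate only: it assumes 1TB means 1000 GB, a $5/mo base price, and the $0.02/GB overage rate quoted above. Which direction of traffic a provider actually bills is not modeled, so `transfer_gb` is simply whatever amount ends up being counted.

```python
def monthly_cost_usd(transfer_gb, base_usd=5.0, free_gb=1000.0, overage_usd_per_gb=0.02):
    """Estimate the monthly bill: base price plus overage beyond the free allowance."""
    billable_gb = max(0.0, transfer_gb - free_gb)
    return base_usd + billable_gb * overage_usd_per_gb


# Inbound plus outbound transfer from the AWS bill quoted earlier in this thread.
aws_total_gb = 1019.211 + 0.949 + 269.521

for gb in (800.0, aws_total_gb, 1750.0):
    print(f"{gb:8.1f} GB -> ${monthly_cost_usd(gb):.2f}/mo")
```

Under these assumptions the bill stays at the base price up to 1000 GB and reaches $20/mo at about 1750 GB.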

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.

pirate commented 3 years ago

Do you know if zimfarm-worker-manager is able to manage spawning other containers (using /var/run/docker.sock) headlessly over months with minimal setup? I.e., can I run it in compose like this:

  zimfarm:
    image: openzim/zimfarm-worker-manager
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock

I'm about to publish a new repo called good-karma-kit (https://github.com/pirate/good-karma-kit) to run on servers with spare CPU/RAM/bandwidth, and I think it could help get you a decent number of people running this.

Ideally I'd like to make it as simple/one-click as possible, but even considering the >1TB bandwidth, CPU, docker.sock access, and fixed IP requirements I bet we can get you a few good zimfarm worker contributors.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.

kelson42 commented 2 years ago

@pirate We have implemented support for dynamic IPs in #659. Would that allow you to give it another try?

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.