AllanChain / blog

Blogging in GitHub issues. Building with Astro.
https://allanchain.github.io/blog/
MIT License

Common Problems Setting up Traefik ACME #192

Open AllanChain opened 2 years ago


Here are the problems I ran into while setting up ACME with Traefik, and how I solved them, including failing to connect to Let's Encrypt and missing NS records. An example Traefik ACME configuration is included at the end.


Background

I'm switching from Nginx Proxy Manager to Traefik for my home server because Nginx Proxy Manager is buggy. The biggest problem is that if a service is down when Nginx starts, Nginx never starts up. Managing rules for services is also inconvenient. So I'm trying out Traefik, the more Docker-native option.

Common Problems

I actually encountered many problems when setting up Traefik and ACME. Here are the problems and my solutions.

Dial 127.0.0.11:53 time out

Traefik can't connect to Let's Encrypt and keeps complaining "Dial 127.0.0.11:53 time out". 127.0.0.11 is Docker's embedded DNS resolver, so the container can't even resolve hostnames. I was confused because the containers I had created before had no problem accessing the Internet. I tried many solutions and found this one the most helpful: reboot. Oh yeah. After all, rebooting fixes 90 percent of user computer problems.

Connection to Let's Encrypt is unstable

After solving the "dial time out" error, I found that the network connection to Let's Encrypt was unstable. I randomly got timeouts and connection resets, yet I had no problem accessing Let's Encrypt from the host. This turned out to be an IPv4 versus IPv6 problem. You can try these on the host:

curl -4v https://acme-v02.api.letsencrypt.org/directory
curl -6v https://acme-v02.api.letsencrypt.org/directory
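
The two curl checks above can also be scripted. Here is a minimal Python sketch, assuming Python 3 is available on the host; `can_connect` is a hypothetical helper name, not part of any tool mentioned here. It only attempts a plain TCP connection per address family and does not perform the TLS handshake, so curl remains the more complete check.

```python
import socket


def can_connect(host, family, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds over `family`."""
    try:
        # Restrict name resolution to one address family (A or AAAA records).
        infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    except socket.gaierror:
        return False  # no usable address of this family for the host
    for fam, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(fam, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return True
        except OSError:
            continue  # timeout, reset, unreachable: try the next address
    return False
```

Calling `can_connect("acme-v02.api.letsencrypt.org", socket.AF_INET)` and then the same with `socket.AF_INET6` roughly corresponds to the `-4` and `-6` runs.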

If IPv6 works fine but IPv4 times out or gets connection resets, you have the same problem I had. To fix this, give the Traefik Docker container IPv6 connectivity and pin the Let's Encrypt hosts to their IPv6 addresses with extra_hosts:

    extra_hosts:
      - "acme-staging-v02.api.letsencrypt.org:2606:4700:60:0:f41b:d4fe:4325:6026"
      - "acme-v02.api.letsencrypt.org:2606:4700:60:0:f53d:5624:85c7:3a2c"
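
These pinned addresses will go stale if the AAAA records for these hostnames ever change. Below is a minimal sketch for regenerating the entries, assuming it runs on a host with working IPv6 name resolution; `resolve_aaaa`, `extra_hosts_entry` and `print_entries` are hypothetical helper names.

```python
import socket


def resolve_aaaa(hostname):
    """Return the sorted, de-duplicated IPv6 addresses the host resolves to."""
    infos = socket.getaddrinfo(hostname, 443, socket.AF_INET6, socket.SOCK_STREAM)
    # For AF_INET6 the sockaddr starts with the address string.
    return sorted({info[4][0] for info in infos})


def extra_hosts_entry(hostname, address):
    """Format one entry in the "hostname:address" form used above."""
    return f"{hostname}:{address}"


def print_entries(hostnames):
    """Print lines that can be pasted under extra_hosts."""
    for name in hostnames:
        for address in resolve_aaaa(name):
            print(f'      - "{extra_hosts_entry(name, address)}"')
```

Running `print_entries(["acme-staging-v02.api.letsencrypt.org", "acme-v02.api.letsencrypt.org"])` prints one line per resolved address.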

Since my ISP constantly changes the IPv6 prefix, providing a fixed CIDR is impossible. Therefore, I chose Docker with IPv6 NAT and created a new network with:

docker network create --ipv6 --subnet fd00:dead:beef::/48 nat6

And added Traefik to this network:

    networks:
      - nat6

networks:
  nat6:
    external: true
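
As a side note on why NAT is involved: the `fd00:dead:beef::/48` subnet used for `nat6` is a unique local address range (inside `fc00::/7`), which is not routable on the public Internet, so outgoing container traffic has to be translated to the host's address. A quick sketch to confirm this with Python's standard `ipaddress` module:

```python
import ipaddress

subnet = ipaddress.ip_network("fd00:dead:beef::/48")
ula = ipaddress.ip_network("fc00::/7")  # unique local addresses (RFC 4193)

# Does the Docker subnet lie entirely inside the ULA block?
print(subnet.subnet_of(ula))
# Is the subnet classified as private, i.e. not globally routable?
print(subnet.is_private)
```

Both checks print `True` for this subnet.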

You are also free to try the official way to enable IPv6 in Docker, described under "Enable IPv6 support" in the Docker documentation.

Host lost IPv6 connectivity

After following docker-ipv6nat's documentation, I found that the host couldn't reach any other IPv6 hosts as soon as I restarted the Docker daemon to enable IPv6. I had to disable IPv6 for Docker and reboot the machine.

The problem can be fixed by adding these lines to /etc/sysctl.conf, as described in the troubleshooting section of docker-ipv6nat's documentation (replace eth0 with the name of your host's network interface):

net.ipv6.conf.eth0.accept_ra = 2
net.ipv6.conf.all.forwarding = 1
net.ipv6.conf.default.forwarding = 1
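
To confirm the settings took effect (after running `sysctl -p` or rebooting), the current values can be read back from /proc/sys. This is a small sketch assuming a Linux host; `sysctl_path`, `read_sysctl` and `check` are hypothetical helpers, and the simple dot-to-slash mapping would not work for interface names that themselves contain dots.

```python
from pathlib import Path


def sysctl_path(key):
    """Map a sysctl key such as net.ipv6.conf.all.forwarding to its /proc/sys file."""
    return Path("/proc/sys") / key.replace(".", "/")


def read_sysctl(key):
    """Return the current value of a sysctl key as a string."""
    return sysctl_path(key).read_text().strip()


def check(expected):
    """Print each key with its current and expected value."""
    for key, want in expected.items():
        got = read_sysctl(key)
        status = "ok" if got == want else "MISMATCH"
        print(f"{key} = {got} (expected {want}): {status}")
```

For the three lines above, that would be `check({"net.ipv6.conf.eth0.accept_ra": "2", "net.ipv6.conf.all.forwarding": "1", "net.ipv6.conf.default.forwarding": "1"})`.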

Could not determine authoritative nameservers

I could finally connect to Let's Encrypt without issue, but there was another problem. Traefik complained:

could not determine authoritative nameservers

That's strange. I ran dig on my domain and found that the NS query gets no answer; there is only an SOA record in the authority section.

dig NS my.doma.in
;; QUESTION SECTION:
;my.doma.in.                    IN      NS

;; AUTHORITY SECTION:
my.doma.in.             180     IN      SOA    PROVIDER INFORMATION
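
If you want to script this check, here is a rough sketch that groups the records in dig's default output by section. It assumes the output has been captured as text; `dig_sections` and `record_types` are hypothetical helper names, and the parsing is deliberately simplistic.

```python
def dig_sections(dig_output):
    """Group the record lines of dig's default output by section name."""
    sections = {}
    current = None
    for raw in dig_output.splitlines():
        line = raw.strip()
        if line.startswith(";;") and line.endswith(" SECTION:"):
            current = line[2:-len(" SECTION:")].strip()  # e.g. "AUTHORITY"
            sections[current] = []
        elif not line:
            current = None  # a blank line ends the section
        elif current is not None and not line.startswith(";"):
            sections[current].append(line.split())
    return sections


def record_types(sections, name):
    """Return the record types (NS, SOA, ...) found in one section."""
    # Fields are: owner, TTL, class, type, data...
    return [fields[3] for fields in sections.get(name, []) if len(fields) > 3]
```

For the output above, `record_types(..., "ANSWER")` is empty and `record_types(..., "AUTHORITY")` contains only `SOA`.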

There might be a problem on my DNS provider's side, but lego (the ACME library that Traefik uses) also fails to recognize the SOA record. The workaround is to disable the DNS propagation check that runs before notifying Let's Encrypt that we're ready:

    command:
      - "--certificatesresolvers.certls.acme.dnschallenge.DisablePropagationCheck=true"

Final Compose File

version: '3.7'

services:
  traefik:
    image: traefik:latest
    container_name: traefik
    command:
      # - "--log.level=DEBUG"
      - "--api.dashboard=true"
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--providers.docker.httpClientTimeout=300"
      - "--entrypoints.web-secure.address=:443"
      # Skip TLS certificate verification when Traefik talks to backend servers
      - "--serverstransport.insecureskipverify=true"
      - "--certificatesresolvers.certls.acme.dnschallenge=true"
      - "--certificatesresolvers.certls.acme.dnschallenge.provider=YOUR_PROVIDER"
      - "--certificatesresolvers.certls.acme.dnschallenge.delaybeforecheck=100"
      - "--certificatesresolvers.certls.acme.dnschallenge.DisablePropagationCheck=true"
      # Use staging server to avoid hitting rate limit
      # - "--certificatesresolvers.certls.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory"
      - "--certificatesresolvers.certls.acme.email=${ACME_EMAIL}"
      - "--certificatesresolvers.certls.acme.storage=acme.json"
    environment:
      # Your DNS provider secrets
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.api.rule=Host(`traefik.${DOMAIN}`)"
      - "traefik.http.routers.api.entrypoints=web-secure"
      - "traefik.http.routers.api.tls.certresolver=certls"
      - "traefik.http.routers.api.tls.domains[0].main=*.${DOMAIN}"
      - "traefik.http.routers.api.service=api@internal"
    ports:
      - 443:443
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /path/to/acme.json:/acme.json
    restart: always
    networks:
      - nat6
    extra_hosts:
      - "acme-staging-v02.api.letsencrypt.org:2606:4700:60:0:f41b:d4fe:4325:6026"
      - "acme-v02.api.letsencrypt.org:2606:4700:60:0:f53d:5624:85c7:3a2c"

networks:
  nat6:
    external: true
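
One thing this compose file relies on: /path/to/acme.json has to exist as a file on the host before the container starts, because Docker creates a directory for a missing bind-mount source, and Traefik rejects an ACME storage file whose permissions are more open than 600. Here is a minimal sketch for preparing the file; `ensure_acme_storage` is a hypothetical helper name.

```python
import os
import stat


def ensure_acme_storage(path):
    """Create `path` as an empty file if missing and restrict it to mode 600."""
    # O_CREAT without O_TRUNC leaves an existing file's contents untouched.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    os.close(fd)
    # The mode given to os.open is subject to the umask and is ignored for
    # files that already exist, so set it explicitly.
    os.chmod(path, 0o600)
    return stat.S_IMODE(os.stat(path).st_mode)
```

Run `ensure_acme_storage("/path/to/acme.json")` once, with the real path substituted, before bringing the stack up.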