AdrienPoupa / docker-compose-nas

Simple Docker Compose NAS featuring Sonarr, Radarr, Prowlarr, Jellyfin, qBittorrent, PIA VPN and Traefik with SSL support

LE certificates error "could not find the start of authority" #27

Closed lucyannofrota closed 1 year ago

lucyannofrota commented 1 year ago

I followed the configuration suggested in the repository, except for the VPN. Everything seems to be working as expected, but I cannot get the SSL certificates to work.

I'm using a Cloudflare domain and Cloudflare DNS.

I'm getting this error in the traefik container:

level=debug msg="legolog: [INFO] [MySubDomain.MyDomain.com] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/9999999999"
level=debug msg="legolog: [INFO] [MySubDomain.MyDomain.com] acme: Could not find solver for: tls-alpn-01"
level=debug msg="legolog: [INFO] [MySubDomain.MyDomain.com] acme: Could not find solver for: http-01"
level=debug msg="legolog: [INFO] [MySubDomain.MyDomain.com] acme: use dns-01 solver"
level=debug msg="legolog: [INFO] [MySubDomain.MyDomain.com] acme: Preparing to solve DNS-01"
level=debug msg="legolog: [INFO] [MySubDomain.MyDomain.com] acme: Cleaning DNS-01 challenge"
level=debug msg="legolog: [WARN] [MySubDomain.MyDomain.com] acme: cleaning up failed: cloudflare: could not find the start of authority for _acme-challenge.MySubDomain.MyDomain.com.: NOERROR "
level=debug msg="legolog: [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/9999999999"
level=debug msg="Serving default certificate for request: \"localhost\""
level=error msg="Unable to obtain ACME certificate for domains \"MySubDomain.MyDomain.com\": unable to generate a certificate for the domains [MySubDomain.MyDomain.com]: error: one or more domains had a problem:\n[MySubDomain.MyDomain.com] [MySubDomain.MyDomain.com] acme: error presenting token: cloudflare: could not find the start of authority for _acme-challenge.MySubDomain.MyDomain.com.: NOERROR\n" routerName=sonarr@docker rule="(Host(`MySubDomain.MyDomain.com`) && PathPrefix(`/sonarr`))" providerName=myresolver.acme ACME CA="https://acme-staging-v02.api.letsencrypt.org/directory"

Here are the changes that I've made in the docker compose file:

version: "3.9"
services:
  traefik:
    command:
      - --log.level=DEBUG
  qbittorrent:
    # network_mode: "service:vpn"
    # depends_on:
    #   vpn:
    #     condition: service_healthy
    labels:
      - homepage.widget.url=http://qbittorrent:8080
  # vpn:
  jellyfin:
    # devices:
    #   - /dev/dri/renderD128:/dev/dri/renderD128
    #   - /dev/dri/card0:/dev/dri/card0

AdrienPoupa commented 1 year ago

Hi, I see you are querying the staging Let's Encrypt server. Is that intended?

Otherwise, Googling the issue does not yield many good results: some people say it comes from the network or is a modem MTU issue, and others say you need to delete and re-add the DNS records: https://community.traefik.io/t/could-not-find-the-start-of-authority-acme-dns/13978

lucyannofrota commented 1 year ago

The staging server is intentional, for testing purposes, and I assume it should be able to issue the certificate just like the prod server.

I think it is probably a problem with the DNS, either the API tokens or the records. Can you share the DNS records and the API token configuration that you used to get your certificates working? (Without sensitive information, of course!)

Token name          Permissions      Resources     Status
DNS API EDIT        Zone.DNS         1 Zone        Active
ZONE API READ       Zone.Zone        All zones     Active

Any tips or suggestions on how to debug this are welcome.
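For reference, a minimal sketch of how two tokens like the ones listed above are usually handed to the Traefik container when the Cloudflare DNS provider is used. `CF_DNS_API_TOKEN` and `CF_ZONE_API_TOKEN` are the variable names read by the lego Cloudflare provider; the `${...}` names on the right-hand side are hypothetical `.env` entries and may not match what this repository actually uses.

```yaml
services:
  traefik:
    environment:
      # Token with Zone.DNS edit permission ("DNS API EDIT" above).
      # ${CLOUDFLARE_DNS_API_TOKEN} is a hypothetical .env variable name.
      - CF_DNS_API_TOKEN=${CLOUDFLARE_DNS_API_TOKEN}
      # Token with Zone.Zone read permission ("ZONE API READ" above).
      # ${CLOUDFLARE_ZONE_API_TOKEN} is a hypothetical .env variable name.
      - CF_ZONE_API_TOKEN=${CLOUDFLARE_ZONE_API_TOKEN}
```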

AdrienPoupa commented 1 year ago

Where do you want to access your server from? I don't have any CNAMEs, just an A record for the subdomain pointing to my internal private IP. If the config is pointing to a CNAME record, that could be the issue.

My token looks similar. Anyway, if it were a token permission issue, I think you would get a different error.


lucyannofrota commented 1 year ago

First of all, thanks for your attention @AdrienPoupa. I really appreciate your effort.

I solved the problem by adding the following line:

version: "3.9"
services:
  traefik:
    command:
      - --certificatesresolvers.myresolver.acme.dnschallenge.resolvers=1.1.1.1:53,8.8.8.8:53

The problem seems to be in resolving the authority (the SOA record) for the FQDN; see the Traefik documentation for the dnschallenge.resolvers option.
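For anyone landing here later, a rough sketch of how the resolvers flag sits next to the other DNS-01 options. The resolver name `myresolver`, the `cloudflare` provider and the staging CA URL are taken from the logs earlier in this thread; the exact set of flags in the repository's own compose file is an assumption and may differ.

```yaml
services:
  traefik:
    command:
      # Solve the ACME challenge over DNS-01 using the Cloudflare provider
      - --certificatesresolvers.myresolver.acme.dnschallenge=true
      - --certificatesresolvers.myresolver.acme.dnschallenge.provider=cloudflare
      # Send the zone (SOA) lookup and propagation check to public resolvers
      # instead of whatever resolver the container inherits
      - --certificatesresolvers.myresolver.acme.dnschallenge.resolvers=1.1.1.1:53,8.8.8.8:53
      # Staging CA while testing; drop this line to request production certificates
      - --certificatesresolvers.myresolver.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory
```

With only the resolvers line added to an otherwise working setup, the rest of the configuration stays untouched, which matches the one-line fix above.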

Here are a few related issues that helped me to get to the solution for anyone who wants to go deeper:

https://letsdebug.net/ -> A great tool to see if your requests are right

It might be a good idea to add --certificatesresolvers.myresolver.acme.dnschallenge.resolvers=1.1.1.1:53,8.8.8.8:53 to the main branch. It avoids DNS resolution problems by pointing the challenge at the Cloudflare (1.1.1.1) and Google (8.8.8.8) resolvers.

AdrienPoupa commented 1 year ago

Turns out it's always either a cache issue or a DNS issue ;)

Great investigation, thanks for reporting your findings. I have updated the configuration with your suggestion.