traefik / traefik

The Cloud Native Application Proxy
https://traefik.io
MIT License

Add more DNS checks before requesting Let's encrypt cert to avoid blacklisting #2775

Open deimosfr opened 6 years ago

deimosfr commented 6 years ago

Do you want to request a feature or report a bug?

feature

What did you do?

I configured Traefik on Kubernetes with Let's Encrypt on top of Consul, using this Helm chart (https://github.com/MySocialApp/kubernetes-helm-chart-traefik).

Everything worked well for several days with 5 subdomains. A few days later, I added a new subdomain as usual, and it failed because the TXT record checked by Let's Encrypt was missing:

Error getting ACME certificates [my.fqdn.com] : cannot obtain certificates map[my.fqdn.com:acme: Error 403 - urn:acme:error:unauthorized - No TXT record found at _acme-challenge.my.fqdn.com

The issue lasted about 7 hours. I don't think it came from Traefik (unless there is no check after the TXT record is added; sorry, I didn't have time to check the code), because I had no such issue in previous tests or with the certificates that are already registered.

In addition, because Let's Encrypt was unable to complete the check, my domain was rate-limited (effectively blacklisted):

Error getting ACME certificates [my.fqdn.com] : cannot obtain certificates map[my.fqdn.com:acme: Error 429 - urn:acme:error:rateLimited - Error creating new authz :: too many failed authorizations recently: see https://letsencrypt.org/docs/rate-limits/

I don't know which DNS servers Let's Encrypt uses to check TXT records. However, I strongly suggest performing several DNS checks against different providers (the most common ones) before requesting a new certificate, and holding the request until a majority, or all, of those DNS servers return the record.

Maybe a user-provided list of DNS server addresses would be a good solution, since it lets users decide how critical obtaining the certificate is for them (a longer DNS list meaning more critical).

What did you expect to see?

I expected enough checks to be performed to avoid being blacklisted.

What did you see instead?

I was unable to request SSL certificates for several hours.

Output of traefik version: (What version of Traefik are you using?)

Version:      v1.5.0-rc5
Codename:     cancoillotte
Go version:   go1.9.2
Built:        2018-01-15_03:59:03PM
OS/Arch:      linux/amd64

What is your environment & configuration (arguments, toml, provider, platform, ...)?

    checkNewVersion = false
    MaxIdleConnsPerHost = 500
    logLevel = "INFO"
    defaultEntryPoints = ["http", "https"]

    [respondingTimeouts]
    idleTimeout = "180s"
    writeTimeout = "60s"
    readTimeout = "60s"

    [retry]
    attempts = 3

    [web]
    address = ":8081"

    [kubernetes]
    endpoint = "kube"

    [consul]
    endpoint = "consul:8500"
    watch = true
    prefix = "traefik"

    [acme]
    email = "myemail"
    storage = "traefik/acme/account"
    entryPoint = "https"
    OnHostRule = true
    acmeLogging = true
    [acme.dnsChallenge]
    provider = "cloudflare"
    delayBeforeCheck = 20

    [[acme.domains]]
    main = "fqdn.com"

    [entryPoints]
      [entryPoints.http]
      address = ":80"
      compress = true
        [entryPoints.http.redirect]
        entryPoint = "https"
      [entryPoints.https]
      address = ":443"
        [entryPoints.https.tls]

mmatur commented 6 years ago

@deimosfr Thanks for your interest in Træfik.

We think this is more related to xenolf/lego than to Træfik.