NginxProxyManager / nginx-proxy-manager

Docker container for managing Nginx proxy hosts with a simple, powerful interface
https://nginxproxymanager.com
MIT License

Let's Encrypt HTTP challenge renewal fails with timeout #1549

Closed mwip closed 3 years ago

mwip commented 3 years ago

Describe the bug For about two months, certbot renewal of Let's Encrypt certificates has been failing. This has persisted across several versions of NPM, and none of the fixes from existing issues, such as fixing DNS inside Docker, have solved it.

Nginx Proxy Manager Version 2.9.10

To Reproduce Steps to reproduce the behavior:

  1. Start docker
  2. Wait for the renewal of soon-to-expire certificates
  3. Check docker logs
  4. Errors out with:

    [11/2/2021] [7:07:02 PM] [Migrate  ] › ℹ  info      Current database version: 20210423103500
    [11/2/2021] [7:07:02 PM] [Setup    ] › ℹ  info      Logrotate Timer initialized
    [11/2/2021] [7:07:02 PM] [Setup    ] › ℹ  info      Logrotate completed.
    [11/2/2021] [7:07:02 PM] [IP Ranges] › ℹ  info      Fetching IP Ranges from online services...
    [11/2/2021] [7:07:02 PM] [IP Ranges] › ℹ  info      Fetching https://ip-ranges.amazonaws.com/ip-ranges.json
    [11/2/2021] [7:07:06 PM] [IP Ranges] › ℹ  info      Fetching https://www.cloudflare.com/ips-v4
    [11/2/2021] [7:07:11 PM] [IP Ranges] › ℹ  info      Fetching https://www.cloudflare.com/ips-v6
    [11/2/2021] [7:07:15 PM] [SSL      ] › ℹ  info      Let's Encrypt Renewal Timer initialized
    [11/2/2021] [7:07:15 PM] [SSL      ] › ℹ  info      Renewing SSL certs close to expiry...
    [11/2/2021] [7:07:15 PM] [IP Ranges] › ℹ  info      IP Ranges Renewal Timer initialized
    [11/2/2021] [7:07:15 PM] [Global   ] › ℹ  info      Backend PID 249 listening on port 3000 ...
    `QueryBuilder#allowEager` method is deprecated. You should use `allowGraph` instead. `allowEager` method will be removed in 3.0
    `QueryBuilder#eager` method is deprecated. You should use the `withGraphFetched` method instead. `eager` method will be removed in 3.0
    QueryBuilder#omit is deprecated. This method will be removed in version 3.0
    Model#$omit is deprected and will be removed in 3.0.
    [11/2/2021] [7:09:05 PM] [SSL      ] › ✖  error     Error: Command failed: certbot renew --non-interactive --quiet --config "/etc/letsencrypt.ini" --preferred-challenges "dns,http" --disable-hook-validation
    Failed to renew certificate npm-4 with error: Some challenges have failed.
    Failed to renew certificate npm-5 with error: Some challenges have failed.
    All renewals failed. The following certificates could not be renewed:
    /etc/letsencrypt/live/npm-4/fullchain.pem (failure)
    /etc/letsencrypt/live/npm-5/fullchain.pem (failure)
    2 renew failure(s), 0 parse failure(s)
    
    at ChildProcess.exithandler (node:child_process:397:12)
    at ChildProcess.emit (node:events:390:28)
    at maybeClose (node:internal/child_process:1064:16)
    at Process.ChildProcess._handle.onexit (node:internal/child_process:301:5)

Expected behavior Certbot will automatically renew expiring certificates

Operating System Linux 5.13.0-arm64 #1 SMP PREEMPT Debian 5.13.15-202109101456~buster (2021-09-10) aarch64 GNU/Linux

Additional context I use the following ports because my NPM is installed alongside NextCloudPi, which by default occupies the standard HTTP(S) ports 80/443. Externally, ports 443 and 80 are forwarded to 40443 and 8080, respectively. However, this was not a problem earlier (prior to ~August/September 2021).

version: '3'
services:
  app:
    image: 'jc21/nginx-proxy-manager:latest'
    restart: always
    ports:
      - '8080:80'
      - '8081:81'
      - '40443:443'
...

chaptergy commented 3 years ago

Unfortunately, certbot does not output much information on the command line. Have a look at https://github.com/jc21/nginx-proxy-manager/issues/1271#user-content-certificate-error and tell us what the letsencrypt logs say.

mwip commented 3 years ago

Thanks so much for the hint and sorry for missing this. Checking the letsencrypt logs revealed that the renewal fails due to the DNS challenge being invalid. That is reasonable, since I never set it up anyway. So I found https://letsencrypt.org/docs/challenge-types/ which leads me to believe that, for my use case, a simple HTTP-01 challenge is sufficient.

Maybe you could help me with the following questions:
• Is there any particular reason why NPM chooses the DNS challenge by default?
• Can I alter existing certificates to only use HTTP-01 challenges?
• Is there any immediate security implication of using HTTP-01 challenges only that I am missing?

Also, I think the following improvements could be added to NPM (probably deserving their own issues):
• Add an info box on the toggle Use DNS Challenge linking to https://letsencrypt.org/docs/challenge-types/
• It is possible to generate a certificate on the fly when setting up a new proxy host. There is no mention of the DNS challenge there. There should be a toggle as well, right?

chaptergy commented 3 years ago

Okay, that's weird. About your questions:

I'm not really sure what the issue could be. Could you provide us with the relevant part of the letsencrypt log and the renewal config? Replace any sensitive information with placeholders of course.

mwip commented 3 years ago

Thanks so much for your help!

So I watched the logs while (successfully) creating a new certificate and deliberately not activating the DNS challenge. This triggered the following log in the docker app:

[11/3/2021] [8:05:18 PM] [SSL      ] › ℹ  info      Command: certbot certonly --non-interactive --config "/etc/letsencrypt.ini" --cert-name "npm-8" --agree-tos --authenticator webroot --email "dummy@blanked.com" --preferred-challenges "dns,http" --domains "test.my-domain.tld"

Is it normal that certbot will include --preferred-challenges "dns,http" even though the DNS challenge was not ticked in NPM? I also tried to renew this newly created certificate and everything worked fine: no Let's Encrypt logs, just a success log for NPM. Does that maybe mean that I am just better off replacing all certificates?

Concerning previous certificates, please find the logs below. I hope I did not blank out anything relevant. The part that tripped me up is that challenges contains { "type": "dns-01", "status": "pending", "url": "https://acme-v02.api.letsencrypt.org/acme/chall-v3/asdf", "token": "asdf" }. Also, in /etc/letsencrypt/renewal/npm-5.conf there is pref_challs = dns-01, http-01 while NPM's certificate table just says Let's Encrypt as Certificate Provider.
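
To see which challenge types certbot has recorded for each certificate, the pref_challs entries can be read out of the renewal configs. The following is a minimal Python sketch, assuming the files sit under /etc/letsencrypt/renewal/ inside the container (as the path above suggests) and use plain key = value lines. The function names parse_pref_challs and report are hypothetical and not part of NPM or certbot.

```python
from pathlib import Path


def parse_pref_challs(conf_text: str) -> list[str]:
    """Return the challenge types listed under pref_challs, in order."""
    for line in conf_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "pref_challs":
            # Values are comma-separated, e.g. "dns-01, http-01".
            return [item.strip() for item in value.split(",") if item.strip()]
    return []


def report(renewal_dir: str = "/etc/letsencrypt/renewal") -> None:
    """Print the recorded challenge preferences for every renewal config."""
    # Assumption: renewal_dir matches the directory used in the container.
    for conf in sorted(Path(renewal_dir).glob("*.conf")):
        print(conf.name, parse_pref_challs(conf.read_text()))
```

Calling report() inside the container would print one line per certificate. Note that the newly created, working certificate was also requested with "dns,http", so this value alone may not explain the failure.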

What do you mean by

But if you don't want to expose your npm instance publicly, [...]?

NPM and the sites it proxies are exposed via ports 80 and 443 (external), which are forwarded to 8080 and 40443 internally, but that should not matter, right? Port 81 (NPM interface) is not exposed externally.

Logs

2021-11-03 20:11:55,694:DEBUG:certbot._internal.plugins.selection:Requested authenticator webroot and installer None
2021-11-03 20:11:55,711:DEBUG:certbot._internal.plugins.selection:Single candidate plugin: * webroot
Description: Place files in webroot directory
Interfaces: Authenticator, Plugin
Entry point: webroot = certbot._internal.plugins.webroot:Authenticator
Initialized: <certbot._internal.plugins.webroot.Authenticator object at 0xffffbc593e48>
Prep: True
2021-11-03 20:11:55,712:DEBUG:certbot._internal.plugins.selection:Selected authenticator <certbot._internal.plugins.webroot.Authenticator object at 0xffffbc593e48> and installer None
2021-11-03 20:11:55,713:INFO:certbot._internal.plugins.selection:Plugins selected: Authenticator webroot, Installer None
2021-11-03 20:11:55,749:DEBUG:certbot._internal.main:Picked account: <Account(RegistrationResource(body=Registration(key=None, contact=(), agreement=None, status=None, terms_of_service_agreed=None, only_return_existing=None, external_account_binding=None), uri='https://acme-v02.api.letsencrypt.org/acme/acct/107220101', new_authzr_uri=None, terms_of_service=None), asdf, Meta(creation_dt=datetime.datetime(2020, 12, 23, 12, 6, 43, tzinfo=<UTC>), creation_host='72bcc8d98e39', register_to_eff=None))>
2021-11-03 20:11:55,752:DEBUG:acme.client:Sending GET request to https://acme-v02.api.letsencrypt.org/directory.
2021-11-03 20:11:55,760:DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): acme-v02.api.letsencrypt.org:443
2021-11-03 20:12:00,295:DEBUG:urllib3.connectionpool:https://acme-v02.api.letsencrypt.org:443 "GET /directory HTTP/1.1" 200 658
2021-11-03 20:12:00,297:DEBUG:acme.client:Received response:
HTTP 200
Server: nginx
Date: Wed, 03 Nov 2021 20:12:00 GMT
Content-Type: application/json
Content-Length: 658
Connection: keep-alive
Cache-Control: public, max-age=0, no-cache
X-Frame-Options: DENY
Strict-Transport-Security: max-age=604800

{
  "H0-MQorH-6Q": "https://community.letsencrypt.org/t/adding-random-entries-to-the-directory/33417",
  "keyChange": "https://acme-v02.api.letsencrypt.org/acme/key-change",
  "meta": {
    "caaIdentities": [
      "letsencrypt.org"
    ],
    "termsOfService": "https://letsencrypt.org/documents/LE-SA-v1.2-November-15-2017.pdf",
    "website": "https://letsencrypt.org"
  },
  "newAccount": "https://acme-v02.api.letsencrypt.org/acme/new-acct",
  "newNonce": "https://acme-v02.api.letsencrypt.org/acme/new-nonce",
  "newOrder": "https://acme-v02.api.letsencrypt.org/acme/new-order",
  "revokeCert": "https://acme-v02.api.letsencrypt.org/acme/revoke-cert"
}
2021-11-03 20:12:00,301:DEBUG:certbot._internal.display.obj:Notifying user: Renewing an existing certificate for my-domain.tld
2021-11-03 20:12:00,414:DEBUG:certbot.crypto_util:Generating ECDSA key (2048 bits): /etc/letsencrypt/keys/1776_key-certbot.pem
2021-11-03 20:12:00,526:DEBUG:certbot.crypto_util:Creating CSR: /etc/letsencrypt/csr/1776_csr-certbot.pem
2021-11-03 20:12:00,528:DEBUG:acme.client:Requesting fresh nonce
2021-11-03 20:12:00,528:DEBUG:acme.client:Sending HEAD request to https://acme-v02.api.letsencrypt.org/acme/new-nonce.
2021-11-03 20:12:00,693:DEBUG:urllib3.connectionpool:https://acme-v02.api.letsencrypt.org:443 "HEAD /acme/new-nonce HTTP/1.1" 200 0
2021-11-03 20:12:00,695:DEBUG:acme.client:Received response:
HTTP 200
Server: nginx
Date: Wed, 03 Nov 2021 20:12:00 GMT
Connection: keep-alive
Cache-Control: public, max-age=0, no-cache
Link: <https://acme-v02.api.letsencrypt.org/directory>;rel="index"
Replay-Nonce: asdf
X-Frame-Options: DENY
Strict-Transport-Security: max-age=604800

2021-11-03 20:12:00,695:DEBUG:acme.client:Storing nonce: asdf
2021-11-03 20:12:00,697:DEBUG:acme.client:JWS payload:
b'{\n  "identifiers": [\n    {\n      "type": "dns",\n      "value": "my-domain.tld"\n    }\n  ]\n}'
2021-11-03 20:12:00,707:DEBUG:acme.client:Sending POST request to https://acme-v02.api.letsencrypt.org/acme/new-order:
{
  "protected": "asdf",
  "signature": "asdf",
  "payload": "asdf"
}
2021-11-03 20:12:01,036:DEBUG:urllib3.connectionpool:https://acme-v02.api.letsencrypt.org:443 "POST /acme/new-order HTTP/1.1" 201 335
2021-11-03 20:12:01,038:DEBUG:acme.client:Received response:
HTTP 201
Server: nginx
Date: Wed, 03 Nov 2021 20:12:00 GMT
Content-Type: application/json
Content-Length: 335
Connection: keep-alive
Boulder-Requester: 107220101
Cache-Control: public, max-age=0, no-cache
Link: <https://acme-v02.api.letsencrypt.org/directory>;rel="index"
Location: https://acme-v02.api.letsencrypt.org/acme/order/107220101/36894884650
Replay-Nonce: asdf
X-Frame-Options: DENY
Strict-Transport-Security: max-age=604800

{
  "status": "pending",
  "expires": "2021-11-10T20:12:00Z",
  "identifiers": [
    {
      "type": "dns",
      "value": "my-domain.tld"
    }
  ],
  "authorizations": [
    "https://acme-v02.api.letsencrypt.org/acme/authz-v3/asdf"
  ],
  "finalize": "https://acme-v02.api.letsencrypt.org/acme/finalize/asdf"
}
2021-11-03 20:12:01,039:DEBUG:acme.client:Storing nonce: asdf
2021-11-03 20:12:01,040:DEBUG:acme.client:JWS payload:
b''
2021-11-03 20:12:01,048:DEBUG:acme.client:Sending POST request to https://acme-v02.api.letsencrypt.org/acme/authz-v3/45939634400:
{
  "protected": "asdf",
  "signature": "asdf",
  "payload": ""
}
2021-11-03 20:12:01,246:DEBUG:urllib3.connectionpool:https://acme-v02.api.letsencrypt.org:443 "POST /acme/authz-v3/45939634400 HTTP/1.1" 200 793
2021-11-03 20:12:01,248:DEBUG:acme.client:Received response:
HTTP 200
Server: nginx
Date: Wed, 03 Nov 2021 20:12:01 GMT
Content-Type: application/json
Content-Length: 793
Connection: keep-alive
Boulder-Requester: 107220101
Cache-Control: public, max-age=0, no-cache
Link: <https://acme-v02.api.letsencrypt.org/directory>;rel="index"
Replay-Nonce: asdf
X-Frame-Options: DENY
Strict-Transport-Security: max-age=604800

{
  "identifier": {
    "type": "dns",
    "value": "my-domain.tld"
  },
  "status": "pending",
  "expires": "2021-11-10T20:12:00Z",
  "challenges": [
    {
      "type": "http-01",
      "status": "pending",
      "url": "https://acme-v02.api.letsencrypt.org/acme/chall-v3/asdf",
      "token": "asdf"
    },
    {
      "type": "dns-01",
      "status": "pending",
      "url": "https://acme-v02.api.letsencrypt.org/acme/chall-v3/asdf",
      "token": "asdf"
    },
    {
      "type": "tls-alpn-01",
      "status": "pending",
      "url": "https://acme-v02.api.letsencrypt.org/acme/chall-v3/asdf",
      "token": "asdf"
    }
  ]
}

This seems to keep going for a few tries until:

Server: nginx
Date: Wed, 03 Nov 2021 20:12:25 GMT
Content-Type: application/json
Content-Length: 1858
Connection: keep-alive
Boulder-Requester: 107220101
Cache-Control: public, max-age=0, no-cache
Link: <https://acme-v02.api.letsencrypt.org/directory>;rel="index"
Replay-Nonce: asdf
X-Frame-Options: DENY
Strict-Transport-Security: max-age=604800

{
  "identifier": {
    "type": "dns",
    "value": "my-domain.tld"
  },
  "status": "invalid",
  "expires": "2021-11-10T20:12:00Z",
  "challenges": [
    {
      "type": "http-01",
      "status": "invalid",
      "error": {
        "type": "urn:ietf:params:acme:error:connection",
        "detail": "Fetching https://my-domain.tld/.well-known/acme-challenge/asdf: Timeout during connect (likely firewall problem)",
        "status": 400
      },
      "url": "https://acme-v02.api.letsencrypt.org/acme/chall-v3/asdf",
      "token": "asdf",
      "validationRecord": [
        {
          "url": "http://my-domain.tld/.well-known/acme-challenge/asdf",
          "hostname": "my-domain.tld",
          "port": "80",
          "addressesResolved": [
            "some_ipv4_address",
            "some_ipv6_address"
          ],
          "addressUsed": "some_ipv6_address"
        },
        {
          "url": "http://my-domain.tld/.well-known/acme-challenge/asdf",
          "hostname": "my-domain.tld",
          "port": "80",
          "addressesResolved": [
            "some_ipv4_address",
            "some_ipv6_address"
          ],
          "addressUsed": "some_ipv4_address"
        },
        {
          "url": "https://my-domain.tld/.well-known/acme-challenge/asdf",
          "hostname": "my-domain.tld",
          "port": "443",
          "addressesResolved": [
            "some_ipv4_address",
            "some_ipv6_address"
          ],
          "addressUsed": "some_ipv6_address"
        }
      ],
      "validated": "2021-11-03T20:12:01Z"
    }
  ]
}
2021-11-03 20:12:25,204:DEBUG:acme.client:Storing nonce: asdf
2021-11-03 20:12:25,205:INFO:certbot._internal.auth_handler:Challenge failed for domain my-domain.tld
2021-11-03 20:12:25,206:INFO:certbot._internal.auth_handler:http-01 challenge for my-domain.tld
2021-11-03 20:12:25,207:DEBUG:certbot._internal.display.obj:Notifying user:
Certbot failed to authenticate some domains (authenticator: webroot). The Certificate Authority reported these problems:
  Domain: my-domain.tld
  Type:   connection
  Detail: Fetching https://my-domain.tld/.well-known/acme-challenge/asdf: Timeout during connect (likely firewall problem)

Hint: The Certificate Authority failed to download the temporary challenge files created by Certbot. Ensure that the listed domains serve their content from the provided --webroot-path/-w and that files created there can be downloaded from the internet.

2021-11-03 20:12:25,208:DEBUG:certbot._internal.error_handler:Encountered exception:
Traceback (most recent call last):
  File "/opt/certbot/lib/python3.7/site-packages/certbot/_internal/auth_handler.py", line 90, in handle_authorizations
    self._poll_authorizations(authzrs, max_retries, best_effort)
  File "/opt/certbot/lib/python3.7/site-packages/certbot/_internal/auth_handler.py", line 178, in _poll_authorizations
    raise errors.AuthorizationError('Some challenges have failed.')
certbot.errors.AuthorizationError: Some challenges have failed.

2021-11-03 20:12:25,209:DEBUG:certbot._internal.error_handler:Calling registered functions
2021-11-03 20:12:25,209:INFO:certbot._internal.auth_handler:Cleaning up challenges
2021-11-03 20:12:25,210:DEBUG:certbot._internal.plugins.webroot:Removing /data/letsencrypt-acme-challenge/.well-known/acme-challenge/asdf
2021-11-03 20:12:25,211:DEBUG:certbot._internal.plugins.webroot:All challenges cleaned up
2021-11-03 20:12:25,212:ERROR:certbot._internal.renewal:Failed to renew certificate npm-4 with error: Some challenges have failed.
2021-11-03 20:12:25,216:DEBUG:certbot._internal.renewal:Traceback was:
Traceback (most recent call last):
  File "/opt/certbot/lib/python3.7/site-packages/certbot/_internal/renewal.py", line 475, in handle_renewal_request
    main.renew_cert(lineage_config, plugins, renewal_candidate)
  File "/opt/certbot/lib/python3.7/site-packages/certbot/_internal/main.py", line 1386, in renew_cert
    renewed_lineage = _get_and_save_cert(le_client, config, lineage=lineage)
  File "/opt/certbot/lib/python3.7/site-packages/certbot/_internal/main.py", line 122, in _get_and_save_cert
    renewal.renew_cert(config, domains, le_client, lineage)
  File "/opt/certbot/lib/python3.7/site-packages/certbot/_internal/renewal.py", line 335, in renew_cert
    new_cert, new_chain, new_key, _ = le_client.obtain_certificate(domains, new_key)
  File "/opt/certbot/lib/python3.7/site-packages/certbot/_internal/client.py", line 384, in obtain_certificate
    orderr = self._get_order_and_authorizations(csr.data, self.config.allow_subset_of_names)
  File "/opt/certbot/lib/python3.7/site-packages/certbot/_internal/client.py", line 434, in _get_order_and_authorizations
    authzr = self.auth_handler.handle_authorizations(orderr, self.config, best_effort)
  File "/opt/certbot/lib/python3.7/site-packages/certbot/_internal/auth_handler.py", line 90, in handle_authorizations
    self._poll_authorizations(authzrs, max_retries, best_effort)
  File "/opt/certbot/lib/python3.7/site-packages/certbot/_internal/auth_handler.py", line 178, in _poll_authorizations
    raise errors.AuthorizationError('Some challenges have failed.')
certbot.errors.AuthorizationError: Some challenges have failed.
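
The validationRecord array in the failed authorization above lists every request the CA made while checking the challenge, so it shows where the HTTP-01 check ended up. The following is a minimal Python sketch that summarizes those hops, assuming the authorization JSON has been copied out of the log into a string. summarize_hops is a hypothetical helper name, and SAMPLE is a trimmed copy of the logged response with its placeholders kept as-is.

```python
import json


def summarize_hops(authz_json: str) -> list[str]:
    """List each URL/port/address the CA tried for http-01 challenges."""
    authz = json.loads(authz_json)
    hops = []
    for challenge in authz.get("challenges", []):
        if challenge.get("type") != "http-01":
            continue
        for record in challenge.get("validationRecord", []):
            hops.append(
                f'{record["url"]} (port {record["port"]}, via {record["addressUsed"]})'
            )
    return hops


# Trimmed copy of the authorization logged above (placeholders unchanged).
SAMPLE = """
{
  "identifier": {"type": "dns", "value": "my-domain.tld"},
  "status": "invalid",
  "challenges": [
    {
      "type": "http-01",
      "status": "invalid",
      "validationRecord": [
        {"url": "http://my-domain.tld/.well-known/acme-challenge/asdf",
         "port": "80", "addressUsed": "some_ipv6_address"},
        {"url": "http://my-domain.tld/.well-known/acme-challenge/asdf",
         "port": "80", "addressUsed": "some_ipv4_address"},
        {"url": "https://my-domain.tld/.well-known/acme-challenge/asdf",
         "port": "443", "addressUsed": "some_ipv6_address"}
      ]
    }
  ]
}
"""

for hop in summarize_hops(SAMPLE):
    print(hop)
```

For the log above, the last hop is the https URL on port 443, which matches the URL named in the "Timeout during connect" error.
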

mwip commented 3 years ago

OK, I found out why the certificates would not renew: for the proxy hosts in question, I had the option "Force SSL" active. Once I deactivated this option in the SSL tab of the respective hosts, I was able to renew all certificates. ✔️

Does that leave this issue to be closed as "user error" (that could be totally on me, sorry for spamming here...), or would that indicate a deeper problem worth keeping this issue around for?

chaptergy commented 3 years ago

So that means enabling Force SSL caused the certificates to fail to renew?

mwip commented 3 years ago

Right now, I can only state the inverse definitively: disabling Force SSL made renewal possible. I tested it for 3 proxy hosts, and all worked afterwards. I can test enabling Force SSL once more and check whether that really is what caused the hiccup.

mwip commented 3 years ago

I am so confused and embarrassed... I tested whether enabling any of the SSL options Force SSL, HSTS Enabled and HSTS Subdomains would break the renewal process. I activated them one after the other and renewed the certificate every single time with success. At this point I don't know what has caused the incident, I am sorry. But maybe I'll write up the (admittedly super hacky) solution for future reference in a closing comment, if you think it could be helpful to the community @chaptergy. Just LMK, else feel free to close.

chaptergy commented 3 years ago

Hm, so now everything works no matter the state of any of the settings? You can't replicate the issue anymore? Then I'll go ahead and close this issue for now. But feel free to add anything that could be helpful in a comment.

jamestutton commented 2 years ago

Just for reference: I have seen this behavior before. If for some reason the existing certificate has lapsed and Force SSL is on, then yes, renewal will fail, because the check is forced to use a certificate that has expired and so can't complete while the SSL is invalid. Maybe that is the situation you found yourself in.
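
One way to check whether Force SSL is affecting the challenge path is to request it over plain HTTP without following redirects and look at the status and Location header. The following is a minimal Python sketch using only the standard library. The host name and token are placeholders, and classify and probe are hypothetical helpers, not anything from NPM or certbot.

```python
import http.client
from typing import Optional, Tuple


def classify(status: int, location: Optional[str]) -> str:
    """Describe what a response on the challenge path means for HTTP-01."""
    if status in (301, 302, 307, 308) and (location or "").startswith("https://"):
        return "redirected to HTTPS; the check continues on port 443"
    if status in (200, 404):
        # 404 is expected here because the token is made up.
        return "answered over plain HTTP"
    return f"unexpected status {status}"


def probe(host: str, token: str = "test", timeout: float = 10.0) -> Tuple[int, Optional[str]]:
    """GET the challenge URL on port 80; http.client does not follow redirects."""
    conn = http.client.HTTPConnection(host, 80, timeout=timeout)
    try:
        conn.request("GET", f"/.well-known/acme-challenge/{token}")
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


# Example, best run from outside your own network so the request
# takes the same public route as the CA:
# status, location = probe("my-domain.tld")
# print(classify(status, location))
```

If the result is a redirect to HTTPS, the request then depends on port 443 being reachable and answering, which is consistent with the https URL that timed out in the log earlier in this thread.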

vinc32 commented 1 year ago

thx for the hint - disabling Force SSL let me renew all SSL Certs

aszurnasirpal commented 1 year ago

I was affected by the same bug. Disabling the Force SSL option allowed me to renew the cert as well.

0rn0lf commented 1 year ago

Same here. After finding this issue today, I tried disabling "Force SSL", which indeed did the trick for 10 expired certificates. I had kept recreating certificates for over a year without finding the cause.

I also tried one certificate which is still valid until May. With "Force SSL" enabled, renewal didn't work. As soon as I disabled the option, renewal worked.

Nevertheless, it can't be down to expired certificates, because NPM should renew them before they expire.

DorCoMaNdO commented 1 year ago

I had the same issue with the unRaid Docker container up until today. Instead of disabling Force SSL, I added a custom location rule (in the proxy host settings) with the following settings:
• Location: /.well-known/acme-challenge
• Scheme: http
• Forward Hostname/IP: [YourNginxServerIP]/data/letsencrypt-acme-challenge (do not add your Nginx custom port here)
• Forward Port: Your Nginx Port

This resolved the issue where the challenge files generated by the certification process could not be accessed by the remote host. My previous workaround was to disable the proxy host temporarily, generate a new certificate, and then re-enable it. That only had to be done once every 3 months, but it was still nonsensical.

soenkef commented 1 year ago

Ok, I why certificates would not be renewed: for the proxy hosts in question, I had the option "Force SSL" active. Once I deactivated this option in the SSL Tab of the respective hosts, I was able to renew all certificates. ✔️

Does leave this issue to be closed for "user error" (that could be totally on me. Sorry for spamming here...), or would that indicate a deeper problem worth keeping this issue around?

This works for me!

Unskilledcrab commented 7 months ago

I've run into this issue twice now and just found this solution. The previous time, I completely removed NPM and re-installed it, thinking there was something wrong with the installation. This saved me from having to repeat the process, thank you!