NginxProxyManager / nginx-proxy-manager

Docker container for managing Nginx proxy hosts with a simple, powerful interface
https://nginxproxymanager.com
MIT License

Certbot failed to delete some certs after revoke #2103

Open rekyuu opened 2 years ago

rekyuu commented 2 years ago

Describe the bug

I recently updated all my locally hosted services to be inaccessible from the internet and to require a VPN. As a result, I updated my NPM configuration to use HTTP only, deleted all my HTTPS certs, and removed all applicable DNS entries.

For some (not all) certs, the certbot revoke --delete-after-revoke command did not seem to take full effect, as the certs remained active. Since the DNS entries had been removed, the hourly certbot renew command then failed. After enough failures (for some reason), my NPM instance locked up and my local services timed out.

I was able to resolve this by going into the container and manually calling certbot delete on the remaining active certificates.

Nginx Proxy Manager Version

v2.9.18

To Reproduce

Steps to reproduce the behavior:

  1. Remove certificates from Proxy Host entries
  2. Delete the certificate from the SSL Certificates page

Expected behavior

The certificates should be removed after being revoked.

Operating System

$ uname -a
Linux raspberrypi 5.10.103-v7l+ #1529 SMP Tue Mar 8 12:24:00 GMT 2022 armv7l GNU/Linux

$ docker -v
Docker version 20.10.16, build aa7e414

$ docker ps
CONTAINER ID   IMAGE                             COMMAND        CREATED          STATUS                       PORTS                                                                                                                                            NAMES
------------   jc21/nginx-proxy-manager:latest   "/init"        12 minutes ago   Up 12 minutes                0.0.0.0:80-81->80-81/tcp, :::80-81->80-81/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp                                                             nginx-proxy-manager-app-1

Additional Context

[6/9/2022] [1:46:56 PM] [SSL      ] › ℹ  info      Let's Encrypt Renewal Timer initialized
[6/9/2022] [1:46:56 PM] [SSL      ] › ℹ  info      Renewing SSL certs close to expiry...
[6/9/2022] [1:46:56 PM] [IP Ranges] › ℹ  info      IP Ranges Renewal Timer initialized
[6/9/2022] [1:46:56 PM] [Global   ] › ℹ  info      Backend PID 244 listening on port 3000 ...
[6/9/2022] [1:49:38 PM] [SSL      ] › ✖  error     Error: Command failed: certbot renew --non-interactive --quiet --config "/etc/letsencrypt.ini" --preferred-challenges "dns,http" --disable-hook-validation  
Failed to renew certificate npm-12 with error: Some challenges have failed.
Failed to renew certificate npm-13 with error: Some challenges have failed.
Failed to renew certificate npm-14 with error: Some challenges have failed.
Failed to renew certificate npm-15 with error: Some challenges have failed.
Failed to renew certificate npm-16 with error: Some challenges have failed.
Failed to renew certificate npm-6 with error: Some challenges have failed.
Failed to renew certificate npm-7 with error: Some challenges have failed.
All renewals failed. The following certificates could not be renewed:
  /etc/letsencrypt/live/npm-12/fullchain.pem (failure)
  /etc/letsencrypt/live/npm-13/fullchain.pem (failure)
  /etc/letsencrypt/live/npm-14/fullchain.pem (failure)
  /etc/letsencrypt/live/npm-15/fullchain.pem (failure)
  /etc/letsencrypt/live/npm-16/fullchain.pem (failure)
  /etc/letsencrypt/live/npm-6/fullchain.pem (failure)
  /etc/letsencrypt/live/npm-7/fullchain.pem (failure)
7 renew failure(s), 0 parse failure(s)

    at ChildProcess.exithandler (node:child_process:399:12)
    at ChildProcess.emit (node:events:526:28)
    at maybeClose (node:internal/child_process:1092:16)
    at Socket.<anonymous> (node:internal/child_process:451:11)
    at Socket.emit (node:events:526:28)
    at Pipe.<anonymous> (node:net:687:12)

rekyuu commented 2 years ago

This is probably an issue with certbot itself, but I wasn't able to find any open issues about it. My suggestion would be to validate the certificates marked as active in the NPM database against what's returned by certbot certificates.
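The suggested check could be sketched roughly as follows. This is only an illustration, not how NPM works: the `db_active` list and the `certbot_output` text are hard-coded sample data standing in for a real database query and a real `certbot certificates` call, and the variable names are made up for this example.

```shell
#!/bin/sh
# Sample data (assumptions for illustration only):
#   db_active      - certificate names the NPM database considers active
#   certbot_output - text in the shape printed by `certbot certificates`
db_active="npm-1
npm-2"

certbot_output="Found the following certs:
  Certificate Name: npm-1
    Domains: a.example.com
  Certificate Name: npm-12
    Domains: b.example.com
  Certificate Name: npm-2
    Domains: c.example.com"

# Pull out the certificate names that certbot still manages.
certbot_names=$(printf '%s\n' "$certbot_output" \
  | sed -n 's/^ *Certificate Name: *//p')

# Names known to certbot but not active in the database are orphans.
# -x matches whole lines only, so npm-1 does not also match npm-12.
# grep exits non-zero when nothing is left over, hence the `|| true`.
orphans=$(printf '%s\n' "$certbot_names" \
  | grep -vxF -e "$db_active" || true)

for name in $orphans; do
  echo "orphan: $name (candidate for: certbot delete --cert-name $name)"
done
```

With the sample data above, only npm-12 is reported, since it is present on the certbot side but not in the active list.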

the1ts commented 2 years ago

It seems that there is a logic issue with the revoking and removal. From my logs:

[6/7/2022] [11:32:33 AM] [SSL ] › ℹ info Command: certbot revoke --config "/etc/letsencrypt.ini" --cert-path "/etc/letsencrypt/live/npm-95/fullchain.pem" --delete-after-revoke ; rm -f '/etc/letsencrypt/credentials/credentials-95' || true

This command means that even if the certbot revoke fails, the rm still runs and the credentials file is removed, which may cause the error seen. Surely the semicolon should be a && so that the delete only happens if the revoke is successful. The result also needs capturing by the backend so that the DB is updated only on success. The || true also means that a good return code is given on failure, which is suspicious and suggests error handling isn't done here.
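The difference between the two operators can be seen in a small sketch. The functions `revoke_fails` and `cleanup` are hypothetical stand-ins for a failing `certbot revoke` and the `rm -f` step; nothing here calls certbot itself.

```shell
#!/bin/sh
# Hypothetical stand-ins: revoke_fails plays a failing `certbot revoke`,
# cleanup plays the `rm -f` step from the logged command.
revoke_fails() { return 1; }
cleanup() { echo "cleanup ran"; }

# Pattern from the log: `a ; b || true`.
# The shell parses this as `a ; (b || true)`, so cleanup always runs
# and the overall exit status is always 0.
if out_semicolon=$( revoke_fails ; cleanup || true ); then
  status_semicolon=0
else
  status_semicolon=$?
fi

# Suggested pattern: `a && b`.
# cleanup runs only if the first command succeeds, and the failure
# status of the first command is passed on to the caller.
if out_and=$( revoke_fails && cleanup ); then
  status_and=0
else
  status_and=$?
fi

echo "semicolon: status=$status_semicolon output='$out_semicolon'"
echo "and:       status=$status_and output='$out_and'"
```

In the first form the caller sees status 0 even though the revoke failed, which matches the concern that the backend cannot tell success from failure.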

RavenLiao commented 2 years ago

I have a similar problem. After deleting a certificate that was still in use by a proxy host from the SSL Certificates page, the proxy site showed HTTP only, but when I tried to change the site's SSL certificate, it showed an internal error. I then rebooted the container; it did not work, and the log showed that nginx couldn't find the old certificate's files. Finally, I copied the new certificate's files and renamed them to match the old certificate's. After rebooting the container again, I was able to change the site's SSL certificate successfully. So the certificate deletion logic fails to update the nginx configuration files.

github-actions[bot] commented 6 months ago

Issue is now considered stale. If you want to keep it open, please comment :+1: