opnsense / plugins

OPNsense plugin collection
https://opnsense.org/
BSD 2-Clause "Simplified" License
846 stars 642 forks source link

acme-client: haproxy still serves old cert after renewal #4203

Open aque opened 2 months ago

aque commented 2 months ago

Important notices Before you add a new report, we ask you kindly to acknowledge the following:

Describe the bug HAProxy continues to serve old Let's Encrypt certificates after it was renewed and updated. ACME Client > Automations has the "Restart HAProxy (OPNsense plugin)" configured and enabled. Restarting HAProxy manually thru the GUI loads the updated certificates.

To Reproduce Steps to reproduce the behavior:

  1. Configure HAProxy to serve Let's Encrypt certificates
  2. Configure ACME Client > Automations to restart HAProxy on renewals
  3. Wait for certificate to renew
  4. Note the certificate served by HAProxy

Expected behavior ACME Client automation that restarts HAProxy loads the updated certificate.

Additional context

  1. I confirmed that ACME client has the new certificate in the /var/etc/acme-client/cert-home/6029e36f2fa175.26395931/host.domain.tld_ecc/host.domain.tld.cer file.
  2. I also confirmed this same new certificate is one of the files listed in /tmp/haproxy/ssl/60294d9f6fa932.93592251.certlist. (same serial number)
  3. openssl s_client confirms that HAProxy still serves the older certificate.
  4. I suspect that HAProxy needs a full restart instead of a SIGHUP.

Environment OPNsense 24.7.2-amd64 FreeBSD 14.1-RELEASE-p3 OpenSSL 3.0.14

aque commented 2 months ago

I found the following entry in /var/log/acmeclient/acmeclient_20240825.log:

<14>1 2024-08-25T02:38:14-05:00 gw acme.sh 5916 - [meta sequenceId="43"] [Sun Aug 25 02:38:14 CDT 2024] Your cert is in: /var/etc/acme-client/cert-home/6029e36f2fa175.26395931/host.domain.tld_ecc/host.domain.tld.cer

Below is from /var/log/haproxy/haproxy_20240825.log around that time. /tmp/haproxy/ssl/6029e37fc87ca.pem is the problem certificate for that day.

<134>1 2024-08-25T02:38:22-05:00 gw haproxy 71632 - [meta sequenceId="20"] -:- [25/Aug/2024:02:38:22.029] <OCSP-UPDATE> /tmp/haproxy/ssl/60295e5d400a1.pem 1 "Update successful" 0 1
<134>1 2024-08-25T02:38:22-05:00 gw haproxy 71632 - [meta sequenceId="21"] -:- [25/Aug/2024:02:38:22.128] <OCSP-UPDATE> /tmp/haproxy/ssl/6029e37fc87ca.pem 1 "Update successful" 0 1

I also get the same OCSP-UPDATE lines on a manual haproxy restart. At this point, I do not know the difference between "Restart HAProxy (OPNsense plugin)" cron job versus a manual restart. If there is no difference, I wonder if sync is needed to flush pending disk writes before haproxy is restarted.

doktornotor commented 2 months ago

Well, I'd suggest inspecting Settings - Service options, reading the help there and experimenting with those options.

aque commented 2 months ago

I went through the HAProxy settings. I think these are the relevant settings to this issue, and below are their values. I do not see anything that would cache the certificate.

I can also try manually running the automation next time this happens again, but I don't know where to find that. I found /usr/local/opnsense/scripts/OPNsense/HAProxy/syncCerts.py sync --output json in Cron actions_haproxy.conf but I am not sure it is equivalent to the acme automation. I can try that next time though to see if there is a difference.

Service

Global settings > SSL settings - I believe these are the defaults.

Cache

meyergru commented 1 month ago

I see the same behaviour - the automation does not restart HAproxy properly.

It seems that actions_haproxy.conf has both reload and restart options, and in the ACME configuration, only a HAproxy restart is offered. However I am not so sure that the very action from the cron definitions is taken when the automations run, because the cron action has slightly different wording from the ACME automation.

The code doing this is rather complex with interfaces, factories and all the good stuff that makes things clearer (tm)...