nginx / njs-acme

Nginx NJS module runtime to work with ACME providers like Let's Encrypt for automated no-reload TLS certificate issue/renewal.
Apache License 2.0
57 stars 9 forks source link

Cached entries are not invalidated on renewal #50

Closed NetForce1 closed 5 months ago

NetForce1 commented 5 months ago

Describe the bug

When a certificate is renewed, the cached entries are not invalidated, so the old certificate is still used. And, in the case where a domain name is added to the server block, that old certificate is also used for the new domain name.

To reproduce

Steps to reproduce the behavior:

  1. Have a certificate issued and cached
  2. Add a second domain name to the server block (and $njs_acme_server_names)
  3. Reload (not restart) nginx
  4. Invoke the auto renew endpoint
  5. Visit the newly added domain
  6. The certificate used does not contain the new domain name

Expected behavior

The newly issued certificate should be used immediately for both domains.

Your environment

zsteinkamp commented 5 months ago

Thanks @NetForce1 for the report. I'm looking into this now. Could you post your nginx.conf so I could have a look? If you are using env vars instead of nginx vars to configure the package, could you include those as well?

NetForce1 commented 5 months ago

I justed tested with the container setup in this repository.

This is my reproduction:

  1. Add proxy1.nginx.com as an extra alias to the nginx service in docker-compose.yml
  2. Add an extra location to nginx.conf to force the auto mode (using js_content acme.clientAutoModeHTTP;
  3. Started the dev container
  4. Call the auto-endpoint, or wait a minute for the js_periodic to kick in
  5. Retrieve https://proxy.nginx.com:8443
  6. Add proxy1.nginx.com to $njs_acme_server_names in nginx.conf
  7. Run bash in the nginx container (docker compose exec nginx /bin/bash)
  8. Reload nginx (nginx -s reload)
  9. Call the auto-endpoint, or wait a minute for the js_periodic to kick in
  10. Observe that a new certificate is created (and written to /etc/nginx/njs-acme/proxy.nginx.com.crt) that includes proxy1.nginx.com
  11. Retrieve https://proxy1.nginx.com:8443 using openssl s_client
  12. Decode the returned certificate and observe that it does not include proxy1.nginx.com

I also don't see any invalidation in the code. I would expect it to happen in clientAutoModeInternal in index.ts after the certificate is succesfully written to disk (just before the final return maybe). The only reference to ngx.shared is in readCertOrKey where the items are read from and written to the cache.

edit: Or, instead of adding proxy1.nginx.com you could remove proxy2.nginx.com from $njs_acme_server_names, and re-add it later.

zsteinkamp commented 5 months ago

Ahh ok. Thanks for the additional clarification. I was recreating the container (docker compose up -d --force-recreate nginx) after changing the config. Will try to repro here soon, and I agree with your intuition around the lack of updating the js_shared_dict_zone. Will report back. Thanks!

NetForce1 commented 5 months ago

Adding this at the end of clientAutoModeInternal fixes it:

  const zone = acmeZoneName(r)
  const cache = zone && ngx.shared && ngx.shared[zone]
  if (cache) {
    cache.delete('acme:' + pkeyPath)
    cache.delete('acme:' + certPath)
  }

(and import acmeZoneName of course.

But, I'm not sure this is the cleanest approach, and I also don't know much about how concurrency works in njs. Maybe this can cause a situation where an old certificate is used with a new private key or vice versa.

NetForce1 commented 5 months ago

We could just clear the cache, but that needs njs 0.8.3: https://github.com/nginx/njs/issues/690.

edit: Hm, it looks like clear has another issue. In my tests it hangs when called on an empty dict.

zsteinkamp commented 5 months ago

Thanks @NetForce1 - This should be fixed with the merge above. I have another PR that is adjacent to this, so you may want to wait for that to be merged before taking the time to update your system. We will do a version bump on njs-acme after that's merged.

NetForce1 commented 5 months ago

Awesome, thanks!

zsteinkamp commented 5 months ago

@NetForce1 could you email me at z.steinkamp@f5.com? We have some thanks we'd like to send to you.