khonsulabs / bonsaidb

A developer-friendly document database that grows with you, written in Rust
https://bonsaidb.io/
Apache License 2.0
1.01k stars 37 forks source link

ACME renewal logic is wrong #259

Open ecton opened 2 years ago

ecton commented 2 years ago

I received an alert from LetsEncrypt one of the servers that's running ACME issued through BonsiaDb hasn't renewed its certificate. Last night, I ssh'ed in and updated the server and rebooted it, figuring that would trigger the renewal logic for sure.

This morning, I noticed it still hadn't updated. I tracked it down to the use of a helper function from the acme library. I assumed it did something a little different than what it does -- it returns half the duration remaining on the certificate.

The way the loop is currently written, the err_cnt parameter is never incremented, which means this function will always return half the remaining time on the certificate -- not the duration until the halfway point of the certificate's lifetime. That's where my misunderstanding came from.

We should move the renewal logic from being completely stateless to tracking the next renewal timestamp on the TlsCertificate entity. We should keep track of the renewal attempts as well, to help with debugging certificate issuance problems.

ecton commented 2 years ago

On second thought, the scheduling of this could be a good job service (#78) function. Regardless, the logic should be fixed in the loop to make sure we actually try renewing starting 2 weeks out.