sasa1977 / site_encrypt

Integrated certification via Let's encrypt for Elixir-powered sites
MIT License
470 stars 34 forks source link

Add change-detection for list of domains and `expand` functionality in certbot client #34

Closed bucha closed 3 years ago

bucha commented 3 years ago

Hi,

I'm running a tiny service with Phoenix and SiteEncrypt where an admin can add domains via a separate endpoint which should redirect to several marketing deep-URLs. As I didn't want to fiddle around with a webserver in front of the service including all the letsencrypt hassle I installed SiteEncrypt in my service but soon found out, that it wouldn't request new certificates whenever I added a domain and ran SiteEncrypt.force_certify/1.

Therefore I've added SiteEncrypt.certificate_subjects_changed?/1 which checks for a change between the domains of the configuration and the domains (i.e. subject_alt_name) of the certificate.

I've added that check in SiteEncrypt.Certification.start_initial_renewal/1 after the call to Periodic.cert_due_for_renewal?/1 in order to kick off the renewal.

Within the renewal of the certbot client I've added the call to SiteEncrypt.certificate_subjects_changed?/1 again to decide whether we need to run SiteEncrypt.Certification.Certbot.certonly/2 or a newly introduced SiteEncrypt.Certification.Certbot.expand/2. The latter only differing from the former by the argument --expand in the cli command.

Please let me know what you think.

Thank you and best regards, Alex

sasa1977 commented 3 years ago

Thanks for the PR!

Looking at the code, my first question is, is this working as intended? The thing is that config is cached during the endpoint startup, so I'm not sure how a config change would be picked up, unless you called these functions (which are currently not the part of the public API) manually?

I wonder if for better UX we should add another function, e.g. SiteEncrypt.update_config(id, config), which will store the new config and start the force renewal. We'd then need a test that invokes update_config, and verifies that the new certification matches the config changes. It could look similar to the force_certify test, with proper changes.

What do you think?

bucha commented 3 years ago

Hi Saša,

it's funny you ask that question because it's not quite obvious why it should work. In my case however, I had to quickly assemble this service in a couple of hours and therefore just set up a dynamic supervisor which I use to restart the endpoint alognside SiteEncrypt and therefore everything that is attached to it (guess I might loose a couple of requests that way). My endpoint is then gathering the list of domains from the database in certification/0, which is why it's working in my specific setup.

I greatly favor your proposal of SiteEncrypt.update_config(id, config) and tried to implement it but must admit in the end I failed to do so. My problem in this case is that I don't know how to write to the Registry from outside the initiating process.

(I've actually tried to invoke Registry.update_value/3 from within my iex session for quite some time just to realize that's not possible and remembered that I read about that in your book several years back - duh)

My train of thought is: SiteEncrypt.Phoenix.start_link/1 is starting the child processes including the actual endpoint through SiteEncrypt.Phoenix.start_endpoint/1 which in turn has access to the Registry and calls SiteEncrypt.Registry.store_config/2. That means, I'd need to place my call from within the endpoint's process. How am I going to do that from outside though?

Thank you for your feedback.

Best regards, Alex

sasa1977 commented 3 years ago

I had to quickly assemble this service in a couple of hours and therefore just set up a dynamic supervisor which I use to restart the endpoint alognside SiteEncrypt and therefore everything that is attached to it (guess I might loose a couple of requests that way). My endpoint is then gathering the list of domains from the database in certification/0, which is why it's working in my specific setup.

I see. Yeah, that makes sense, thanks for explaining.

My train of thought is: SiteEncrypt.Phoenix.start_link/1 is starting the child processes including the actual endpoint through SiteEncrypt.Phoenix.start_endpoint/1 which in turn has access to the Registry and calls SiteEncrypt.Registry.store_config/2. That means, I'd need to place my call from within the endpoint's process. How am I going to do that from outside though?

start_endpoint is called from the parent of the endpoint, which is a supervisor process. This process is currently powered by the adapter module which is a Parent.Supervisor. However, there is another branch, support-plug, with a single commit (ed1a23a812900e67f78cf192a47d30ed4bb00dfc). Here, the parent of the endpoint is powered by SiteEncrypt.Adapter, which is Parent.GenServer (basically Supervisor + GenServer). This process (which is also the process that stored the config) can therefore handle regular GenServer calls (in fact, it already does that).

In addition, this process keeps the adapter in its state, so it can actually ask the endpoint to resupply the config. As a result, instead of the proposed update_config/2, we'd have SiteEncrypt.refresh_config(id), which would issue the call to the root process (the endpoint's parent, i.e. the adapter). The corresponding handle_call would in turn get the config again, compare it to the previous version, and if there are changes, store the config (which it can do, because it's the correct process) and force recertify. I think it should work, but I can't make 100% guarantees without actually trying it out :-)

Does that make sense, and are you willing to try it? If yes, then I suggest merging the support-plugbranch into your own and trying the proposed approach. If you get stuck anywhere (including the merge conflicts), ping me, and I'll try to sort it out. Alternatively, I can try this myself, but I can't commit on a deadline.

bucha commented 3 years ago

Hi Saša,

thank you for pointing me in the right direction.

First I've merged the branch support-plug into my branch and had a bit of a conflict to solve around SiteEncrypt.Phoenix.http_port as there had been changes between support-plug and master at the time I forked from master. I hope I've got it right, but please take a closer look at it.

The introduction of SiteEncrypt.Adapter gave me the necessary surface I was searching for in the morning. I've added SiteEncrypt.Adapter.refresh_config/1 to it but also added SiteEncrypt.refresh_config/1 as a shortcut because I thought it'd be a bit fishy if one had to reach out to the adapter specifically to get this job done. Feel free to remove the redundant function.

However, I have not added a call to SiteEncrypt.force_certify/1 after refresh_config because I thought it might lead to unexpected outcomes. The warning about Let's encrypt's quotas ultimately led me to this decision.

Finally I had to quickly add some sort of domain-provider to the tests in order to be able to have a place where the list of domains can dynamically be retrieved by the endpoint.

Please let me know what you think.

sasa1977 commented 3 years ago

Thank you! FYI, I made a few changes on top of your code. I wanted to push them to this PR, but it appears I had no permissions, so I merged your code, and then added my changes as a separate merge commit. In case you're interested, take a look at the commit log.

Thanks again for contributing!