Add a Letsencrypt resource

Proposal

Have a resource that is able to manage Letsencrypt certificates via the ACME v2 API.

A brief overview of Letsencrypt follows.

Protocol

Letsencrypt designed ACME, a protocol that recently became an internet standard, to obtain Root CA-signed TLS certificates based on domain validation: you prove you own a domain by completing a challenge, and you are given a TLS certificate to use for free on your websites.

Challenges

The challenges currently supported are:

.well-known or http01: a token file should be served by a webserver behind the domain
dns01: a TXT record with a token should be resolvable in the domain's DNS
tls-alpn01: similar to http01, but the proof is served over TLS using a custom ALPN protocol; supersedes the deprecated tls-sni-01 challenge

The setup of challenges can be changed or mixed for every domain, but it's usually kept as-is once the automation is in place.

Certificate renewal

The certificates expire in 90 days, but they can be renewed at any time within the API rate limits. The new resource could run regularly, with thresholds tuned to the number of certificates it handles, to avoid hitting these limits.

Implementation

The resource should be configured at minimum with:

a list of certificates to obtain/manage
the type of challenge

From there, the create/update/delete lifecycle should be designed to let the user:

create new certificates from scratch, getting the key and cert in PEM encoded format (this is the most widely used in webservers)
ensure a certificate is valid and renew it
add certificates not previously managed
remove (thus revoke) certificates

@inge4pres Thank you for opening the design discussion!

A few questions: 1) Does one also need to have a letsencrypt account to use the API, or is being able to satisfy the challenge sufficient?

2) Seeing as letsencrypt will sign some object (the public cert), do you agree that building such a cert should be outside of the scope of this initial resource? My guess would be yes, what are your thoughts?

3) Could you propose an initial struct API for the resource? Eg:

type LetsEncryptRes struct {
	...
}

A few thoughts:

1) Assuming we take the path to the public cert to sign, then we'd also want to use the recwatch package (look at the file resource) to Watch that file, and re-issue if it ever changes.

2) I guess renewal should happen automatically right before expiry... Perhaps even letting the user specify that threshold. So that's just a giant time delta calculation and a sleep in Watch.

3) Can you expand more about revocation and how that would look?

Thanks, looking forward to this!

Thanks @purpleidea for guiding me throughout the process! I had a closer look at the code. To answer your questions:

you create an account paired with an email, which is usually the email the domain is registered with, but that's not mandatory. Creating the account is as easy as calling the API and passing the email in; if no account was previously created, you are given a JSON file with your "credentials", so to speak.
the resource itself should represent a certificate paired with the "credentials" of step one, actually the 1:1 pairing is with the private key that is associated with the account but can be granted from scratch for the same domain with a new account. You never create resources prior to calling the API, and using one of the ACME clients already present these operations are 100% transparent so probably the certificate file should be part of the resource (as an artifact? consider I am still reading the traits part, maybe related)

There are a lot of details in implementing Res that I still need to dig into, but I'd say something like

type ACMERes struct {
	traits.Base

	Mail                   string
	Challenge              string        // instead of string maybe a custom type to be injected and private
	Domain                 string        // main domain
	SubjAltNames           []string      // SANs attached to the certificate for Domain
	defaultExpirationCheck time.Duration // to cycle the cert even if not expired
	accountPath            string        // where to store the account file, private key, etc...
}
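
To give an idea of how the public fields might be checked before use, here is a standalone Go sketch. It deliberately does not use mgmt's real resource interfaces or traits: the type acmeParams, its Validate method and the accepted challenge names (taken from the list in the proposal) are all assumptions for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"net/mail"
)

// acmeParams is a hypothetical, standalone copy of the public fields
// proposed for the resource.
type acmeParams struct {
	Mail         string
	Challenge    string
	Domain       string
	SubjAltNames []string
}

// validChallenges uses the challenge names from the proposal text.
var validChallenges = map[string]bool{
	"http01":     true,
	"dns01":      true,
	"tls-alpn01": true,
}

// Validate checks that the parameters are usable before any API call.
func (p *acmeParams) Validate() error {
	if p.Domain == "" {
		return errors.New("domain must not be empty")
	}
	if !validChallenges[p.Challenge] {
		return fmt.Errorf("unknown challenge type: %q", p.Challenge)
	}
	if _, err := mail.ParseAddress(p.Mail); err != nil {
		return fmt.Errorf("invalid mail address: %w", err)
	}
	for _, san := range p.SubjAltNames {
		if san == "" {
			return errors.New("subject alternative names must not be empty")
		}
	}
	return nil
}

func main() {
	p := &acmeParams{
		Mail:         "admin@example.com",
		Challenge:    "http01",
		Domain:       "example.com",
		SubjAltNames: []string{"www.example.com"},
	}
	fmt.Println(p.Validate())

	p.Challenge = "carrier-pigeon"
	fmt.Println(p.Validate())
}
```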

The ACMERes name is to emphasise the standard protocol and decouple from the LetsEncrypt Org which is the only provider of the free service for now, but that might change in the future.

This could be enough to provide a bare implementation for the http01 challenge, the most commonly used. For dns01, additional auth to the DNS provider should be declared; I guess there's already a way to inject credentials in the resource?

On the thoughts:

we don't sign a pre-existing file, the cert is re-generated at every renewal; the way the official certbot client does it is with symlinks, we should find a similar implementation?
is there a way to cron the check instead of Watching continuously?
to be honest I never revoked one in 2 years using the service! But I can set up some tests to see how that works... I imagine you issue an API call and the certificate is distrusted by the root and fails OCSP stapling? Will detail better as soon as I have more details.

Let me know if any of the above makes sense. Thanks

you create an account paired with an email

It's perfectly fine if the initial account creation process is manual and done by the user, however in an ideal world, the "thing" that goes into the local resource should be some sort of public value, so that if it's stored in git, it's not a security issue.

the resource itself should represent a certificate paired with the "credentials" of step one, actually the 1:1 pairing is with the private key that is associated with the account but can be granted from scratch for the same domain with a new account. You never create resources prior to calling the API, and using one of the ACME clients already present these operations are 100% transparent so probably the certificate file should be part of the resource (as an artifact? consider I am still reading the traits part, maybe related)

I didn't quite understand this section, although I don't think it's related to traits. I'll probably figure it out in code review though :)

The ACMERes name is to emphasise the standard protocol and decouple from the LetsEncrypt Org

Good point, I didn't understand this before. If there's a more intuitive name for the resource, that would be excellent though. Even AcmeCertRes or something to help it be slightly more descriptive without being too long.

we don't sign a pre-existing file, the cert is re-generated at every renewal

We generate our own cert locally though, right? And then the public part of that gets signed, correct? If so I would expect that certificate generation itself could be a separate resource, and that we could assume initially for this resource that someone else did that, and put the files in /some/path/blah.whatever

to be honest I never revoked one in 2 years using the service

Let's look into that as "part two". Probably not needed initially, but it could be exposed via a state => "revoked" type field.

Looking forward to having this. My biggest concern is getting the design right if possible so that we never have to have any private data passed in. IOW, I'd like to avoid resources that would end up looking like:

SomeRes "foo1" {
  account => "name@example.com",
  password => "hunter2", # I want to avoid this kind of thing or similar
}

Thanks!

[EDIT] Added some clarifications + typos fixed.

Ex-PKI operator of a well-known (and WebTrust** audited) environment.

A few design comments around certificates, design & life-cycle - within the context of certificates used for TLS (SSL).

remove (thus revoke) certificates

The common life-cycle of a (non root or intermediate) certificate ends due to one of the following reasons: 1) Certificate is no longer in use, but private key is still secure. 2) Security of private key got compromised, hence certificate needs to be replaced.

WebTrust mandates revocation for scenario 2); for scenario 1), deleting the private key is sufficient. The major argument from PKI operators against revocation in scenario 1) is that CRL (Certificate Revocation List) size is critical for TLS handshake duration. Adding certificates to the revocation list increases its length. In the worst case (no OCSP usage, enforced revocation checks), CRLs containing non-compromised keys create additional overhead due to the increased file size every time a TLS client connects to a TLS server using a certificate issued by this particular CA. See [1] for Let's Encrypt's recommendation on the topic of revocation.

Certificates can only be removed from a CRL after they have expired (think CRL garbage collection). Revocation is an intentional one-way operation. My recommendation to prevent accidental inflation of CRL size would be to not implement revocation at all. If removal via MCL only deletes the associated private key, no harm is done; an additional informational message about revocation might be useful.

The resource should be configured at minimum with:

a list of certificates to obtain/manage

the type of challenge

I would add one more recommendation to ensure a good workflow:

As time synchronization at scale (read: all devices that could possibly connect to a TLS server) is an unsolved issue (see [2] for the impact on TLS), it is common to renew certificates ahead of time but let them rest for a bit before placing them in production. A waiting time of multiple days can make sense in large uncontrolled client environments. Providing a parameter to configure the "wait period" could be helpful, as I suspect operators would want to use mgmt for automating this step.

It's perfectly fine if the initial account creation process is manual and done by the user

Given the account typically gets "created" by providing an email address for the first time and ACKing that one accepts the Terms & Conditions, I'm not sure how manual account creation can be easily integrated. Also, there is no password involved in the "account creation" process.

The email provided is also used for expiration notifications from Let's Encrypt. I can imagine reasons*** to send those notifications to separate addresses for different certificates (I'm thinking corporate entity w/ teams owning applications). IMO it makes sense to give the requestor the possibility to configure the notification address per cert. A default (defined when instantiating the ACMERes) could reduce overall MCL verbosity.

We generate our own cert locally though, right? And then the public part of that gets signed, correct?

Yes and No. When requesting a TLS certificate, you generate a Certificate Signing Request (CSR) first, by 1) creating a key pair (or re-using an existing one), 2) bundling the public key together with the identity (Subject, Subject Alternative Names [SANs], Org, ..) and the specific certificate extensions you request into an ASN.1 structure, and 3) signing the ASN.1 structure with the private key to prevent modification and to prove knowledge of the matching private key.

The CSR is then given to a Certificate Authority (CA), which does a bunch of checks on it, especially on the identity. The CA decides whether to sign and, if it chooses to sign, applies a certificate profile that determines the final structure of the cert. The certificate profile can include different validity periods, add certificate extensions and/or dismiss a requested extension.

If so I would expect that certificate generation itself could be a separate resource

This sounds likely to complicate matters. From what I understand, most libraries do the CSR signing transparently. Having a separate resource for managing pub/private keys might be beneficial though.

ACME/Let's Encrypt supports renewing a certificate with the same key pair. Another reason for a pub/priv key resource I can think of is that it is not certificates that get compromised, but the corresponding private key. Preventing the re-use of a known/compromised private key is best-practice behaviour. The Debian randomness bug [4] resulted in the creation of a tool to explicitly check for known vulnerable private keys.

** The audit standard that makes the difference between a "private CA" and one trusted by browsers.
*** Actually seen them in practice, while building a certificate expiration notification system ;-)

[1] https://community.letsencrypt.org/t/best-practices-for-when-to-get-a-new-certificate/36135/4
[2] https://www.imperialviolet.org/2016/09/19/roughtime.html
[3] https://www.imperialviolet.org/2014/04/19/revchecking.html
[4] https://github.com/g0tmi1k/debian-ssh
[5] https://wiki.debian.org/SSLkeys#Testing_keys_using_ssh-vulnkey

Great synopsis, thanks! Although some of it was a bit over my head. If this is a resource(s) you want to work on, let me know, I can happily try and answer any of the questions on the mgmt side of things. Cheers!

@purpleidea I'm happy to answer more cert related questions.

I'm definitely interested in building this resource. First step is to write a proper design that outlines the behaviour & integration?

@dantefromhell Sounds good! My recommendation would be to define the resource API (basically the public struct fields) and also mention whether you want to use autogrouping, automatic edges, send/recv or any other resource feature.

Lastly the part that probably needs most clarification is the creds stuff. IOW see my earlier comments and in particular the one about:

password => "hunter2", # I want to avoid this kind of thing or similar

LMK if this all makes sense and post your design here when ready. If you want help figuring this all out, lmk and we can do a video chat too.

purpleidea / mgmt