Three line fix for DANE TLSA DNS records going out of date every 90 days with LE

johnheenan commented 6 months ago

Free LE certificates are commonly used with Virtualmin. They need to be renewed every 90 days. DANE TLSA DNS records do not need to be updated on renewal provided: 1) The reissued LE certificates are requested with the certbot --reuse-key option, so the key is not changed. 2) The selector field of the TLSA record (second digit) is 1 (for subject public key) and not 0 (for full certificate).

So 3-0-1 and 3-0-2 DANE TLSA records should be avoided. Information at https://community.letsencrypt.org/t/please-avoid-3-0-1-and-3-0-2-dane-tlsa-records-with-le-certificates/7022

Virtualmin writes 3-0-1 records. Below is a fix to change to 3-1-1 records. I have confirmed this works by disabling domain TLSA DNS, reenabling it and testing from site https://dane.sys4.de/ (after a delay).

These code changes have nothing to do with setting up an additional TLSA DNS record for key rotation. If a key is changed then the TLSA DNS records need to be reissued (for example by disabling and enabling as above). There will be a small amount of time when records will be out of sync.

Check --reuse-key is in use from Webmin, Webmin, Webmin Configuration, Cog (for module configuration), Next, Re-use existing LE keys?, Yes, Save

Edit function create_tlsa_dns_record in file webmin/virtual-server/feature-dns.pl.

Change lines:

    "openssl x509 -in ".quotemeta($temp)." -outform DER 2>/dev/null | ".
    "openssl sha256 2>/dev/null");

to:

    "openssl x509 -in ".quotemeta($temp)." -pubkey -outform DER 2>/dev/null | ".
    "openssl rsa -pubin -outform DER 2> /dev/null | openssl sha256 2>/dev/null");

Change line:

     'values' => [ 3, 0, 1, $1 ] };

to

     'values' => [ 3, 1, 1, $1 ] };

johnheenan commented 6 months ago

So, why does this work? The main purpose of TLSA DNS is to gain independence from CAs (Certificate Authorities), such as LE (Let's Encrypt) and others. The subject public key part of an x509 CA issued certificate is derived from the private key being certified, and as long as the private key does not change, the subject public key does not change. The entire x509 certificate is issued by a trusted CA and can be confirmed through its signature, but we don't need that.

So why not just extract the subject public key from the private key?

It is simpler but there are a number of problems with this approach:

openssl rsa -in ssl.key -pubout -outform DER | openssl sha256

1) How do we know if rsa was used? x509 explicitly tells us the algorithm and the number of bits. 2) Virtualmin stores the location of the x509 file in its domain config, not the private key file ssl.key

johnheenan commented 6 months ago

The fix above has an issue. The x509 certificate should be checked for the public key algorithm before assuming rsa output to the pipe.
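One possible way around the algorithm check, as a minimal sketch: `openssl pkey` reads a public key of any algorithm OpenSSL supports, so it can stand in for `openssl rsa` in the pipeline. The function name `tlsa_311_from_cert` is hypothetical, and this is a shell illustration under those assumptions, not the Perl that would go into feature-dns.pl.

```sh
# Print the SHA-256 digest of a certificate's subject public key,
# i.e. the data field of a "3 1 1" TLSA record.
# Hypothetical helper name; $1 is the path to a PEM certificate.
tlsa_311_from_cert() {
    # Fail early if the file is not a readable certificate.
    pub=$(openssl x509 -in "$1" -noout -pubkey 2>/dev/null) || return 1
    # pkey handles RSA and elliptic curve keys alike, so no algorithm test is needed.
    printf '%s\n' "$pub" |
        openssl pkey -pubin -outform DER 2>/dev/null |
        openssl dgst -sha256 -r |
        cut -d' ' -f1
}
```

Called on a domain's ssl.cert file, it should print the 64 hex characters that follow `3 1 1` in the TLSA record.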

jcameron commented 6 months ago

Does this only work for Let's Encrypt? We also need to support the case where the user uploads a cert from another CA ..

johnheenan commented 6 months ago

Does this only work for Let's Encrypt?

I have not tried other CAs.

How far back does the DANE protocol go up the chain trust hierarchy before it refuses to trust an SMTP/TLS certificate issued by a CA and presented by an SMTP mail server? Does DANE cache results if it does go up a trust hierarchy?

It is up to intermediate and root CAs if they choose to publish TLSA DNS records for their certificates for DANE to check, if it chooses.

In principle I suppose DANE can and should work with self signed certificates. If so, then it does not matter which CA is used. With a 3 1 1 TLSA DNS record the CA issued certificate is not being authenticated, only the subject public key, which is included in the certificate and can be independently checked and verified with DNSSEC.

DANE is capable of saying "we don't care if you use self signed certs or CAs for SMTP/TLS if you use 3 1 1 TLSA DNS records"

But does DANE care? If it does care, do CAs care and so issue TLSA DNS records for their certificates?

How about giving Virtualmin users an option to generate either 3 0 1 records (as occurs now) or 3 1 1 records, until more is known?

Another issue, even if DANE works with self signed certs and CAs other than LE, DANE survey sites may fault sites for using self signed certs or CAs that don't meet some additional standard they consider.

johnheenan commented 6 months ago

TL;DR to above: If someone reports that DANE works with self signed certificates then DANE and the fix above should work fine with CAs other than LE.

johnheenan commented 6 months ago

These are my results changing a top level .com domain from using LE to self signed in Virtualmin

1) Four Virtualmin certificate files were updated (ssl.cert, ssl.combined, ssl.everything, ssl.key). File ssl.ca was not updated.
2) The TLSA DNS records in Virtualmin were updated.
3) DNSSEC validation site https://dnsviz.net was happy with a re-analysis.
4) DANE SMTP validation site https://dane.sys4.de/ was happy with re-validation.
5) DNSSEC and DNS DANE TLS adoption survey site at https://stats.dnssec-tools.org/explore/ takes several days to update.

So my conclusion is that any CA will work. As to whether the survey site is happy, I don't know yet.

jcameron commented 6 months ago

So the current code is actually adding both the cert and the CA to the input file to the command openssl x509 -in ".quotemeta($temp)." -outform DER 2>/dev/null , as seen here :

https://github.com/virtualmin/virtualmin-gpl/blob/master/feature-dns.pl#L4709

Would an even simpler fix be to not add the CA cert to that file, and just generate a 3 1 1 record?

johnheenan commented 6 months ago

The file referenced is the ssl.cert file and it only contains one certificate: the single x509 certificate from the CA, which is just a signed envelope of the original subject public key for the domain and other information.

Currently the entire x509 certificate file DER output is hashed to produce the 3 0 1 TLSA DNS record.

DANE does not need this to authenticate. DANE is happy to just work with a hash of the subject public key to form a 3 1 1 TLSA DNS record. Virtualmin can get this from the x509 certificate from the CA in ssl.cert OR from the private key in ssl.key. I showed both methods above.

The second method is simpler in principle and more robust but requires more code gymnastics than a simpler fix as proposed.

Virtualmin can also get the subject public key from an x509 self signed certificate.

johnheenan commented 6 months ago

There is another issue. Both methods require the algorithm to be known. The x509 certificate includes the algorithm used for the subject public key. What if someone uploads certificates not generated by Virtualmin?

To avoid using the x509 CA certificate, I suppose the private key file could be exit code tested for valid rsa key algorithm output and, if this fails, then tested for elliptic curve key algorithm use.
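As a sketch of the private key route without testing each algorithm in turn: `openssl pkey` detects the key type itself, so a single command covers rsa and elliptic curve keys, and its exit code alone tells us whether the file held a usable key. The helper name `tlsa_311_from_key` is hypothetical, and the key is assumed to be unencrypted.

```sh
# Hypothetical helper: print the "3 1 1" digest for a PEM private key ($1).
# Returns non-zero if the file is not a private key openssl can read.
tlsa_311_from_key() {
    der=$(mktemp) || return 1
    if openssl pkey -in "$1" -pubout -outform DER -out "$der" 2>/dev/null; then
        # Hash the DER encoded subject public key and keep only the hex digest.
        openssl dgst -sha256 -r "$der" | cut -d' ' -f1
        rc=0
    else
        rc=1
    fi
    rm -f "$der"
    return "$rc"
}
```

The digest printed for ssl.key should match the one derived from the subject public key inside ssl.cert, since both come from the same key pair.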

johnheenan commented 6 months ago

When DANE wants to validate SMTP domain authenticity it examines DNS to see if there is a TLSA DNS record for the service. If so it examines if the record is a 3 0 [12] type or 3 1 [12] type.

If it is a 3 0 1 type then DANE hashes the DER output from the entire x509 certificate provided by the SMTP server and compares.

If it is a 3 1 1 type then DANE hashes the DER output from the extracted subject public key included in the x509 certificate provided by the SMTP server and compares.
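The two comparisons can be reproduced by hand, which also shows why only the 3 0 1 value moves on renewal. This is a rough sketch: `tlsa_rdata` is a hypothetical helper name, it only covers SHA-256 (matching type 1), and it reads the first certificate in the file.

```sh
# Hypothetical helper: print TLSA rdata "3 <selector> 1 <sha256 hex>".
# $1 = selector (0 = full certificate, 1 = subject public key)
# $2 = path to a PEM certificate
tlsa_rdata() {
    # Refuse files that do not parse as a certificate.
    openssl x509 -in "$2" -noout 2>/dev/null || return 1
    case "$1" in
        0) digest=$(openssl x509 -in "$2" -outform DER |
                    openssl dgst -sha256 -r | cut -d' ' -f1) ;;
        1) digest=$(openssl x509 -in "$2" -noout -pubkey |
                    openssl pkey -pubin -outform DER |
                    openssl dgst -sha256 -r | cut -d' ' -f1) ;;
        *) echo "selector must be 0 or 1" >&2; return 2 ;;
    esac
    printf '3 %s 1 %s\n' "$1" "$digest"
}
```

Run against two certificates issued for the same key, selector 1 should give identical output while selector 0 differs, because the serial number, validity dates and signature are part of the full certificate.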

It is up to DANE if it wants to go further up the hierarchy and validate signer certificates through TLSA DNS. It looks like it does not care to do so for standard DANE. I am waiting to see if the DANE survey site is more strict.

chris001 commented 6 months ago

DANE doesn't need to go up the DNS hierarchy, because the whole DNS hierarchy is already protected by DNSSEC. It's impossible for a bad actor to slip a counterfeit certificate's hash into the DNS records without breaking the DNSSEC signatures, which causes the domain to fail to resolve through DNS. That is a red flag; ideally a monitor would instantly notify the domain owner that a signature failed on a DNS record.

johnheenan commented 6 months ago

I meant the certificate chain hierarchy, not the DNS hierarchy.

Below is an image from the DANE survey site for a domain I control:

Examining the second record, the survey site has gone up the certificate hierarchy, generated the hash for the CA certificate with name R3 (which is Let's Encrypt) and appears to have matched it with a DNS TLSA record (from where?). The CA certificate is valid for five years up to 2025.

For self-signed certificates the hierarchy is from mail subdomain to main domain. I am waiting for a new report for a self signed certificate from the DANE survey site to see what it says.

I suppose if one wants to know if a CA is compatible with DANE then it should be sufficient for the CA to state it is, meaning DANE can verify their certificates through DNSSEC.

As to whether DANE checks chain certificates or not, or can be configured to, I don't know. All I can say is that the DANE survey site appears to be checking.

johnheenan commented 6 months ago

I know the SMTP server will present two packaged certificates from a single file: a CA cert and a domain cert signed by the CA. But why does the DANE survey site choose not to ignore the CA cert?

We can state DANE does not need to go up any hierarchy. True. But if someone is using a CA to sign certificates (instead of self signing) and DANE has the capacity to check CA certificates with DNSSEC, does this not provide an extra layer of security?

johnheenan commented 6 months ago

With regard to specific issues raised for Virtualmin, my view is that until more is known and trialled, Virtualmin should offer a choice between 3 0 1 type TLSA DNS records (as now) and 3 1 1 records, with a default to 3 1 1 records since it will solve known problems for most use cases (using LE).

Adding the following to illustrate that DANE is not just 'bye bye CAs' for SMTP.

Verisign is a CA. Following is from the end of a 2015 Verisign Blog https://blog.verisign.com/security/how-dane-strengthens-security-for-tls-smime-and-other-applications/

In short, DANE provides the ability to use DNSSEC to perform the critically important function of secure key learning and verification. It can use the DNS directly to distribute and authenticate certificates and keys for endpoints. It can also work in conjunction with today’s public CA system by applying additional constraints about which CAs are authorized to issue certificates for specific services or users – thereby significantly reducing risks in the currently deployed CA system.

Trust in CAs is not high, for good reason. It looks like Verisign sees DANE as being able to pick and choose which CAs to use for which services, if any.

In addition, who would trust any specific organisation to self manage their own certification with self signing?

In addition with DNSSEC, which DANE relies upon, ultimate trust lies with a single registrar DS record. This is arguably worse than trusting a CA and is a massive point of failure that can cripple functionality.

johnheenan commented 5 months ago

Getting some of the mystery cleared up (I hope). Unfortunately, there is no clear source of information and I am basing this on interpretation of a cryptic post at https://community.letsencrypt.org/t/dane-and-upcoming-le-issuer-certs/134172, particularly this statement (but in context):

TL;DR If you're using DANE-TA(2) (certificate usage 2) TLSA records with Let's Encrypt cert chains, you need to augment the TLSA RRSet with additional digests for the upcoming "R3" and/or "E1" issuer CAs.

My impression is 3 1 1 TLSA DNS records are suitable for self signed certs and CA signed certs but do not take any included CA cert into consideration for anything.

To require a CA cert to be taken into consideration, the TLSA DNS record must start with 2 instead of 3. In this case the CA cert must be included with the CA signed cert AND the CA cert TLSA DNS hash digest records must be published in the same zone file as the domain TLSA DNS records.

An RRSet is the set of DNS records that share the same name, class and type; DNSSEC signs each RRSet as a unit with an RRSIG record.

chris001 commented 5 months ago

These might provide some answers:

SMTP Security by Opportunistic DANE TLS. The DANE protocol: Updates and Operator Guidance. Using DANE TLSA records with SRV records.

johnheenan commented 5 months ago

These might provide some answers:

SMTP Security by Opportunistic DANE TLS. The DANE protocol: Updates and Operator Guidance. Using DANE TLSA records with SRV records.

Thanks @chris001.

These are interesting from RFC7671

(3) or (2) below refers to value of first digit in DNS TLSA record.

Start of section 5.2

This section updates [RFC6698] by specifying a new operational requirement for servers publishing TLSA records with a usage of DANE-TA(2): such servers MUST include the TA certificate in their TLS server certificate message unless all such TLSA records are "2 0 0" records that publish the server certificate in full.

Ah ha! Looks like I got it right in post above!

TA means Trust Anchor. Just read as CA.

End of section 5.1

While DANE-EE(3) TLSA records are expected to be by far the most prevalent, as explained in Section 5.2, DANE-TA(2) records are a valid alternative for sites with many DANE services. Note, however, that virtual hosting is more complex with DANE-TA(2). Also, with DANE-TA(2), server operators MUST ensure that the server is configured with a sufficiently complete certificate chain and need to remember to replace certificates prior to their expiration dates.

I am pleased that Virtualmin is capable of easily implementing DANE-TA(2)

DANE-TA(2) appears to mean DNSSEC-backed hash digest verification of two certificates: (CA cert) + (CA signed domain cert), with both certificates made available by a service (such as SMTP).

Start of section 5.1

Authentication via certificate usage DANE-EE(3) TLSA records involves simply checking that the server's leaf certificate matches the TLSA record. In particular, the binding of the server public key to its name is based entirely on the TLSA record association. The server MUST be considered authenticated even if none of the names in the certificate match the client's reference identity for the server. This simplifies the operation of servers that host multiple Customer Domains, as a single certificate can be associated with multiple domains without having to match each of the corresponding reference identifiers.

So no mention of a CA (TA) here.

The following is not relevant, unless you find the terminology exasperating (99% of us?)

My experience of RFCs is that they are incomprehensible rushed post conference afterthoughts that only make sense to those who have been working together on a code base and agree a code implementation should drive a standard. That is, an RFC standard does not drive original code and is not a request for comment (RFC). They are comments on extracted code comments by those who believe their code implementation is up to a standard. Hence they have all the pain of discovered corner cases and are all about viewing trees, when what we want is to view the forest. Hence, a big issue with RFCs is unclear and assumed terminology that lacks definition, usage of language that never became mainstream and later attempts with new RFCs to reconcile with an invented industry language that everyone imagined was in an RFC but wasn't. Although with DANE, there is no big industry to have evolved its own language.

So, if you want to implement an RFC in your favourite language, port the original implementation.

johnheenan commented 5 months ago

I am waiting for a new report for a self signed certificate from the DANE survey site to see what it says.

The DANE survey site results for a site with a self signed cert are in. The DANE survey site is happy. Survey site is https://stats.dnssec-tools.org/explore/

The only difference is that the SMTP/TLS certificates tab only showed one certificate hash digest instead of two, as noted above for a LE signed domain certificate.

Out of curiosity, I examined the Virtualmin postfix /etc/postfix/sni_map file. All line entries include entries for ssl.key and ssl.cert files. Most also include an entry for a ssl.ca file. Some lines for domains that used LE were missing an entry for ssl.ca. I don't know if this is important, just noting it.

The entry for the tested self signed domain did not list a ssl.ca file.

Maybe it is best to assume that entries in the SMTP/TLS certificates tab of the DANE survey site do nothing more than form digest hash versions of any certificates offered to them by the SMTP server, independent of whether an offered CA file is used by DANE: yes for TA(2), no for EE(3).

johnheenan commented 5 months ago

Some recent comments (Sept 2022) from https://community.letsencrypt.org/t/understanding-smtp-dane-implementation-options/184274

Let's ping @ietf-dane here, who is an expert in this field.

Quotes from response at https://community.letsencrypt.org/t/understanding-smtp-dane-implementation-options/184274/4: ...

I recommend against attempting to use the root CA public key hash as a stable fire and forget TLSA record. Even the root CA used by Let's Encrypt will eventually change, and you yourself might stop using Let's Encrypt, ...

...

So to the question of what is best practice. It is indeed "3 1 1 + 3 1 1", where you configure Let's Encrypt to not change your key during regular certificate renewals (set reuse_key = true in the renewal .conf file), and arrange to inject a new key for certbot to use, only after that key's hash has been been published in DNS for at least a few TTLs in advance.

That way, if you forget to update the TLSA records, the certificate renewal process just keeps using the same key that already works, you keep obtaining new certs (for all those non-DANE clients to verify), and everything just works...

The main point I take from this is that moving from a 3 0 1 DNS hash digest record to a 3 1 1 record, as proposed with the three line fix, is best practice.

For a 3 1 1 record, DANE ignores any signing certificate (self signed, CA, up to date or not). With a 3 1 1 record DANE only pays attention to the original key (the subject public key part), which does not need to change when a certificate is re-signed.

DANE is used for authenticating SMTP for non relay clients. An email client can choose to ignore DANE and pass a message back that a signing cert is out of date or is not trusted. The three line fix does not alter any existing practices in this regard.

Lawkss commented 2 months ago

Verdict: At least one of your mail server domains does not have an active DANE scheme for a reliable rollover of certificate keys.

Technical details: Mail server (MX) DANE rollover scheme mail.vom-bruch.com. no Test explanation: We check if there is an active scheme with at least two DANE TLSA records to reliably handle certificate rollovers on your receiving mail servers (MX).

Such a scheme will be proven useful when there is a need to update your mail server certificate(s). It can prevent that DANE becomes invalid during the transition period which could endanger mail deliverability at your domain. A rollover scheme could but does not need to be 'active' all the time.

We recommend you to apply one of the following two schemes with double DANE TLSA records:

- Current + Next ("3 1 1" + "3 1 1"): Publish two "DANE-EE(3) SPKI(1) SHA2-256(1)" records, one for the current and one for the next TLS certificate of your mail server.
- Current + Issuer CA ("3 1 1" + "2 1 1"): Publish a "DANE-EE(3) SPKI(1) SHA2-256(1)" record for the current TLS certificate of your mail server, and also a "DANE-TA(2) SPKI(1) SHA2-256(1)" record for the current root or intermediate certificate of the (not necessarily public) certificate authority.

The test also accepts other combinations of DANE TLSA records i.e. "3 x x" + "3 x x" or "3 x x" + "2 x x". However other types than the above will generally be more error prone and less interoperable.

Any chance we will get the rollover scheme implemented? <3

I get this alert when the records are updated but DNS is napping:

ant-dnssec-operators@ant.isi.edu. To adjust the list of email contacts for your domains, just reply to this message with the desired corrections.

About the survey: https://stats.dnssec-tools.org/about.html ]

The TLSA RRsets of some of your email servers do not match their actual certificate chains. Issue details for the affected domains:
vom-bruch.com
can be seen at:
https://stats.dnssec-tools.org/explore/?vom-bruch.com

[ Perhaps consider: <https://github.com/tlsaware/danebot>? ]
The issues can be resolved by removing or updating the associated DNS DANE TLSA records.
- "3 0 [12]" vs. Let's Encrypt:
  https://community.letsencrypt.org/t/please-avoid-3-0-1-and-3-0-2-dane-tlsa-records-with-le-certificates/7022/17

- Best practice "3 1 1" rollover methodology:
  https://mail.sys4.de/pipermail/dane-users/2018-February/000440.html

- Monitoring code snippet:
  https://list.sys4.de/hyperkitty/list/dane-users@list.sys4.de/thread/NKDBQABSTAAWLTHSZKC7P3HALF7VE5QY/

johnheenan commented 2 months ago

Rollover, as in post above, is more icing on the cake to make a potential problem of a few minutes go away and is a distraction from the real issue.

Since I put my three line code fix in place at the top (a move from 3 0 1 to 3 1 1), I have so far had no further problems. No rollover certificate, just a straight one for one replacement, better than no replacement, which is the real problem.

johnheenan commented 2 months ago

As long as the private key does not change and --reuse-key is used (see top) then a simple rollover might be just to leave the previous DNS TLSA record in place with the new one alongside. The start and end dates will overlap. The previous to the current previous DNS TLSA record can be deleted, if it exists.

I don't see why this cannot be included as part of whatever rollover scheme is already in place with Virtualmin, assuming there is one instead of straight one for one replacement.
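For comparison, the "current + next" scheme quoted earlier in the thread publishes a record for a standby key before that key is ever used. A minimal sketch of generating the pair of records follows; the helper names `spki_sha256` and `tlsa_rollover_rrset` are hypothetical, the keys are assumed to be unencrypted PEM files, and nothing here reflects how Virtualmin itself writes records.

```sh
# Hypothetical helper: SHA-256 digest of the public half of a PEM private key.
spki_sha256() {
    openssl pkey -in "$1" -pubout -outform DER 2>/dev/null |
        openssl dgst -sha256 -r | cut -d' ' -f1
}

# Print a "current + next" TLSA RRset in zone file syntax.
# $1 = owner name, e.g. _25._tcp.mail.example.com.
# $2 = private key in use now, $3 = standby key for the next rollover
tlsa_rollover_rrset() {
    for key in "$2" "$3"; do
        # Stop before printing a record if a key file cannot be read.
        openssl pkey -in "$key" -noout 2>/dev/null || return 1
        printf '%s IN TLSA 3 1 1 %s\n' "$1" "$(spki_sha256 "$key")"
    done
}
```

The standby key would only be handed to certbot after its record has been in DNS for a few TTLs, as the quoted advice describes.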

johnheenan commented 2 months ago

Is rollover needed? I don't think it is for DANE use if a '3 1 1' scheme is in use and the private key does not change.

The reason for this is that for any reissued TLS certificate, the external certification is not used for the DNS TLSA record with a '3 1 1' scheme. Only the private key certification of the 'subject name' (the DNS name in practice) is used.

Since the previous private key is used with --reuse-key and a '3 1 1' scheme, no rollover is required. It does not matter if any DNS query finds a new or previous DNS TLSA record, as long as the dates overlap.

Of course if the private key does change or if the DANE scheme consults the full TLS certificate with a '3 0 1' scheme then rollover is best practice. One of the main points of DNS TLSA is to get away from excessive reliance on externally certified TLS certificates. That is, DNS allows uses of self certification.

virtualmin / virtualmin-gpl

Three line fix for DANE TLSA DNS records going out of date every 90 days with LE #803