cabforum / definitions

Repository for documentation produced by the Definitions and Glossary Working Group

Authorization Domain Names 😬 #5

Open clintwilson opened 1 month ago

clintwilson commented 1 month ago

During a discussion in the Validation Subcommittee on April 18th, a small side-discussion occurred related to Authorization Domain Names (ADNs) and what is fully implied within an Authorization Domain Name. I agreed to write up and expand on that discussion, which follows below.

Authorization Domain Names are an incredibly useful concept, but they also contain somewhat hidden complexity. I believe how this term is currently interpreted and used by Certificate Consumers and Certificate Issuers is worth careful consideration. I would hypothesize that such consideration would likely surface differences in interpretation, along with components of the TBRs that could be updated to better convey the resulting consensus on what this concept means. That said, there is also the potential for something along the lines of "malicious misinterpretation" of this definition, which should also be accounted for and avoided.

The full definition is:

Authorization Domain Name: The FQDN used to obtain authorization for a given FQDN to be included in a Certificate. The CA may use the FQDN returned from a DNS CNAME lookup as the FQDN for the purposes of domain validation. If a Wildcard Domain Name is to be included in a Certificate, then the CA MUST remove "*." from the left-most portion of the Wildcard Domain Name to yield the corresponding FQDN. The CA may prune zero or more Domain Labels of the FQDN from left to right until encountering a Base Domain Name and may use any one of the values that were yielded by pruning (including the Base Domain Name itself) for the purpose of domain validation.

Breaking this apart a bit, this means that an ADN is:

  1. The FQDN used to obtain authorization for a given FQDN to be included in a Certificate.

    • We start with the concept of an FQDN to be included in a Certificate.
    • This component of the definition, which seems to me to be the "core" of the definition, does not account for the fact that there may not be an FQDN included in the Certificate if only Wildcard Domain Names are present. Due to the use of FQDN to refer to multiple, different values within the same concept of the Authorization Domain Name, the later clarifications around the use of Wildcard Domain Names in determining an ADN don't provide absolute clarity to this initial, core sentence.
    • I hope that it's safe to assume this is sufficiently clear in actual practice, even if there exists a potential for it to be understood differently.
    • Both FQDN and the defined term associated with this acronym, Fully-Qualified Domain Name, are used throughout the TBRs and are exactly equivalent.
    • The definition for Fully-Qualified Domain Name is

      Fully-Qualified Domain Name: A Domain Name that includes the Domain Labels of all superior nodes in the Internet Domain Name System.

    • And the defined terms that this definition relies on:
      • Domain Name:

        Domain Name: An ordered list of one or more Domain Labels assigned to a node in the Domain Name System.

      • Domain Label:

        Domain Label: From RFC 8499 (http://tools.ietf.org/html/rfc8499): "An ordered list of zero or more octets that makes up a portion of a domain name. Using graph theory, a label identifies one node in a portion of the graph of all possible domain names."

      • Domain Name System is not a defined term
    • Next, there is a need to obtain authorization for this FQDN.
    • This is referencing the need to perform at least one of the processes outlined in Section 3.2.2.4 "Validation of Domain Authorization or Control"
    • One interpretation of this might conclude that there exists some difference(s) between the various methods described in Section 3.2.2.4, where some validate Domain Authorization and others validate Domain Control, which considered in the context of the ADN definition could further lead to a conclusion that only certain methods in Section 3.2.2.4 will have an Authorization Domain Name.
      • I don't believe these conclusions to be well-supported, but I will note that some methods in Section 3.2.2.4 do seem to very intentionally exclude the concept of ADN, so it seems at least possible to me that somewhere along the line there was some intrinsic property of Section 3.2.2.4 methods related to whether each method provided for either validation of Domain Authorization OR Domain Control.
    • Finally, we map back to another FQDN which is directly equivalent to the concept of the ADN; that is, whatever FQDN ends up being used to complete one of Section 3.2.2.4's validation methods becomes the ADN.
    • Notably, in my mind, the ordering of this seems to indicate that there isn't necessarily an expectation that the ADN be selected by the CA or Applicant prior to completing the Validation of Domain Authorization or Control
  2. The CA may use the FQDN returned from a DNS CNAME lookup as the FQDN for the purposes of domain validation.

    • This sentence introduces an expansion to the methodology by which an ADN can be determined, specifically by performing a DNS CNAME Lookup.
    • We learn a couple new things about what an ADN may constitute:
      • A CNAME record can be retrieved as part of determining the ADN; and
      • The FQDN present in the RDATA section of the retrieved resource record can be used as the FQDN which is validated in accordance with Section 3.2.2.4
    • Presumably the second use of FQDN in this sentence is referring to the retrieved domain-name in the CNAME record as the FQDN which becomes the ADN, rather than the FQDN that's to be included in the Certificate.
    • This means that:
      • the FQDN used to complete one of Section 3.2.2.4's validation methods can appear to be entirely different from the FQDN that is included in the Certificate; and
      • once a CNAME record has been retrieved for a given FQDN, the domain-name in the RDATA must be used directly as the FQDN which is validated in accordance with Section 3.2.2.4
    • It could also, possibly, be argued that this phrasing means that CNAME chaining is disallowed for the purposes of identifying an Authorization Domain Name.
      • The RDATA of the CNAME query must, by definition, return an FQDN, but nothing here seems to imply that it's acceptable to then perform a second DNS query for that returned FQDN, following the theoretical alias set in that CNAME record to a secondary returned FQDN.
    • This definition doesn't specify what must be used as input to the "DNS CNAME lookup", how this value is selected, or what (and whether) specific criteria exist for selecting the value.
    • I hope it's safe to assume that there is absolute consensus that the NAME value used as input to the DNS query made when retrieving a CNAME record in the context of selecting an Authorization Domain Name must be:
      • the FQDN to be included in the Certificate;
      • the FQDN of a Wildcard Domain Name (as determined based on the pruning logic described in this definition) to be included in the Certificate; or
      • a parent domain (as determined based on the pruning logic described in this definition) of the FQDN to be included in the Certificate
  3. If a Wildcard Domain Name is to be included in a Certificate, then the CA MUST remove "*." from the left-most portion of the Wildcard Domain Name to yield the corresponding FQDN.

    • This sentence provides additional specification for when a Wildcard Domain Name is included in the Certificate instead of the FQDN indicated by the first sentence of the ADN definition.
    • The definition of Wildcard Domain Name is:

      A string starting with "*." (U+002A ASTERISK, U+002E FULL STOP) immediately followed by a Fully-Qualified Domain Name.

    • While the definition of ADN doesn't specify the Unicode characters that are to be removed, in combination with the definition of Wildcard Domain Name it is clear that the removal process described here results in the exact Fully-Qualified Domain Name which follows the "*." (U+002A ASTERISK, U+002E FULL STOP) that start the Wildcard Domain Name.
    • The corresponding FQDN here is analogous to the "FQDN to be included in a Certificate" described in the first sentence, and not the "FQDN used to obtain authorization".
    • Because of this, we can treat Wildcard Domain Names the same as FQDNs in the process of performing DNS CNAME Lookups because, for the purposes of determining an ADN, the Wildcard Domain Name is just the FQDN that's appended to the "wildcard" Domain Label of the Wildcard Domain Name.
  4. The CA may prune zero or more Domain Labels of the FQDN from left to right until encountering a Base Domain Name and may use any one of the values that were yielded by pruning (including the Base Domain Name itself) for the purpose of domain validation.

    • The sentence gives an additional mechanism for arriving at the FQDN that can be validated in accordance with Section 3.2.2.4, thus becoming an Authorization Domain Name.
    • The first identified input to this sentence is an FQDN. The origin of this FQDN is left unspecified.
    • I hope it's once again safe to assume that there is absolute consensus that the FQDN used as input to the pruning 'algorithm' in the context of selecting an Authorization Domain Name must be:
      • the FQDN to be included in the Certificate; or
      • the FQDN of a Wildcard Domain Name (as determined based on the pruning logic described in this definition) to be included in the Certificate.
    • Once the input FQDN is identified, the CA is to remove Domain Labels from the FQDN.
    • The ordering specified here ultimately ensures that the value resulting from the process always remains a valid FQDN. This isn't stated explicitly as an outcome, but should nonetheless be included in whatever logic CA systems have around determining whether an ADN is valid.
    • Domain Labels can only be removed until the result of removing a Domain Label is a Base Domain Name.
    • The definition for Base Domain Name is:

      Base Domain Name: The portion of an applied-for FQDN that is the first Domain Name node left of a registry-controlled or public suffix plus the registry-controlled or public suffix (e.g. "example.co.uk" or "example.com"). For FQDNs where the right-most Domain Name node is a gTLD having ICANN Specification 13 in its registry agreement, the gTLD itself may be used as the Base Domain Name.

    • This definition reveals that a Base Domain Name has its own set of complexities, but ultimately is allowed to be:
      • A TLD prepended by a single node representing the DNS Name value registered within that TLD's official registry;
      • A TLD prepended by one or more nodes representing a public suffix, all of which is then prepended by a single node representing the DNS Name value registered within that TLD's official registry; or
      • A single node representing the TLD where the TLD is classified by ICANN as a gTLD where the registry agreement of the registry operator of the gTLD includes ICANN Specification 13.
    • This brings into the context of an Authorization Domain Name the need to properly determine Base Domain Name values, which vary depending on the FQDN used as input to this pruning 'algorithm'.
    • Of particular note regarding Base Domain Names is that what qualifies as a "public suffix" is underspecified, not only within the TBRs (as noted in Section 3.2.2.6) but also within the Domain Name System in general. That is, public suffixes are not a property of DNS so their use in CA processes may not be entirely consistent across the ecosystem.
    • Also noteworthy: the definition of Base Domain Name includes all public suffixes in its set of valid outputs.
      • I highlight this because, to the best of my knowledge, despite this aspect of the TBRs being essentially based on accepted expectations rather than super-specific rules and requirements, there aren't widespread failures of CAs adhering to the generally expected use of the PSL. I believe this is praiseworthy (and I hope by pointing it out I'm not immediately proven incorrect 😅)
    • Any value found during the pruning process, including the Base Domain Name, may then be used as the FQDN which is validated in accordance with Section 3.2.2.4.
    • While the definition of Fully-Qualified Domain Name states that it includes "the Domain Labels of all superior nodes", this doesn't conflict with the definition of Base Domain Name nor this definition's use of Base Domain Name, because where the Base Domain Name is a gTLD there are no superior nodes for which their Domain Labels would also need to be included.
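
To make sentences 3 and 4 of the definition concrete, here is a minimal Python sketch of the wildcard removal and the pruning 'algorithm'. It is only an illustration under stated assumptions, not how any particular CA implements this. The function names (`strip_wildcard`, `candidate_adns`) are made up for this example, the Base Domain Name is passed in by the caller because determining it (public suffixes, Specification 13 gTLDs) is exactly the underspecified part, and the lowercasing and trailing-dot handling are my own normalization choices rather than anything the TBRs state.

```python
from typing import List

WILDCARD_PREFIX = "*."  # U+002A ASTERISK, U+002E FULL STOP


def strip_wildcard(name: str) -> str:
    """Return the FQDN following a leading "*." (unchanged if not a wildcard)."""
    if name.startswith(WILDCARD_PREFIX):
        return name[len(WILDCARD_PREFIX):]
    return name


def candidate_adns(name: str, base_domain_name: str) -> List[str]:
    """List every value yielded by pruning Domain Labels from left to right.

    base_domain_name is supplied by the caller (assumption): deciding what
    counts as a registry-controlled or public suffix is out of scope here.
    """
    # Normalization below is an assumption of this sketch, not a TBR rule.
    fqdn = strip_wildcard(name).rstrip(".").lower()
    base = base_domain_name.rstrip(".").lower()
    if fqdn != base and not fqdn.endswith("." + base):
        raise ValueError(f"{base!r} is not a suffix of {fqdn!r}")
    labels = fqdn.split(".")
    base_len = len(base.split("."))
    # Pruning zero labels yields the FQDN itself; pruning stops once the
    # remaining labels are exactly the Base Domain Name.
    return [".".join(labels[i:]) for i in range(len(labels) - base_len + 1)]
```

For example, `candidate_adns("*.a.b.example.co.uk", "example.co.uk")` returns `["a.b.example.co.uk", "b.example.co.uk", "example.co.uk"]`. Every element of that list is a permissible input to Section 3.2.2.4 under this reading, and every element remains a valid FQDN because labels are only ever removed from the left.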

With this (rather thoroughly overcooked) breakdown of the ADN definition itself, some additional observations stand out to me:

  1. The only input to the validation processes defined in Section 3.2.2.4 is an FQDN
  2. The use of multiple types of FQDNs in the ADN definition is not perfectly clear, but also fairly logical when carefully reviewed
  3. The use of FQDN or Fully-Qualified Domain Name anywhere else in the TLS Baseline Requirements warrants some additional scrutiny, because the definition of ADN draws an equivalency in that what the ADN ends up being is an FQDN.
    • For example, 3.2.2.4.18 indicates that it is confirming control of an FQDN (and only a single FQDN); however, it also allows for the use of an ADN within the process.
  4. The use of Authorization Domain Name, especially throughout Section 3.2.2.4, is more nuanced than the text itself fully implies.
  5. When viewing a dNSName value in a Certificate, it is not possible to determine what ADN was used when validating the dNSName value, with (I believe) the singular exception of a gTLD being included, having only one possible ADN. There is little to no transparency currently available regarding what ADNs are used to validate the dNSName values included in TLS Certificates.
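
As a small illustration of observation 5 combined with the CNAME sentence of the definition, the sketch below performs exactly one CNAME lookup and uses the RDATA target directly. This follows the narrower reading discussed above (no CNAME chaining), which is one possible interpretation and not an established consensus. Everything named here is hypothetical: `adn_via_cname`, the `CnameLookup` type, and the stub zone data stand in for a real resolver, and the domain names are invented.

```python
from typing import Callable, Optional

# Hypothetical resolver type: takes a query name and returns the CNAME RDATA
# target, or None when no CNAME record exists. A real system would query DNS.
CnameLookup = Callable[[str], Optional[str]]


def adn_via_cname(query_name: str, lookup: CnameLookup) -> str:
    """Perform exactly one CNAME lookup and return the name to validate.

    Narrow reading (assumption): the RDATA target is used directly and is
    not queried again, so chained aliases are not followed.
    """
    target = lookup(query_name)
    if target is None:
        # No CNAME record: fall back to the queried name itself.
        return query_name
    return target.rstrip(".").lower()


# Stub zone data (invented names) standing in for real DNS responses.
STUB_ZONE = {
    "www.example.com": "edge.cdn-provider.example.",
    "edge.cdn-provider.example": "pop7.cdn-provider.example.",
}

adn = adn_via_cname("www.example.com", STUB_ZONE.get)
```

Here `adn` ends up as `edge.cdn-provider.example`, a name that shares nothing visible with the `www.example.com` dNSName that would appear in the Certificate, and the second alias in the stub data is never followed. Nothing in the issued Certificate records which of these names was actually validated.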

timfromdigicert commented 1 month ago

Thanks for this excellent and detailed summary. I didn’t read anything I disagreed with (but might have missed a detail or two).

The most common misunderstanding I hear from other CAs is the idea that the ADN is unique and/or there is an algorithm to determine it, when in reality the CA has some discretion here to choose a “convenient” ADN, and this is by design.

The current language is somewhere between unclear and mildly misleading, due to the use of the passive voice: “The FQDN used to obtain authorization for a given FQDN to be included in a Certificate.” The definite article tends to imply there’s only one, when the truth is that there’s only one which is “the” one THAT IS USED BY THE CA, among a variety of potential choices.

It perhaps would be clearer as something like:

“The FQDN chosen by the CA to attempt to obtain authorization for a given FQDN or Wildcard Domain Name to be included in a Certificate.”

-Tim

From: Clint Wilson @.> Sent: Thursday, May 16, 2024 10:07 AM To: cabforum/servercert @.> Cc: Subscribed @.***> Subject: [cabforum/servercert] Authorization Domain Names 😬 (Issue cabforum/definitions#5)
