publicsuffix / list

The Public Suffix List
https://publicsuffix.org/
Mozilla Public License 2.0

Removing wildcard for cloudapp.azure.com #1944

Closed edwa001 closed 5 months ago

edwa001 commented 5 months ago

Public Suffix List (PSL) Pull Request (PR) Template

Each PSL PR needs to have a description, rationale, indication of DNS validation and syntax checking, as well as a number of acknowledgements from the submitter. This template must be included with each PR, and the submitting party MUST provide responses to all of the elements in order to be considered.

Checklist of required steps

Submitter affirms the following:


For Private section requests that are submitting entries for domains that match their organization website's primary domain, please understand that this can have impacts that may not match the desired outcome and take a long time to rollback, if at all.

To ensure that requested changes are entirely intentional, make sure that you read the affectation and propagation expectations, that you understand them, and confirm this understanding.

PR Rollbacks have lower priority, and the volunteers are unable to control when or if browsers or other parties using the PSL will refresh or update.

(Link: about propagation/expectations)

Description of Organization

Microsoft Azure is the world's computer. Microsoft empowers every individual and every organization on earth to achieve more.

Organization Website:

https://www.microsoft.com/

Reason for PSL Inclusion

NOTE: This is a modification of a previously included domain that was added in an incorrect format. We seek to have segmentation of these namespaces to appropriately isolate cookie and other divisions within browsers and applications. Similar to prior requests made by or on behalf of the Corporate Domains division at Microsoft, these requests contribute to the stable, secure and resilient operation of Azure on behalf of its customers and partners.

Number of users this request is being made to serve:

Azure currently supports millions of users.

DNS Verification via dig

Results of Syntax Checker (make test)

ciaranj commented 4 months ago

This change appears to have broken letsencrypt usage against virtual machines created automatically within Microsoft Azure. These machines receive addresses of the form <MACHINE_NAME>.<REGION, e.g. uksouth>.cloudapp.azure.com. My understanding of the Public Suffix List is that it should contain a set of domains, e.g. uksouth.cloudapp.azure.com, since these are the level at which subdomains are operated by unaffiliated organizations. Is that right?

simon-friedberger commented 4 months ago

@ciaranj Sounds correct, I don't know why the initial request was made so I cannot comment on the intent. cc @edwa001

ciaranj commented 4 months ago

@simon-friedberger I'm unsure what the process for submitting a PR is. For example, how was @edwa001 verified? The account appears to have existed solely for the creation of this PR, and it worries me slightly that this could be malicious (I'm not entirely sure what the impact would be on partitioned cookies, for example). /edited for tone/

dnsguru commented 4 months ago

I verified @edwa001 was in fact a Microsoft employee and making the request on behalf of Microsoft Corporation and Azure by speaking with him and another person in their corporate domains division. The requestor was well-vetted.

edwa001 commented 3 months ago

> This change appears to have broken letsencrypt usage against virtual machines created automatically within Microsoft Azure. These machines receive addresses of the form <MACHINE_NAME>.<REGION, e.g. uksouth>.cloudapp.azure.com. My understanding of the Public Suffix List is that it should contain a set of domains, e.g. uksouth.cloudapp.azure.com, since these are the level at which subdomains are operated by unaffiliated organizations. Is that right?

@ciaranj Is it still broken? We were aware of some issues generating certificates, so we completely removed cloudapp.azure.com from the PSL in https://github.com/publicsuffix/list/pull/1966

ciaranj commented 3 months ago

@edwa001 Ah. Unfortunately, I took the advice from letsencrypt and just completed moving everything over to our own (azure managed) domain today, as the approach didn't seem meaningfully more difficult in the end. I'll re-test the old approach and let you know. Thank you.

ciaranj commented 3 months ago

There is, it seems, a layer of indirection that I hadn't realised previously, between letsencrypt and the official list, via this project/logic. Presumably whether it breaks or not depends entirely on how releases of this list, the publicsuffix-go project and the letsencrypt boulder server overlap.

Right at this moment, it appears that letsencrypt certificates are 'at risk' of failing to be created against azure provided virtual machine FQDNs, i.e. yes it is still broken.

Failed to create order: Error creating new order :: too many certificates already issued for "uksouth.cloudapp.azure.com". Retry after 2024-04-25T20:00:00Z: see https://letsencrypt.org/docs/rate-limits/

However, it seems likely that your most recent change will propagate through to letsencrypt in time and they will work again at some point in the future.

Personally, I'm no longer affected, as I've added my own controllable layer of indirection between the hostnames that letsencrypt issues certificates for and the azure FQDNs.

dnsguru commented 3 months ago

Interesting dialog. Also, weppos, who runs the project/logic mentioned above, created THIS repo and is a fellow maintainer here.

simon-friedberger commented 3 months ago

@ciaranj Just to make sure we're on the same page, I don't think there is anything for us to do here. Letsencrypt suggested the correct solution which does not rely on the PSL.

@edwa001 You didn't specify what exactly the problem was that triggered #1966, but my understanding is that letsencrypt rate limits certificate issuance per registrable domain. So, without any cloudapp.azure.com entry on the PSL, all those hostnames count towards azure.com.

ciaranj commented 3 months ago

@simon-friedberger yes, we're on the same page here.

aarongable commented 3 months ago

> @ciaranj Is it still broken? We were aware of some issues generating certificates, so we completely removed cloudapp.azure.com from the PSL in #1966

That removal will only incidentally make things better for Azure Cloudapp customers requesting certificates from Let's Encrypt.

As long as *.cloudapp.azure.com was on the PSL, each regional subdomain (such as swedencentral.cloudapp.azure.com or ukwest.cloudapp.azure.com) counted as its own Public Suffix. This had two benefits:

  1. Every "registered" domain under a regional subdomain was considered a separate origin for web security purposes.
  2. Every "registered" domain under a regional subdomain had its own Let's Encrypt rate limits bucket.

When the wildcard was removed, each regional subdomain became its own "registered domain" as far as Let's Encrypt rate limits are concerned, and all Azure customers in each region began consuming the same shared rate limit bucket. This quickly resulted in folks running into rate limit rejections.

With the entry wholly removed, the top-level azure.com counts as the rate limit bucket. All Azure customers in all regions will be consuming the same rate limit bucket. And the content that all Azure customers host will be considered to be in the same web security origin as azure.com itself, which seems like a major issue to me. (This is, for example, why GitHub only serves raw files on the separate githubusercontent domain.) I do not believe this is the desired outcome.
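To make the three states concrete, here is a minimal Python sketch of how a registered domain falls out of a PSL rule set. It is an illustration only: `registrable_domain` and the three rule sets are hypothetical names and contents inferred from this thread, not the real list or the code Let's Encrypt runs, and the matcher is simplified (longest matching rule wins, `*` matches one label, exception rules starting with `!` are ignored).

```python
def registrable_domain(hostname, rules):
    """Return the public suffix plus one label, or None if the
    hostname is itself a public suffix. Simplified PSL matching."""
    labels = hostname.lower().rstrip(".").split(".")
    best = 1  # implicit default rule "*": the last label is a suffix
    for rule in rules:
        rule_labels = rule.split(".")
        n = len(rule_labels)
        if n > len(labels):
            continue
        tail = labels[-n:]
        # A rule matches when every label is equal or the rule has "*".
        if all(r == "*" or r == l for r, l in zip(rule_labels, tail)):
            best = max(best, n)
    if len(labels) <= best:
        return None
    return ".".join(labels[-(best + 1):])


# Assumed rule sets for the three states discussed in this thread.
WITH_WILDCARD = {"com", "*.cloudapp.azure.com"}    # before #1944
WILDCARD_REMOVED = {"com", "cloudapp.azure.com"}   # after #1944
ENTRY_REMOVED = {"com"}                            # after #1966

host = "myvm.uksouth.cloudapp.azure.com"
print(registrable_domain(host, WITH_WILDCARD))     # myvm.uksouth.cloudapp.azure.com
print(registrable_domain(host, WILDCARD_REMOVED))  # uksouth.cloudapp.azure.com
print(registrable_domain(host, ENTRY_REMOVED))     # azure.com
```

The second result is the same name that appears in the rate limit error quoted earlier in the thread, and the third shows every customer hostname collapsing into azure.com.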

The reason the removal may make things slightly better for Azure customers getting Let's Encrypt certificates is that Let's Encrypt has a "rate limit override" in place for azure.com, meaning that the rate limit bucket all those customers are sharing is at least a larger bucket. However, it's not a huge bucket -- Azure last requested a rate limit override increase in 2018, intended for their own use, not the use of all of their customers -- and Azure customers will run into the rate limit again sooner or later.
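The effect of a shared bucket, with and without an override, can be sketched roughly as follows. Everything here is hypothetical: `issue_all`, the default limit of 2 and the override of 4 are made-up illustration values, not Let's Encrypt's real limits or the size of the azure.com override, and the sketch ignores time windows entirely.

```python
from collections import Counter

DEFAULT_LIMIT = 2             # hypothetical per-bucket limit
OVERRIDES = {"azure.com": 4}  # hypothetical override size


def issue_all(requests, default_limit=DEFAULT_LIMIT, overrides=OVERRIDES):
    """requests: iterable of (hostname, registered_domain) pairs.
    Returns the hostnames rejected because their bucket was full."""
    issued = Counter()
    rejected = []
    for hostname, bucket in requests:
        limit = overrides.get(bucket, default_limit)
        if issued[bucket] >= limit:
            rejected.append(hostname)
        else:
            issued[bucket] += 1
    return rejected


# Five unrelated customers in one region.
hosts = [f"vm{i}.uksouth.cloudapp.azure.com" for i in range(5)]

per_host = [(h, h) for h in hosts]                               # wildcard on the PSL
per_region = [(h, "uksouth.cloudapp.azure.com") for h in hosts]  # wildcard removed
per_apex = [(h, "azure.com") for h in hosts]                     # entry removed

print(issue_all(per_host))    # []
print(issue_all(per_region))  # vm2, vm3 and vm4 are rejected
print(issue_all(per_apex))    # only vm4 is rejected
```

The larger override delays the rejections but does not remove them, because every customer still draws from the same bucket.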

ciaranj commented 3 months ago

@aarongable Ahh, that override for azure.com is a missing piece that I didn't have. I couldn't work out why this wasn't an issue for a lot more people. I am really puzzled as to why the changes have come in; what was in place prior to these two changes appeared sensible to me! Thank you for that insight! (I also concur with regard to the security concerns; I raised them further up in the thread myself!)