thegreenwebfoundation / carbon.txt

A proposed convention for making it possible to demonstrate that your infrastructure uses green power
Apache License 2.0

Document flow for checking a carbon.txt file #12

Open mrchrisadams opened 1 year ago

mrchrisadams commented 1 year ago

The flow is as follows:

  1. Check the domain name is a valid one.
  2. Check whether there is a carbon-txt DNS TXT record for the given domain.
  3. Perform an HTTP request at https://domain.com/carbon.txt, OR at the override URL given as the value in the DNS TXT lookup.
  4. If there is a valid 200 response and a parseable file, parse the file.
  5. If there is no valid 200/OK response at https://domain.com/carbon.txt (e.g. a 404 or 403), check the HTTP response for a Via header naming a new domain, and use that as the next domain to check.
  6. Repeat steps 1 through 5 until we end up with either a 200 response with a parseable carbon.txt payload, or an error response (e.g. 40x, 50x) with no HTTP Via header.
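
The steps above can be sketched in Python as follows. This is a minimal sketch, not a reference implementation: the function name `find_carbon_txt`, the injected `lookup_txt` and `fetch` callables, the `max_hops` limit and the domain regex are all assumptions made for illustration. DNS and HTTP access are passed in as callables so the flow can be exercised without a network, and a non-empty body stands in for "parseable", since the file format is not described here.

```python
import re
from typing import Callable, List, Mapping, Optional, Tuple

# Hypothetical callable shapes (assumptions, not part of the proposal):
#   lookup_txt(domain) -> list of TXT record strings for that domain
#   fetch(url)         -> (status_code, response_headers, body_text)
TxtLookup = Callable[[str], List[str]]
Fetch = Callable[[str], Tuple[int, Mapping[str, str], str]]

# A deliberately simple hostname check; real validation may need to be stricter.
_DOMAIN_RE = re.compile(
    r"^(?=.{1,253}$)(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}$",
    re.IGNORECASE,
)


def find_carbon_txt(
    domain: str, lookup_txt: TxtLookup, fetch: Fetch, max_hops: int = 5
) -> Optional[Tuple[str, str]]:
    """Return (url, body) for the first usable carbon.txt, or None."""
    seen = set()
    # The hop limit and the "seen" set guard against Via loops (an assumption).
    while domain not in seen and len(seen) < max_hops:
        seen.add(domain)

        # Step 1: check the domain name is a valid one.
        if not _DOMAIN_RE.match(domain):
            return None

        # Step 2: look for a carbon-txt DNS TXT record giving an override URL.
        url = f"https://{domain}/carbon.txt"
        for record in lookup_txt(domain):
            if record.startswith("carbon-txt="):
                url = record[len("carbon-txt="):].strip()
                break

        # Step 3: request the default URL or the override URL.
        status, headers, body = fetch(url)

        # Step 4: a 200 response with a payload ends the search.
        if status == 200 and body.strip():
            return url, body

        # Step 5: otherwise look for a Via header naming a new domain.
        via = {k.lower(): v for k, v in headers.items()}.get("via")
        if not via:
            return None  # Step 6: error response with no Via header.
        # "Via: 1.1 alternative-domain.com" -> "alternative-domain.com"
        parts = via.split(",")[-1].split()
        if len(parts) < 2:
            return None
        domain = parts[1].split(":")[0]  # Step 6: repeat with the new domain.
    return None
```

When the Via header lists several intermediaries, this sketch takes the last entry; which entry should win is not stated in the flow, so treat that choice as a placeholder.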

Why do it this way?

This flow is designed to allow CDNs and managed service providers to serve information in a default carbon.txt file, whilst allowing "downstream" providers to share their own, more detailed information if need be.

Why support the carbon.txt DNS TXT record?

Supporting the DNS lookup allows an organisation that owns or operates multiple domains to refer to a single URL for them to maintain.

If you served traffic from a domain like cdn-domain.com, you would add a TXT record to cdn-domain.com, with the following content:

carbon-txt=https://actual-domain.com/carbon.txt

This would set an override URL, allowing multiple domains to point to a single carbon.txt file for an organisation.

The "override URL" also allows for organisations that prefer to serve their file from a .well-known directory to do so:

carbon-txt=https://actual-domain.com/.well-known/carbon.txt

This allows folks to support the .well-known convention of storing files in a clearly identified place where it makes sense to do so, without requiring it of people who do not know what a .well-known directory is, or who do not have control over what may be written to the .well-known directory on a server.
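
To illustrate how a checker might choose between the default path and an override, here is a minimal sketch. The function names `override_url` and `carbon_txt_url` are hypothetical, and the rule that an override must be an absolute https URL is an assumption added for the example, not something this issue specifies. The TXT records are passed in as plain strings, so any DNS library could supply them.

```python
from typing import Iterable, Optional
from urllib.parse import urlsplit

PREFIX = "carbon-txt="


def override_url(txt_records: Iterable[str]) -> Optional[str]:
    """Return the first usable carbon-txt override URL, or None."""
    for record in txt_records:
        # Some DNS tools return TXT values wrapped in double quotes.
        value = record.strip().strip('"')
        if not value.startswith(PREFIX):
            continue  # unrelated TXT record, e.g. SPF
        url = value[len(PREFIX):].strip()
        parts = urlsplit(url)
        # Assumption: only absolute https URLs are accepted as overrides.
        if parts.scheme == "https" and parts.netloc:
            return url
    return None


def carbon_txt_url(domain: str, txt_records: Iterable[str]) -> str:
    """Use the single override if present, otherwise the root default."""
    return override_url(txt_records) or f"https://{domain}/carbon.txt"
```

For example, `carbon_txt_url("cdn-domain.com", ["carbon-txt=https://actual-domain.com/.well-known/carbon.txt"])` yields the .well-known URL, while a domain with no matching record falls back to `https://<domain>/carbon.txt`.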

Why use the Via header?

Consider the case where managed-service-provider.com is hosting customer-a.com's website.

The managed service provider may be offering a CDN or managed hosting service, but they may not have control over the customer-a.com domain. They may not have, or want, direct control over what a downstream user is sharing at a given URL. However, because they are offering some service "in front" of customer-a's website, and serving it over a secure connection, they are able to add headers to HTTP responses.

The HTTP Via header exists specifically to serve this purpose, and provides a well-specified way to pass along the domain of the organisation providing a managed service, when that domain is different.

The link above outlines the spec, but for convenience you would add a header looking like so:

Via: 1.1 alternative-domain.com
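
On the checking side, the domain has to be pulled back out of that header value. The sketch below shows one way to do it; `via_hosts` is a hypothetical helper name. It is a simplification of the Via grammar: it ignores IPv6 literals and assumes comments contain no commas. Each entry has the form "protocol-version received-by", optionally followed by a comment, so the second token is the one of interest.

```python
from typing import List


def via_hosts(via_value: str) -> List[str]:
    """Extract the received-by host from each entry of a Via header value."""
    hosts = []
    for entry in via_value.split(","):
        tokens = entry.split()
        if len(tokens) < 2:
            continue  # empty or malformed entry
        # tokens[0] is the protocol version, tokens[1] is "host" or "host:port".
        host, _, _port = tokens[1].partition(":")
        hosts.append(host.lower())
    return hosts
```

Note that a Via entry may carry a pseudonym rather than a real hostname, so a checker would still want to validate each returned value as a domain before using it.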

Why use domain/carbon.txt as the path?

Defaulting to a root carbon.txt makes it possible to implement a carbon.txt file without needing to know about .well-known directories, which by convention are normally hidden. Having a single default place to look avoids needing to support a hierarchy of potential places to look, and precedence rules for choosing between them: there is either one place to default to when making an HTTP request, OR the single override.