privacycg / proposals

New proposals in the Privacy Community Group
https://privacycg.github.io
DNS TLD for Privacy #36

Open mnot opened 1 year ago

mnot commented 1 year ago

Rumour has it that ICANN is considering another run of the new gTLD program.

Last time around, Google registered .app and runs it with an additional semantic: all domains in that TLD are automatically on the HSTS preload list, effectively enforcing HTTPS for any server with an .app domain.

What if something similar were done with a privacy focus? For sake of argument, let's say .priv[^1] is registered, and there's agreement that browsers will not allow any third-party requests from those domains. The registrar might also insert contractual terms that limit first-party tracking.
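As a rough illustration of the browser-side rule described above, here is a minimal sketch. The names `PRIVACY_TLDS`, `registrable_domain`, and `allow_request` are hypothetical, and "same site" is approximated by comparing the last two hostname labels; a real browser would consult the Public Suffix List instead.

```python
from urllib.parse import urlsplit

# Hypothetical: ".priv" is only the placeholder name used in this thread.
PRIVACY_TLDS = {"priv"}


def host_of(url: str) -> str:
    """Return the lowercased hostname of a URL, or "" if it has none."""
    return (urlsplit(url).hostname or "").lower().rstrip(".")


def registrable_domain(host: str) -> str:
    # Simplifying assumption: the site is the last two labels.
    # Real browsers use the Public Suffix List to find eTLD+1.
    return ".".join(host.split(".")[-2:])


def allow_request(page_url: str, request_url: str) -> bool:
    """Block third-party requests made by pages on a privacy TLD."""
    page_host = host_of(page_url)
    tld = page_host.rsplit(".", 1)[-1]
    if tld not in PRIVACY_TLDS:
        return True  # ordinary TLD: no extra restriction in this sketch
    return registrable_domain(page_host) == registrable_domain(host_of(request_url))
```

Under these assumptions, a page on `shop.priv` could still load from `cdn.shop.priv`, but not from an unrelated host.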

Sites with .priv domains could then believably market themselves as privacy-focused, giving them an advantage with privacy-conscious users / customers.

This would also provide an opportunity for browsers to try out new techniques for privacy in a 'sandbox' that's already privacy-focused.

Just thinking out loud here - any interest? Obviously it'd need good browser support. Best path forward might be to define an opt-in signal for sites first, just like HSTS did.
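For reference, the HSTS opt-in that this would be modelled on is a single response header. The sketch below parses a `Strict-Transport-Security` value into its parts; the function name `parse_hsts` is hypothetical, the parsing is deliberately loose, and nothing here defines what a privacy equivalent would look like.

```python
def parse_hsts(value: str) -> dict:
    """Parse a Strict-Transport-Security header value (loosely)."""
    result = {"max_age": None, "include_subdomains": False, "preload": False}
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        name = name.strip().lower()  # directive names are case-insensitive
        if name == "max-age":
            # Raises ValueError on a malformed number; fine for a sketch.
            result["max_age"] = int(arg.strip().strip('"'))
        elif name == "includesubdomains":
            result["include_subdomains"] = True
        elif name == "preload":
            # Not part of RFC 6797; used as the preload-list opt-in signal.
            result["preload"] = True
    return result
```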

[^1]: I suspect .priv is not the right name here, but let's not bikeshed that at the moment

martinthomson commented 1 year ago

I think this needs careful consideration regarding the scope of what you might be requiring these hypothetical sites to commit to.

There are obviously some privacy-related things that are technically enforceable and so might engage machinery in browsers. I don't think that that is limited to the use of third-party requests.

There are also some privacy-related things that are legally enforceable and so might engage an entirely different sort of machinery in various jurisdictions.

And of course this is hardly a clean bifurcation. Some practices bleed across between the two. If your intended scope includes navigation tracking, then that is sometimes identifiable by a browser and sometimes not.

What concerns me more is that this specific shape risks ending up as a marketing stunt more than delivering people with meaningful privacy advantages. A gTLD that was specifically "private" ends up positioning itself relative to all the other gTLDs. And this is just one of many dimensions that are relevant when it comes to making decisions about which domains to interact with and how to do so.

Google attaching what amount to public hygiene conditions to registrations of .app domains certainly helps improve online security. But there is no expectation or implication that non-.app domains are any less secure because security is not the primary reason for domains in that gTLD. The requirement is just an easy on-ramp to a good practice for sites looking to use an otherwise appealing gTLD.

Another way of approaching this is to have a site-level commitment to a particular policy, with markers that allow for various types of enforcement, automated or otherwise. GPC has something like that already, which presumably intends to engage with the jurisdictional mechanisms. A browser-level one can effectively be described with CSP (connect-src 'self'), but I think that you are probably looking for something that makes intent more explicit.
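To make the CSP option concrete, here is a minimal sketch that checks whether a policy string limits script-initiated connections to the page's own origin. The helper names `parse_csp` and `connects_only_to_self` are hypothetical, and the parsing is simplified. Note that `connect-src` only governs fetch/XHR/WebSocket-style connections (falling back to `default-src` when absent), so it would not by itself stop third-party images or scripts.

```python
from typing import Dict, List


def parse_csp(policy: str) -> Dict[str, List[str]]:
    """Split a Content-Security-Policy value into {directive: sources}."""
    directives: Dict[str, List[str]] = {}
    for part in policy.split(";"):
        tokens = part.split()
        if not tokens:
            continue
        # The first occurrence of a directive wins; later duplicates are ignored.
        directives.setdefault(tokens[0].lower(), tokens[1:])
    return directives


def connects_only_to_self(policy: str) -> bool:
    """True if the effective connect-src is exactly 'self'."""
    d = parse_csp(policy)
    # connect-src falls back to default-src when it is not present.
    sources = d.get("connect-src", d.get("default-src"))
    return sources == ["'self'"]
```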

-- not-chair-comment

mnot commented 1 year ago

> What concerns me more is that this specific shape risks ending up as a marketing stunt more than delivering people with meaningful privacy advantages. A gTLD that was specifically "private" ends up positioning itself relative to all the other gTLDs. And this is just one of many dimensions that are relevant when it comes to making decisions about which domains to interact with and how to do so.

... which is why I noted that .priv might not be the best name.

Generally I agree that it's better to talk about the specific mechanisms to improve privacy rather than how they're invoked. What's interesting about embedding that mechanism in the name is not only that it's visible, but also that it's immutable -- to change their commitment, a site has to change its name. Of course, there are other ways to make things sticky on the Web.

martinthomson commented 1 year ago

> there are other ways to make things sticky on the Web.

:) HTTP/1.1 301 ... Location: https://site.com/...

annevk commented 1 year ago

I would be more comfortable with this if there was an equivalent to HSTS (preloading) for the kind of feature you envision.

mikewest commented 1 year ago

An alternative might be to look at the other parts of a hostname, locking in certain properties using a prefixing trick similar to what we've done for cookies. https://www.youtube-nocookie.com/embed/dQw4w9WgXcQ could be https://nocookie.youtube.com/embed/dQw4w9WgXcQ, for example, instructing the browser to ignore cookies for that host. I think that might be a useful concept for other kinds of properties (you could imagine ed25519-[publickey-goes-here].example.com as enforcing a signing requirement, or tls.example.com requiring TLS). This has the same drawbacks as cookie prefixes, most notably the requirement of coming up with meaningful bundles of the properties you actually want to offer developers, but the concept doesn't seem out of the question to me.
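A minimal sketch of how a client might derive properties from the leftmost hostname label, in the spirit of cookie prefixes. The label names (`nocookie`, `tls`, and a key-carrying Ed25519 label) come from the examples above; `HostPolicy` and `policy_for_host` are hypothetical names, and no browser implements this today.

```python
from dataclasses import dataclass
from typing import Optional

KEY_PREFIX = "ed25519-"  # assumed spelling of the key-carrying label


@dataclass
class HostPolicy:
    ignore_cookies: bool = False
    require_tls: bool = False
    signing_key: Optional[str] = None  # opaque key text taken from the label


def policy_for_host(hostname: str) -> HostPolicy:
    """Map the leftmost label of a hostname to locked-in properties."""
    label = hostname.lower().split(".", 1)[0]
    policy = HostPolicy()
    if label == "nocookie":
        policy.ignore_cookies = True
    elif label == "tls":
        policy.require_tls = True
    elif label.startswith(KEY_PREFIX) and len(label) > len(KEY_PREFIX):
        policy.signing_key = label[len(KEY_PREFIX):]
    return policy
```

Because the decision depends only on the hostname, it can be made before any request is sent, which is the property the prefix approach is after.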

kdenhartog commented 1 year ago

Would it make sense to make this a scheme rather than a TLD? Seems like that would create better backwards compatibility here rather than requiring a separate TLD for this. Additionally, I think the troublesome question here is what features are guaranteed when doing this?

E.g. if I go to privhttps:// what guarantees am I getting from this site and what enforcement mechanism is the browser offering for this so that it doesn't break compatibility? For example, would we expect that no scripts would run on a page like this? How might we guarantee that no tracker pixels have been added to subvert known methods?

Overall, I think there are some good ideas in adding a class of website identifiers that are easily recognisable as private, but from the discussions I had with our team we're not sure if this is the best path to approach it.

pes10k commented 1 year ago

+100 for having this functionality, but +100 too for moving the signal to the protocol (or something similar) instead of the TLD

That has the benefit of the user knowing the privacy benefits of the site before they visit it, but:

  1. without conflating domain names with privacy guarantees (since we currently explicitly tell people to NOT think of sites like secure.example.org as more secure than example.org)
  2. allowing sites to upgrade / opt-in for free (no new domain registrations needed)
  3. lots of sites have selected their domains for specific reasons (e.g., https://cr.yp.to) and we'd presumably like them to be able to use this more-secure feature

mikewest commented 1 year ago

> (since we currently explicitly tell people to NOT think of sites like secure.example.org as more secure than example.org)

  1. We discourage that today because it's not true. Presumably if we made the hostname meaningful, we'd stop discouraging folks from making such intuitive leaps. :)
  2. I think the hostname-based mechanism is more useful for web developers (by choosing subresource/widget hosts that lock in strong security properties) than for users directly. I don't think there's a huge appetite for asking users to type in a public key hash, for instance, though there might be useful benefits to a developer who relies on subresources served from such an origin.

pes10k commented 1 year ago

> We discourage that today because it's not true

Sure, point taken, but I expect that it's easier for folks to understand and follow security advice that is consistent and doesn't have a bunch of decision-tree branches. Right now we tell people "the domain tells you who you're talking to, the protocol tells you [partially] about the privacy/security of that conversation".

My point is that keeping the kind of restriction discussed in this thread to the protocol (https-priv:// or whatever) is compatible with existing security advice in a way that the TLD isn't.

mikewest commented 1 year ago

> Right now we tell people "the domain tells you who you're talking to, the protocol tells you [partially] about the privacy/security of that conversation"

I think you're overestimating the protocol's place in users' minds. Browsers have generally aligned on stripping the protocol from the address bar when the user isn't directly interacting with it, relying instead on iconography to represent (negative) security properties.

> My point is that keeping the kind of restriction discussed in this thread to the protocol (https-priv:// or whatever) is compatible with existing security advice in a way that the TLD isn't

Hostnames have the nice property of being backwards compatible with browsers. Protocols take some time to gain adoption, have more pervasive impact on core specifications (Fetch, for instance) and fail hard when unsupported.
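A small sketch of the compatibility difference, assuming a legacy client that only knows `http` and `https`. The function name `legacy_client_can_load` is hypothetical, and `https-priv` is just the placeholder scheme from this thread.

```python
from urllib.parse import urlsplit

# Schemes an older, unmodified client is assumed to understand.
SUPPORTED_SCHEMES = {"http", "https"}


def legacy_client_can_load(url: str) -> bool:
    """A legacy client loads any hostname, but only over schemes it knows."""
    return urlsplit(url).scheme in SUPPORTED_SCHEMES


# A hostname-based signal degrades gracefully: the URL still loads,
# the client simply does not apply the extra restrictions.
print(legacy_client_can_load("https://nocookie.example.com/"))

# A new scheme fails hard: the client cannot load the URL at all.
print(legacy_client_can_load("https-priv://example.com/"))
```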

That said, I don't have terribly strong opinions about this. I think the idea of creating a mechanism that allows us to lock in a set of guarantees a priori such that we can reason about them at request time is good. If folks align around protocols, I won't argue about it. :)