BRE-ISNIC / bre-doh-analysis

To DoH or not to DoH?

DNScrypt issues #4

Open bortzmeyer opened 4 years ago

bortzmeyer commented 4 years ago

"This is an attractive approach; simplicity is usually a good thing when it comes to cryptographic security, and UDP is faster than TCP." But a custom protocol will have less review and less testing than a well-established protocol like TLS. And DNScrypt lacks cryptographic agility (see RFC 7696).

(Not to mention the lack of standardisation.)

BRE-ISNIC commented 4 years ago

True!

But for what it's worth, cryptographic agility is to a significant degree to blame for the high-profile vulnerabilities TLS has had. TLS 1.3 was important (and controversial) in part because it reduced that agility; the current trend is towards simplification, because as an industry we do not have a good track record of managing the emergent complexity of flexible protocols. Especially in security.

These points of view underlie some of the more nuanced technical opposition to DoH (it inherits huge amounts of complexity and bugs); I have the feeling that DNScrypt would appeal to people in that camp (DoT still inherits "too much" cruft from TLS).

That's how I read the current debates and trends anyway. But going into all that just felt like too much detail for the document.

jedisct1 commented 4 years ago

DNSCrypt does support cryptographic agility. Certificates include the list of supported algorithms.

In fact, the original version used XSalsa20-Poly1305, and later on, XChacha20-Poly1305 (widely implemented and now an IETF draft) was added as an option, and gradually became the one most implementations use.
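For illustration, here is a minimal sketch of how a client might negotiate the construction, assuming the 2-byte es-version field that DNSCrypt v2 certificates carry (the `es_version` attribute here is a hypothetical representation of an already-parsed certificate, not an actual library API):

```python
# Sketch of DNSCrypt cipher negotiation. The version numbers follow the
# DNSCrypt v2 certificate's es-version field; `es_version` is a
# hypothetical attribute on an already-parsed certificate object.

ES_VERSIONS = {
    1: "X25519-XSalsa20-Poly1305",   # original construction
    2: "X25519-XChaCha20-Poly1305",  # later addition, now the common choice
}

def pick_cipher(certificates):
    """Choose the preferred construction among those the server advertises."""
    advertised = {cert.es_version for cert in certificates}
    for version in (2, 1):  # prefer XChaCha20 when both are offered
        if version in advertised:
            return ES_VERSIONS[version]
    raise ValueError("no mutually supported DNSCrypt construction")
```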

Fragmentation is not an issue due to queries and responses being authenticated in addition to being encrypted.

DNSCrypt cannot be used for amplification attacks, even on UDP. Queries are required to be at least as large as the response. If a response would be larger, what is sent is a minimal response with the TC flag, and the client retries over TCP and can adjust the query padding accordingly.
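Roughly, that rule looks like this on the server side (a sketch, not the actual DNSCrypt code; a real server would typically echo the question section rather than send a bare header):

```python
# Sketch of the anti-amplification rule over UDP: a response may never
# be larger than the (padded) query it answers.

QR = 0x8000  # "this is a response" bit in the DNS header flags
TC = 0x0200  # "truncated" bit

def udp_reply(query_packet: bytes, response: bytes) -> bytes:
    if len(response) <= len(query_packet):
        return response
    # The full response would amplify, so send a minimal reply with TC
    # set; the client then retries over TCP (and can pad more next time).
    header = bytearray(query_packet[:12])
    flags = int.from_bytes(header[2:4], "big") | QR | TC
    header[2:4] = flags.to_bytes(2, "big")
    header[4:12] = bytes(8)  # zero the four section counts
    return bytes(header)
```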

Anonymized DNSCrypt cannot be used for amplification either. By design, relayed packets can never be larger than received packets (the header is removed, nothing is added).

jedisct1 commented 4 years ago

From an implementer's perspective, DoH is way more complicated than it looks, not necessarily due to TLS.

Some of the reasons why:

Server-side

Client-side

A DoH stub resolver in a web browser can always fall back to the system resolver if needed. However, when the stub resolver is used system-wide (as a local resolver or on routers), this is not the case.

The system is typically configured to use the DoH stub resolver for resolution, but that resolver needs to resolve the name of the DoH server before it can be used. The usual answer is a fallback resolver: a third-party resolver used just to resolve the DoH server name.

But on many networks, port 53 is blocked, either because enterprise networks force people to use a local resolver, or because users set up firewall rules themselves to avoid DNS leaks. So that fallback resolver is not going to be accessible, unless it is a resolver on the local network. But since the system hasn't been configured to use it, how is the client supposed to learn its IP? The DoH client must implement DHCP, and allow per-network static configuration as well (which in turn requires WiFi SSID detection).
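Put together, the bootstrap cascade looks something like this (a hedged sketch: `resolve_a()` and `dhcp_nameservers()` are hypothetical helpers standing in for a real DNS client and a DHCP query; the fallback order is the point):

```python
def bootstrap_doh_ip(doh_name, fallback_resolvers, per_ssid_config, ssid):
    if ssid in per_ssid_config:
        # Per-network static configuration wins (hence SSID detection).
        candidates = per_ssid_config[ssid]
    else:
        # Try the configured third-party fallbacks first, then whatever
        # DHCP advertised locally, since outbound port 53 may be blocked.
        candidates = list(fallback_resolvers) + dhcp_nameservers()
    for server in candidates:
        try:
            return resolve_a(doh_name, server, timeout=2.0)
        except (TimeoutError, OSError):
            continue  # blocked or unreachable; try the next candidate
    raise RuntimeError(f"could not bootstrap {doh_name}")
```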

Eventually, the RRset of the DoH server name will expire. And, in fact, DoH server operators tend to use ridiculously low TTLs. As a random example, doh.powerdns.org uses a 300-second TTL. So, every 5 minutes (actually every 2.5 minutes on average, due to TTLs decreasing in caches), clients are expected to ask the server for its own IP address.

Of course, you don't want that to be a blocking operation, so clients need to prefetch this in background threads.
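A minimal sketch of such a prefetch loop, assuming a hypothetical `resolver` callable that returns the addresses along with their TTL:

```python
import threading
import time

cache = {}  # doh_name -> list of IP addresses

def prefetch_loop(doh_name, resolver):
    """Keep the DoH server's RRset fresh without blocking lookups."""
    while True:
        ips, ttl = resolver(doh_name)  # hypothetical: returns (ips, ttl)
        cache[doh_name] = ips
        # Refresh at half the advertised TTL (every ~150 s for a 300 s
        # TTL) so the cached entry never expires mid-lookup.
        time.sleep(max(ttl / 2, 1))

# Run it off the critical path:
# threading.Thread(target=prefetch_loop,
#                  args=("doh.example.net", my_resolver),
#                  daemon=True).start()
```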

At the end of the day, the bootstrapping operation is quite painful to implement, and unreliable for users. Not to mention that arbitrary fallback resolvers cannot be used (for example, Cisco blocks the Cloudflare resolver names, as they are considered proxies).

The bootstrap process also raises privacy concerns. If the fallback resolver and the DoH resolver are operated by the same entity, a unique IPv6 address can be returned for each query, allowing client fingerprinting even across IP changes. So clients need to detect network changes and force a new resolution, which is quite tricky if the stub resolver doesn't run on the router.

As an alternative, a DNS configuration can include the DoH server's IP address, so no fallback resolvers are needed any more. But unless a certificate for the IP address itself exists, clients need to implement a way to connect directly to that IP and establish a TLS session with a specific SNI value. Once again, this is not trivial.
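For illustration, here is what that looks like with a stock TLS library (Python's ssl module; the IP and name are placeholders). The trick is that the dialed address and the name used for SNI and certificate verification are set independently:

```python
import socket
import ssl

DOH_IP = "192.0.2.1"          # pinned address from the DNS configuration
DOH_NAME = "doh.example.net"  # what the certificate must match

ctx = ssl.create_default_context()
sock = socket.create_connection((DOH_IP, 443), timeout=5)
# server_hostname sets both the SNI value and the name checked
# against the certificate, independent of the IP we dialed.
tls = ctx.wrap_socket(sock, server_hostname=DOH_NAME)
# tls is now ready for the HTTP/2 or HTTP/1.1 DoH exchange
```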

Other random things a client must implement:

BRE-ISNIC commented 4 years ago

Thank you for the comments, @jedisct1 - I'll try to update the document to take these points better into account. Although I don't want to spend a great deal of time explaining other protocols, I do want even minor statements to be factually correct.

jedisct1 commented 4 years ago

Awesome! Thanks for writing this document.

Also, "There is a third security concern, addressed directly by none of these, which is client anonymity" is not correct. This is exactly what Anonymized DNSCrypt provides, and it is already deployed, with public relays everywhere (https://github.com/DNSCrypt/dnscrypt-resolvers/blob/master/v2/relays.md).

bortzmeyer commented 4 years ago

"Some servers support POST, some only support GET" If so, they are wrong (RFC 8484, section 4.1).

That's why a compliance testing tool would be a good idea: https://framagit.org/bortzmeyer/homer/issues/9

bortzmeyer commented 4 years ago

"so Cache-Control must be used instead" Which is indeed what the RFC says, for all servers.

jedisct1 commented 4 years ago

There are valid reasons to only support GET. Reverse proxies and CDNs cannot cache POST queries. Existing clients, including Firefox, either automatically try both or have the verb as a setting. This is an expected feature.
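For reference, a minimal RFC 8484 (section 4.1) GET query looks like this (the server URL is a placeholder). The wire-format DNS message goes base64url-encoded, with padding stripped, into the "dns" parameter, and this is the form that proxies and CDNs can cache:

```python
import base64
import urllib.request

def doh_get(wire_query: bytes,
            url: str = "https://doh.example.net/dns-query") -> bytes:
    # base64url encoding without "=" padding, as RFC 8484 requires
    b64 = base64.urlsafe_b64encode(wire_query).rstrip(b"=").decode()
    req = urllib.request.Request(
        f"{url}?dns={b64}",
        headers={"Accept": "application/dns-message"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()  # wire-format DNS response
```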

Cache-Control should be used since HTTP/1.1, but is not what all HTTP libraries use. Some still unconditionally rely on Expires. One has to carefully verify that the library they use can actually parse Cache-Control. If not, caching will appear to work just fine. And suddenly break when used with Google. So, this is another thing to worry about when implementing a client.