Port redirection introduces new cross-protocol attack vector

davidben commented 3 years ago

Port numbers are a mess. :-( They are part protocol distinguisher (both sides agree on the protocol) and part endpoint virtualizer (multiple services on one IP). The virtualizer use case has all but eroded the distinguisher, but some corners still depend on the distinguisher.

SVCB and HTTPS allows the DNS to replace port numbers, adding a new instance of this. A network attacker can already swap ports and more, but not where the client is in a privileged position on the network. Consider services listening on localhost or a private network. Messing with the ports then means the attacker can direct the client to speak one protocol to a service that's expecting another protocol, mounting a cross-protocol attack. This is especially a concern for non-browser scenarios that may not be used to following URLs everywhere.

As a concrete example, consider a production network, with internal services on private addresses. This network does not run user-facing software like browsers, so it hasn't been forced into the browser threat model where anything can fetch any URL. This network runs a service, such as a CDN or web crawler, that fetches externally-controlled URLs. Provided those services only fetch URLs with standard ports, the network can assume a separation between internal non-HTTP services. (And if any internal HTTP services use 80 or 443, we're not cross-protocol and HTTP-based mechanisms like headers work.) Once folks implement SVCB, this assumption breaks and that external URL can direct a TLS ClientHello to any internal service by specifying a private IP and listening port. Combine this with DNS rebinding and TLS session resumption, and the attacker can even include arbitrary data in the middle of the ClientHello.

In the browser threat model, this kind of thing is hopeless in general. Even so, we maintain a list of blocked ports for some existing protocols and had to introduce new ones recently. (That attack confused a middlebox rather than an endpoint, but the general issue applies.) The web browser fix for this attack in SVBC/HTTPS is "obvious": rerun the bad port check deep in the HTTP stack. However, this only works because browsers usually have bespoke HTTP stacks, and we're already very used to following URLs everywhere. This may not be as simple for other applications. Even in browsers, not the Fetch spec only applies this check at the entry to HTTP, and on redirects. There is no spec infrastructure for extra bad port checks. (The bad port check "authenticates" that the port is not a blocked one. SVCB/HTTPS bypasses that authentication.)

Alt-Svc had this problem too. However:

It seems Alt-Svc forgot to document this. A browser implementing Alt-Svc needs to, like SVCB/HTTPS, replicate the bad port check after Alt-Svc. A non-browser HTTP client needs to do... something.
Alt-Svc requires state to be useful, so it is less likely non-browser HTTP clients implemented it. SVCB/HTTPS does not, which means we're more likely to trip the issue in those clients.
Not all browsers implemented Alt-Svc, since Alt-Svc had a mistake in its handling of ALPN.
SVCB supports even more protocols beyond HTTP. Those protocols may not have HTTP's history of following random URLs, so applications may be even more unprepared for this.

One mitigating factor is that, for http: and https: URLs, this SVCB/HTTPS records only work for HTTPS. An attack would have to collide a TLS ClientHello against the target protocol, which less likely than plaintext to plaintext. That said, TLS may still intersect with other protocols. The target protocol could happen to use a TLS code point as a delimiter and be too lax to notice other bits. And then there's the ticket issue above.

And then with SVCB, there's no guarantee the source protocol uses TLS, so all bets are off. If I'm reading it right, SVCB is defined for ftp: and other schemes.

What to do about this mess, I'm not sure. Removing port redirection would avoid this problem, and it indeed seems not worth it for most HTTP use cases over TCP. However, over UDP, QUIC benefits from port redirection, and protocol confusion is possible over UDP too.

We could go the security considerations route and punt to the application. But with what guidance? What's the right behavior for a general-purpose HTTP library? Should it import the WHATWG bad port list, despite that being browser-specific? That may not be enough for the production network example. Should it expose a callback? What if the application didn't know about the callback and the library is updated... should SVCB/HTTPS records be opt-in? Do we just handwaive and assume TLS won't collide with much? What about tickets? If we assume TLS, what do we tell other protocols? Do we ban SVCB for schemes by default and expect each scheme to opt-in after analyzing its protocol collision risk? How does one even avoid collision with an arbitrary, potentially very lax, protocol?

ericorth commented 3 years ago

Summarizing the scenario to make sure I understand correctly: Some client is making URL requests, and to meet some security assumptions/expectations, it is deliberately only making those requests for a specific port or at least blocking requests for some denylist of ports. The client then hands it off to some library to handle the protocol (eg a general purpose HTTP library). That library supports SVCB, and could follow port redirection, thereby breaking the client's security assumptions if the request then goes to a port the client would have blocked.

Did I get that right?

davidben commented 3 years ago

Yeah, that's what's going on in terms of mechanism. And in terms of security expectations, we do in practice care about what ports get used where. Browsers have that list of bad ports. And any server listening on a private address for some kind of access control is implicitly assuming that folks connecting to that private address are doing so intentionally.

The whole expectation is questionably sound at this point given how browsers handle URLs, but it hasn't completely eroded away even in browsers. I suspect there are environments which would care even more, though I'm mostly a browser person.

bemasc commented 3 years ago

Lots to think about here, but to your last point:

Do we ban SVCB for schemes by default and expect each scheme to opt-in after analyzing its protocol collision risk?

Section 2.4.3 currently says

Each protocol scheme that uses SVCB MUST define a protocol mapping that explains how SvcParams are applied for connections of that scheme.

so ostensibly we are already in that regime.

ericorth commented 3 years ago

My initial inclination is to say that this is an issue between the client and its libraries and just write it up as security considerations. With or without SVCB, the client as a whole is always still in power over what ports it will allow connections to to be compliant and safe for whatever protocol the client is using.

And for protocols that are only safe over certain defined ports, even if their SVCB scheme definition doesn't ban use of port, presumably any well-written library for that protocol should be responsible for ensuring only the safe ports are used regardless of whether or not SVCB is supported. Even for open-to-most-ports protocols like HTTP, I would argue it would be best for the library to maintain its own bad-port list and enforce such things, but I'm guessing many HTTP libraries don't currently do that.

davidben commented 3 years ago

What's a safe port? The Fetch list is just some standard ports that may collide with HTTP. It doesn't cover, say, a random service running on some random port. For instance, we had a report somewhere about TLS with attacker-controlled session tickets colliding with someone's local development memcached instance. In the browser threat model, this is sadly hopeless anyway (POSTs over cleartext HTTP provide too much attacker control and URLs have had custom ports for a loooong time). In a production environment with a production memcached, it may be more meaningful.

brian-peter-dickson commented 3 years ago

This thread raises a separate but related issue. What is being discussed currently references the client, the server, and the DNS authority server(s). However, there are other relevant parties, at least in some very important environments: the networks involved. Specifically, in any Enterprise environment (typically using RFC1918 space and NAT to the public internet), the existing mechanisms used for monitoring/alerting, filtering, and (regardless of the perceived legitimacy thereof) MITM TLS (decrypt/re-encrypt), generally assume that HTTPS is on port 443, and ONLY on port 443.

The port redirection aspect of HTTPS and SVCB is thorny, in several regards:

Use of port redirection on public DNS endpoints facilitate evasion of this sort of monitoring
If significantly abused, the reaction may be to block these new RRTYPEs at the DNS level blindly (regardless of port number used)
This also impacts the AliasMode use case, since the same RRTYPE is used for AliasMode and ServiceMode
Blocking would thus impact AliasMode, and basically make the whole SVCB/HTTPS development/deployment a wasted effort

How substantial is the likelihood that non-standard ports will be used in a meaningful way outside of private environments? Is there any realistic way to decouple the port change from the rest of the SVCB/HTTPS design, maybe with additional RRTYPEs? For the use cases that require non-standard ports, is the extra round-trip requirement to obtain the extra record (perhaps only by the resolver to populate the cache) a huge obstacle, if an extra RRTYPE is employed for this? Assume the standard port usage would not require the extra lookup.

E.g. instead of the actual port number in the current field, have a flag (1 or 0) for "look up the port number". That would facilitate blocking the other RRTYPE on enterprise firewalls (or DNS firewalls), without affecting standard port usage.

davidben commented 3 years ago

The assumption that HTTPS is only on port 443 has long been broken for those environments and any others where users run web browsers. We'll need spec text to support rerunning the Fetch bad ports check, but otherwise that one should be fairly straightforward.

brian-peter-dickson commented 3 years ago

I think you are rejecting this prematurely, as my concern isn't that non-443 might be used, but rather that non-browser people operating these firewalls (or implementing firewall software) might overreact in a way that is harmful to HTTPS/SVCB generally.

I am not arguing whether the assumption is valid.

However, the question I asked is definitely pertinent. Would you please answer it, if you know, or if anyone asking for the port number to be included in the core spec has articulated this as a requirement? The question is: How likely is it that non-standard ports are going to be used?

And the followup question is: How critical is that to the use case for HTTPS/SVCB, or that the port be obtained on the same RRTYPE as the other element(s)?

tfpauly commented 3 years ago

Practically, there are a number of QUIC HTTP/3 endpoints that run on ports other than 443. Since Alt-Svc lets you do this already, it's certainly possible and will happen.

I'd agree with David that this just needs documentation in the security considerations.

If a network wants to firewall a port, it is far better off blocking that port than trying to do so indirectly by blocking SVCB records.

brian-peter-dickson commented 3 years ago

There are 65535 possible ports. Which ports would a network want to firewall? They would need to block ALL of the ports EXCEPT the ones they allow. I.e. you're suggesting reversing the semantics on firewalling of ports.

The larger question is about implementing security AT SCALE, not whether it is possible to respond on an instance-by-instance mechanism. If it is easier for a network operator to block SVCB/HTTPS records than it is to block ports, I guarantee that at least some of them will do it this way. This would be BAD for the usability of SVCB/HTTPS, which is why I am asking for clarification on the usage of non-standard ports. Could you please point me to the specific documented QUIC HTTP/3 endpoints? Is that a particular operator's choice, or is it part of a standard of some sort? Are they using a specific (small) set of ports, or is it either a large set or an unspecified set? Is that use case mandatory and what depends on that?

Plus, these are DNS records, which are subject to change without notice, which really makes the whack-a-mole nature of attempting to block ports a terrible idea.

My biggest concern is using this as a method to back-door use of DoH if/when networks decide to implement anti-DoH mechanisms, which could lead to the blocking of SVCB records, which is an anti-goal for everyone participating here. Specifically, without regard to the DoH thing, I want there to be no reason for anyone to ever want or need to block SVCB records, so that SVCB/HTTPS can be used for the main thing for which it is the ONLY solution: standardized, interoperable CNAME-at-DNS-zone-apex implementation.

brian-peter-dickson commented 3 years ago

And just to clarify, I'm not proposing blocking alternate ports, nor specifically wanting to make it easy to block alternate ports. However, I do think the alternate ports needs to be functionally "severable", so that those who do want to block alternate ports, such as the Great Firewall of China (GFC), have a way that doesn't require disabling SVCB in its entirety.

For context, GFC have taken to blocking TLS 1.3 and ESNI, which is a terrible outcome for those who liked other aspects of TLS 1.3 (e.g. session resumption via tickets). Having severable parts might have resulted in GFC allowing through the parts it didn't object to, and thus would have kept the good new functionality available in TLS 1.3.

This may be not be easy to reach consensus on, but if the only time an extra DNS query were required was WHEN an alternate port was needed, I don't think the consequences are substantial.

Right now, today, ALPN works via the ALPN HTTP header stuff, and the ALT-SVC stuff works via the HTTP header stuff. Learning the alternate port requires the initial TLS connection. That is a LOT more expensive than an extra DNS lookup, and yet may require an extra DNS look in addition, if the ALT-SVC is on a different hostname.

Right now, it also means that for ALT-SVC to even work, the initial connection MUST be on a standard port, OR reached via a URI that includes the ":port" specifier (which is visible to the browser itself).

I'm not suggesting keeping the ":port" in the URI at all, only the use of an extra RRTYPE at the target name, and a flag in the main SVCB or HTTPS record to trigger the lookup (thus avoiding the extra lookup EXCEPT when an actual alternate port is required).

So, in the current spec, replace "port=NNNN" with "alt-port" (flag), and then querying some particular RRTYPE at the owner name (pick a name, maybe SVCBPORT or HTTPSPORT or something like that, or even just ALTPORT). So, not entirely removing the alternate port, only turning the alternate port into a single extra de-reference lookup (whose RR would be cached at a resolver, so extra round trips on DNS queries would be minimized, and where the extra record could even be put in the Additional section, perhaps as part of the handling by the Resolver.)

bemasc commented 3 years ago

@brian-peter-dickson TLS 1.3 is a finalized, widely deployed standard that empirically continues to work well in every country. ESNI is a draft extension that is still under development. They are indeed entirely "severable".

For entities who (1) can intermediate the DNS, and (2) are interested in observing or restricting which protocols are used on their network segment, SVCB actually makes life much easier, by encoding the protocol into the QNAME of each query. This provides a far more direct and reliable protocol indicator than the observed port number. Also, any "port=..." SvcParam in the response is of course plainly visible to a DNS intermediary. Thus, there is no rational basis for blocking the SVCB QTYPE in pursuit of a port-oriented control scheme.

I'd like to keep this issue focused on @davidben's concern about guidance for cross-protocol port confusion problems.

bemasc commented 3 years ago

I've added text to highlight this concern in #280. Please review.

bemasc commented 3 years ago

This should now be resolved by the change in #279.

MikeBishop / dns-alt-svc

Port redirection introduces new cross-protocol attack vector #279