Private Network Access integration

annevk commented 3 years ago

As part of Private Network Access integration we need to expose a field on the connection that it can use to segment the internet. In a non-proxy setup this would be the IP address that ends up being used (see #1245 for details on making that clearer).
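To make "segment the internet" concrete, here is a minimal sketch of mapping a connection's IP address to an address space. The names `address_space`, "local", "private" and "public" are illustrative assumptions loosely following the Private Network Access draft's terminology. The ranges come from Python's `ipaddress` module and only approximate whatever table the draft actually defines.

```python
import ipaddress


def address_space(ip: str) -> str:
    """Return a coarse address space for an IP address string.

    Hypothetical helper: an approximation for illustration,
    not the draft's normative mapping.
    """
    addr = ipaddress.ip_address(ip)
    if addr.is_loopback:
        # 127.0.0.0/8 and ::1: the user's own machine
        return "local"
    if addr.is_private or addr.is_link_local:
        # e.g. 10.0.0.0/8, 192.168.0.0/16, fc00::/7, 169.254.0.0/16
        return "private"
    return "public"


for ip in ("127.0.0.1", "192.168.1.10", "8.8.8.8"):
    print(ip, address_space(ip))
```

In a direct (non-proxy) connection, the value passed in would be the remote IP address the connection actually ended up using.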

In a proxy setup where the proxy is responsible for resolving domains (and perhaps also when it isn't?) it's less clear what the right solution is. As I understand it from the Private Network Access draft the idea is to go with the IP address of the proxy, meaning that there is effectively no segmentation: https://wicg.github.io/private-network-access/#proxies.

Aside: it says there that it is fine to discover if a user is behind a proxy, but is that really true? Can you discover that reliably today?

cc @letitz @martinthomson @sleevi

sleevi commented 3 years ago

> Aside: it says there that it is fine to discover if a user is behind a proxy, but is that really true? Can you discover that reliably today?

I believe it would be best described as "placeholder for discussion", while wanting to be explicit in calling out the path that Chrome's implementation took (in part, for ease/expediency). Ultimately, similar to the complexities around file:// origins and scoping permissions (to the file vs to the directory vs to the scheme), there are complexity tradeoffs; if we blocked proxies from making such connections at all, that'd almost certainly be more disruptive. Treating them as 'public' implicitly would also cause issues with many existing proxy uses (where the proxy is used to access local resources, as opposed to being bypassed for local resources).

It's certainly come up several times before, particularly with Resource Timing APIs. For example, nextHopProtocol has the potential to disclose the proxy information (since the proxy is, by definition, the next hop). Network Error Logging similarly can give indications of the presence/absence of a proxy, as can using mTLS.

Ideally, we would not provide any such differentiation, but fundamentally, a proxy is a different connection path than direct, so anything keyed off connection properties (e.g. the IP address) necessarily offers a differentiation.

martinthomson commented 3 years ago

This probably just comes down to what is possible. In particular, if the proxy is doing name resolution, the browser remains ignorant of the IP address of the server and so can't make judgments about it.

There are three cases where the proxy is used, but the browser knows the IP address: when the server is identified by an IP address, when the browser does name resolution, or when the proxy is bypassed. In these cases, we can enforce constraints. In other cases, perhaps ignorance is a valid defense.
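A rough sketch of those three cases, to show where the browser has an address to judge and where it does not. `ProxiedRequest`, its fields and `known_server_ip` are hypothetical names made up for this illustration; they do not correspond to anything in Fetch or in any browser.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProxiedRequest:
    host: str                                  # hostname or IP literal from the URL
    proxy_bypassed: bool = False               # browser connects directly
    browser_resolved_ip: Optional[str] = None  # set when the browser did name resolution


def known_server_ip(req: ProxiedRequest) -> Optional[str]:
    """Return the server IP the browser knows, or None if only the proxy knows it."""
    # Case 1: the server is identified by an IP address (strip IPv6 brackets).
    try:
        return str(ipaddress.ip_address(req.host.strip("[]")))
    except ValueError:
        pass
    # Cases 2 and 3: the browser resolved the name itself, either because it
    # always does so or because the proxy is bypassed for this host.
    if req.browser_resolved_ip is not None:
        return req.browser_resolved_ip
    # Otherwise the proxy resolves the name and the browser stays ignorant.
    return None


print(known_server_ip(ProxiedRequest(host="192.168.0.1")))
print(known_server_ip(ProxiedRequest(host="router.example")))
```

Only when this returns an address can a constraint based on the server's address space be enforced; a `None` result is the "ignorance" case.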

If ignorance is not acceptable, then browsers really need to be doing name resolution always, but that's a pretty big change. And that might be brittle in some proxy cases because the browser and proxy can have different views on the network and different reachability.

On the aside: I'm not sure that use of the proxy is always visible to a site in these cases. Ryan mentions nextHopProtocol, but I thought that only applies to HTTP and not HTTPS when it came to proxies. And we don't generally expose the IP address that was used (for good reasons; though I could have missed an attempt to add this). Mutual TLS reveals TLS interception, but not CONNECT proxies that keep their sticky fingers to themselves. I don't know network error logging. There might also be really subtle stuff like timing side channels or MTU that might be used to infer proxy use, though those might not be strong enough to confirm it.

For my own aside: a boolean flag might not be the right answer here. Ryan's point about file:// means that we should consider both lateral movement and movement toward nearer hosts (i.e., loopback).
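As a sketch of what a non-boolean answer could look like, the address spaces can be ordered and the direction of movement between initiator and target compared. The enum, the labels ("inward", "lateral", "outward", "none") and the choice to treat same-space non-public requests as lateral are all assumptions for illustration, not something the draft or Fetch defines.

```python
from enum import IntEnum


class AddressSpace(IntEnum):
    # Higher value means "nearer" to the user's machine (assumed ordering).
    PUBLIC = 0
    PRIVATE = 1
    LOCAL = 2  # loopback


def classify_movement(initiator: AddressSpace, target: AddressSpace) -> str:
    """Describe how a request moves between address spaces."""
    if target > initiator:
        return "inward"    # movement toward nearer hosts, e.g. public -> loopback
    if target < initiator:
        return "outward"   # e.g. an intranet page fetching a public resource
    if target is AddressSpace.PUBLIC:
        return "none"      # ordinary public-to-public request
    return "lateral"       # same non-public space, possibly a different host


print(classify_movement(AddressSpace.PUBLIC, AddressSpace.LOCAL))
print(classify_movement(AddressSpace.PRIVATE, AddressSpace.PRIVATE))
```

A single flag can only flag the "inward" case; keeping the two address spaces around lets policy treat lateral movement separately.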

whatwg / fetch

Private Network Access integration #1247