WICG / private-network-access

https://wicg.github.io/private-network-access/

Timing attacks #41

Open jonathanKingston opened 3 years ago

jonathanKingston commented 3 years ago

Even if #21 is removed, I think the website may be able to deduce when it's served 'locally' through an SSH tunnel, Fiddler, etc.

A public website served locally could:

As part of the non-normative text, it may be worth mentioning that user agents should allow the user to override this protection. (potentially linking to the feature being added to WebDriver to be exposed as user flags / prefs etc)

jschuh commented 3 years ago

Even if #21 is removed, I think the website may be able to deduce when it's served 'locally' through an SSH tunnel, Fiddler, etc.

I may be missing something, but it doesn't seem like the proposal introduces any unique risk regarding locally configured proxies (what you refer to as being "served 'locally'"). That is, an attacker may still be able to leak some network information due to errors/discrepancies in the handling of different protocols/hosts (e.g. inconsistent DNS/HTTP[S] handling, PAC rule differences, timing differentials, etc.).

That's not to dismiss these risks. I'm just not understanding how the risks could be exacerbated by the partitioning of private and public hosts. Rather, it seems clear that the potential for such leakage would be dramatically reduced by this partitioning.

As part of the non-normative text, it may be worth mentioning that user agents should allow the user to override this protection. (potentially linking to the feature being added to WebDriver to be exposed as user flags / prefs etc)

That's a good point. In addition to development and testing needs, managed enterprises will also have need of client policies for selectively controlling how this protection is applied to legacy systems that (at least initially) won't be able to support the necessary CORS logic.

Seems like it's worth providing some non-normative text highlighting UA freedom to implement alternative configuration mechanisms for exactly these kinds of scenarios.

jonathanKingston commented 3 years ago

I may be missing something, but it doesn't seem like the proposal introduces any unique risk regarding locally configured proxies (what you refer to as being "served 'locally'"). That is, an attacker may still be able to leak some network information due to errors/discrepancies in the handling of different protocols/hosts (e.g. inconsistent DNS/HTTP[S] handling, PAC rule differences, timing differentials, etc.).

That's not to dismiss these risks. I'm just not understanding how the risks could be exacerbated by the partitioning of private and public hosts. Rather, it seems clear that the potential for such leakage would be dramatically reduced by this partitioning.

Right, for most users this is likely to be a drastic improvement in privacy and security.

I was looking at specifically: What information from the underlying platform, e.g. configuration data, is exposed by this specification to an origin?

Websites may already be able to deduce that they are served locally; however, once user agents implement this spec:

  1. A site that is proxied locally may still be able to access local resources (127.0.0.1 etc) as the page is still considered "local".
  2. The method of overriding if a local domain is considered local will likely be user agent specific.

jschuh commented 3 years ago

A site that is proxied locally may still be able to access local resources (127.0.0.1 etc) as the page is still considered "local".

Ah, I think I see. This is about running a proxy on localhost where the browser sees the proxy as the terminating host endpoint, rather than the remote destination host. I was confused given that the far more common end-user scenario involves a transparent forward proxy, where the browser sees the remote destination host as the terminating endpoint (e.g. typical client "security" software, a standard Fiddler setup, SOCKSv5 over an SSH tunnel), in which case this shouldn't expose any new information.

Yeah, even without document.addressSpace, if the proxy is the terminating endpoint I assume that the remote server will be able to probe in one way or another to eventually detect what kind of network it's being proxied on (i.e. public, private, or local). That stated, even without the partitioning, the same basic effect should be achievable. It's just that the partitioning would make the probing more efficient because the partitions reduce the space an attacker needs to probe (i.e. fewer combinations of host/port/resource to brute force). In practice, though, I expect that the brute-force space is reduced far more by predictable device and software signatures.

I am curious as to how significant of a concern this sort of locally terminating proxy situation is in practice. I assume it's typical enough for developers SSHing into a remote server, mainly because it's simpler than configuring a SOCKSv5 proxy. However, that's a fairly niche use in the grand scheme, and typically dealing with a trusted remote host anyway. Whereas the security and privacy implications of injecting an untrusted host into a local/private origin seem concerning independent of the partitioning.

The method of overriding if a local domain is considered local will likely be user agent specific

Yeah, doesn't seem like there's any way around this since the only safe way to expose it is through things like flags and enterprise policy.

jonathanKingston commented 3 years ago

Yeah I'm not sure on the prevalence of this setup, but I think it would do more than expose that the website is being served in this way. It potentially could remove the partitioning benefit.

Yeah, doesn't seem like there's any way around this since the only safe way to expose it is through things like flags and enterprise policy.

Yeah, shame there's no file://.well-known/ or similar for this purpose.

jschuh commented 3 years ago

Yeah I'm not sure on the prevalence of this setup, but I think it would do more than expose that the website is being served in this way. It potentially could remove the partitioning benefit.

Maybe I'm misunderstanding, but I can't imagine any side effect that could remove the partitioning benefit in the general case. To put it in context, the blocking that this mitigation introduces will provide a massive reduction in attack surface for a particularly vulnerable class of software and devices, ones that tend to have privileged access on internal hosts and networks. So, it's hard to overstate the safety benefit of that change.

Of course, any differential blocking mechanism will inevitably introduce the potential for side-channel information leaks, and those need to be taken seriously. Accepting that, the leaks we can identify seem extremely narrow, of limited value, and likely to exist prior to the partitioning. Yes, it's certainly possible that worse side-channel leaks remain as yet undiscovered, but even then you're almost certainly bounded by the state of information leakage we have today, meaning this proposal would still net a massive reduction in attack surface.

Yeah, shame there's no file://.well-known/ or similar for this purpose.

Unfortunately, I think the last decade or so of social engineering attacks have shown such simple mechanisms to be popular targets for abuse. That's why browser makers have increasingly gravitated towards native UX that requires clear end-user consent, or policy mechanisms that demonstrate system administrator control. Obviously that doesn't preclude more standardization in those areas, but the exact details will always have to carefully balance a number of concerns.

letitz commented 3 years ago

Thanks for the feedback!

I agree that even without window.addressSpace, websites might be able to tell whether or not they are being proxied locally. Certainly, if they can successfully load <img src="http://192.168.1.1/famous-router-brand-logo.png">, then they can tell whether or not they are classified as being in the local or private address spaces.
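The probe described above could be sketched as follows (attacker-style illustration, not spec text; the image-factory parameter is only there so the control flow can be exercised outside a browser, and the URL is the example from this thread):

```javascript
// Attempt to load a well-known resource from a private address and
// observe whether it succeeds, fails, or times out.
function probe(url, { makeImage = () => new Image(), timeoutMs = 2000 } = {}) {
  return new Promise((resolve) => {
    const img = makeImage();
    const timer = setTimeout(() => resolve("timeout"), timeoutMs);
    img.onload = () => { clearTimeout(timer); resolve("loaded"); };
    img.onerror = () => { clearTimeout(timer); resolve("error"); };
    img.src = url;
  });
}

// In a browser, "loaded" would mean the document can reach the private
// network, i.e. it is not being treated as public:
// probe("http://192.168.1.1/famous-router-brand-logo.png").then(console.log);
```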

I'm not sure timing attacks will be helped much by this change, though - a website could always measure how long it takes to load one of its own images and infer whether or not it is likely sitting in the user's private network?

Edit: the above would not work with proxies, since timing to the website would not be affected.

I guess there would be timing differences to observe indeed: in the case of failure, instead of needing to set up a TCP connection + send and receive HTTP requests and responses, the UA would only set up the TCP connection. One private network RTT should be observable. Maybe introducing some timing jitter on the order of tens of milliseconds would be a good enough mitigation?

jschuh commented 3 years ago

Jitter typically doesn't work when you can take multiple samples (once you have enough samples you can de-noise the jitter). Clamping to minimums can work, but that would introduce significant delays for all local connections. That stated, the attack scenario itself just doesn't seem to warrant additional mitigations.
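The de-noising point can be illustrated with a quick simulation (all numbers invented: a 5 ms true timing difference hidden under ±30 ms of uniform jitter):

```javascript
// Two timing distributions differing by a small true delta, each
// obscured by jitter several times larger than the signal.
function sample(baseMs, jitterMs) {
  return baseMs + (Math.random() * 2 - 1) * jitterMs;
}

function mean(xs) {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

const N = 100000; // an attacker can take many samples
const blocked = []; // e.g. failure before any private-network RTT
const refused = []; // e.g. failure after one private-network RTT
for (let i = 0; i < N; i++) {
  blocked.push(sample(10, 30)); // 10 ms base, +/-30 ms jitter
  refused.push(sample(15, 30)); // 15 ms base, same jitter
}

// Averaging cancels the jitter; the 5 ms delta re-emerges.
const delta = mean(refused) - mean(blocked);
console.log(delta.toFixed(1)); // close to 5.0
```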

Consider the effective threat here in the context of its preconditions:

I'm just struggling to see how the impact and complexity of additional mitigations would be warranted given this combination of rare and inherently risky behavior—proxying a potentially malicious external site as a locally terminated endpoint—combined with how small the differential of leaked information is.

Perhaps there's something I'm missing, but it seems like the most appropriate action here is to just make sure this case is clearly noted as a potential risk.

letitz commented 3 years ago

Jitter typically doesn't work when you can take multiple samples (once you have enough samples you can de-noise the jitter).

Right, I gave it some thought overnight and came to the same conclusion.

As for the threat scenario: I don't believe the proxy needs to be potentially malicious, and I wonder how prevalent intranet proxies are - not just localhost proxies. It seems that gathering some data on this would help determine the way forward.

Agreed that in the meantime noting this risk in the spec is a good idea. I'll draft a patch.

letitz commented 3 years ago

Started drafting a section, which made me realize that I'm not sure the timing attack would work.

In the status quo, requests to the private network can have 4 different results:

  1. immediate failure: no route to host (e.g. invalid subnet)
  2. quick failure: connection refused
  3. success (a bit slower than #2 due to the request/response round trips)
  4. slow failure: connection timed out

In the world envisioned in this specification, #2 would be augmented with:

  2a. quick failure: blocked by UA
  2b. quick-ish failure: blocked by preflight
  2c. slow failure: preflight timed out

It seems hard for a website to determine whether it is observing #2a, #2b and #2c results vs #2, #3 and #4 results based on timing and success/failure alone. I guess #2b might be observable as a different cluster of failure timings? I would expect one needs a handful of such timings to start distinguishing those from noisy outliers.
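The clustering intuition can be sketched with invented latencies: a single #2b failure hides inside the noise of ordinary #2 failures, but the mean of a modest number of samples separates the clusters.

```javascript
// Invented model: #2 (connection refused) costs one local RTT, while
// #2b (blocked by preflight) costs one extra round trip. Uniform noise
// stands in for scheduling and network variance.
function failureTime(extraRoundTrips) {
  const rtt = 5; // ms, illustrative
  const noise = Math.random() * 20; // ms, illustrative
  return rtt * (1 + extraRoundTrips) + noise;
}

function meanOf(n, fn) {
  let sum = 0;
  for (let i = 0; i < n; i++) sum += fn();
  return sum / n;
}

// One observation overlaps heavily across classes, but with ~100
// samples the extra preflight RTT shows up as a distinct cluster mean.
const refusedMean = meanOf(100, () => failureTime(0)); // #2
const preflightMean = meanOf(100, () => failureTime(1)); // #2b
console.log(preflightMean > refusedMean);
```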

Am I missing something?

jschuh commented 3 years ago

I think you meant 3a-3c, rather than 2, because the change here is that connections which would previously have succeeded now get split into the a-c cases.

Accepting that difference, this framing seems correct, and gets to my earlier point that "even without the partitioning, the same basic effect should be achievable." Because an attacker is still going to probe for common services on known local/private address spaces. It's just that the partitioning could make the brute forcing more efficient by introducing more discriminating error patterns, faster failure cases, or some other differentiating factors.

Circling back to the proxy discussion, I think there's some confusion here due to inconsistent terminology, and I worry some details are getting lost. To clarify, for the purposes of this proposal there are broadly three types of HTTP(S) proxies:

  1. Fully transparent - configured as a network gateway; invisible to the browser
  2. Explicitly configured - configured via PAC, policy, or local settings; browser is fully aware of proxy
  3. Port forwarding - proxy is the locally terminating host and data is forwarded to remote; invisible to the browser

We don't care about 1 because nothing changes. Case 3 is the SSH tunnel case, and is niche enough that I would argue we also don't care. That leaves by far the most common case of 2, where there is an explicitly configured proxy.

Explicitly Configured Proxy

I dug into this a bit more yesterday, and frankly it's more complicated than I remembered. Sometimes the browser performs name resolution and knows the real address of the destination host, but more often it does not. If the remote address is not known, we could certainly just use the proxy address, but that would largely invalidate the protection in these cases (perhaps this is what @jonathanKingston was getting at above).

I think the only real solution here is to provide explicit guidance to HTTP proxy implementers on how to tag proxied hosts with the proper address partition via a header containing the proposed treat-as-public-address CSP directive. In the absence of a tag or otherwise resolving the address of the destination host, the UA would need to fall back to the address of the proxy. (And as with most things, UAs will likely want to provide enterprise policies or other configuration options to fill in gaps.)

Of course, relying on proper tagging from the proxy means that users behind these devices will not receive the benefit of this protection until proxy implementers update their behavior. However, the overwhelming majority of these situations are almost certainly going to be enterprises and client security software, where there are at least some incentives to adoption. And if adoption does turn out to be a problem over time, UAs still have an opportunity to intervene against dangerous patterns by virtue of knowing that a proxy is in use and potentially alerting the user to the problem.

letitz commented 3 years ago

I think you meant 3a-3c, rather than 2, because the change here is that connections which would previously have succeeded now get split into the a-c cases.

Indeed, that makes more sense.

Accepting that difference, this framing seems correct, and gets to my earlier point that "even without the partitioning, the same basic effect should be achievable." Because an attacker is still going to probe for common services on known local/private address spaces. It's just that the partitioning could make the brute forcing more efficient by introducing more discriminating error patterns, faster failure cases, or some other differentiating factors.

I'm still not entirely convinced the spec would make brute forcing more efficient, but I'll concede that it smells like it just needs a bit more cleverness to exploit. In any case, it will not improve on the current situation, nor will it introduce novel capabilities.

I do believe a website might be able to determine its own address space by observing behavior differences in fetch responses for itself vs a known-public document. This would be easily achieved with a public child frame, if sandboxed iframes ever provide a way to sandbox the address space as discussed in #26. I'll mention that in that issue, since we should be careful there. This could unfortunately be used to gate attacks such that they remain invisible to non-vulnerable users.

Explicitly Configured Proxy

I dug into this a bit more yesterday, and frankly it's more complicated than I remembered. Sometimes the browser performs name resolution and knows the real address of the destination host, but more often it does not. If the remote address is not known, we could certainly just use the proxy address, but that would largely invalidate the protection in these cases (perhaps this is what @jonathanKingston was getting at above).

I believe so, yes. I have also been wondering how to treat such proxies, and my current best thought is to use the proxy IP address. It does open a hole in the protection for users of e.g. a local SSH-tunnel SOCKS proxy.

I think the only real solution here is to provide explicit guidance to HTTP proxy implementers on how to tag proxied hosts with the proper address partition via a header containing the proposed treat-as-public-address CSP directive. In the absence of a tag or otherwise resolving the address of the destination host, the UA would need to fall back to the address of the proxy. (And as with most things, UAs will likely want to provide enterprise policies or other configuration options to fill in gaps.)

This is a very interesting idea, thanks for raising it! It would neatly solve the problem. I think that would require:

  1. Adding a treat-as-private-address directive.
  2. Adding some logic for combining multiple directives + the socket's address: we should use the most restrictive (i.e. most public) address space.
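Sketched concretely (function and directive names here are illustrative, not spec text), the combination rule "use the most restrictive, i.e. most public, address space" could look like:

```javascript
// Address spaces ordered from least to most public. "Most restrictive"
// means "most public", since public documents have the fewest
// private-network privileges.
const PUBLICNESS = { local: 0, private: 1, public: 2 };

// Combine the socket's observed address space with any address-space
// directives (e.g. treat-as-public-address) by taking the most public.
function combineAddressSpaces(socketSpace, directiveSpaces) {
  return [socketSpace, ...directiveSpaces].reduce((a, b) =>
    PUBLICNESS[a] >= PUBLICNESS[b] ? a : b
  );
}

// A locally terminating proxy whose upstream tags the response as public:
console.log(combineAddressSpaces("local", ["public"])); // "public"
// No directives: fall back to the socket's address space.
console.log(combineAddressSpaces("private", [])); // "private"
```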

Of course, relying on proper tagging from the proxy means that users behind these devices will not receive the benefit of this protection until proxy implementers update their behavior. However, the overwhelming majority of these situations are almost certainly going to be enterprises and client security software, where there are at least some incentives to adoption. And if adoption does turn out to be a problem over time, UAs still have an opportunity to intervene against dangerous patterns by virtue of knowing that a proxy is in use and potentially alerting the user to the problem.

Good points.

letitz commented 2 years ago

Just to keep the state of this issue somewhat up to date: note that issues with the CSP-based solution have been identified in https://github.com/WICG/private-network-access/issues/62#issuecomment-947820931.