rcombs opened this issue 4 years ago
Thanks for the feedback! (cc @letitz)
The current proposal really doesn't want to support unauthenticated connections to intranets from the internet. That's pretty explicitly something that the secure context requirement aims to break by making TLS a requirement for any server that wishes to expose itself outside of its network via the magic of mixed content.
The challenge you're faced with is indeed frustrating. DNS servers that block resolution to local IP addresses are news to me, and do complicate things for Plex. I'm not sure that the answer is giving you control over DNS resolution, however. There are a number of challenges there, both from a technical and policy perspective (consider an administrator using DNS to push folks to "safe mode" searches, a la safe.duckduckgo.com, and so on). The application layer doesn't seem like the right place to do resolution.
As I mentioned on the bug, DNS resolution doesn't actually seem like what you want. If you know the hostname, you know the IP address, and you really just want an authenticated connection to that address... Both RTC and my least-favourite part of Web Transport seem relevant here. It seems worthwhile to consider how we can make those communication mechanisms compatible with this proposal.
+@yutakahirano and @vasilvv might have ideas around doing that for Web Transport. +@henbos might know the right folks to talk to for WebRTC.
It would be nice to have a well-lit path towards making HTTPS requests to localhost and private IP addresses that does not involve faking DNS results and/or self-signed certificates.
I wonder if regular fetches could be annotated with server certificate fingerprints like what Web Transport proposes (this is sure to please @mikewest :wink:)? This might be very similar to telling the browser "connect to that IP, treat it as domain name foo".
Speaking of self-signed certificates, could the Plex server fall back to using one of those? I guess if you're not actually navigating to the Plex server, and only making subresource requests, that would not work.
Re: @letitz: We've never considered self-signed (or non-publicly-trusted-CA-signed) certs a serious option, either for subresource loads (where they don't work at all) or for loading the app (where they [correctly] present a massive privacy warning). Additionally, I assume you mean for connecting directly to an IP address? Even if we were able to load resources in that way, the TLS would be providing no authentication, which would defeat half the point of it.
Server fingerprint annotation would probably work, except during the window just after a cert has been renewed and the client hasn't yet retrieved a fingerprint for the new one (assuming we're talking about certificate fingerprints and not public-key fingerprints).
Solutions that apply only to Fetch would work for most of our use-cases, but still leave out media element loads (which are a major component for us) unless the same extension was provided for those as well.
Re: @mikewest, looks like you've seen my comment here re: RTC/WebTransport; I'll move over to this GH thread for any further discussion.
You asked over on the Chromium tracker for a design doc; we don't have any publicly-available summary of our whole setup (not because any of it's secret, but because we just haven't really had a need to write it all up before). I'll go ahead and explain the whole case here:
This deployment has served as an example for a few others like it. Western Digital now uses a similar approach, as does UnRAID (though I'm not sure if either connects directly to LAN addresses), to name a couple. Now that free Let's Encrypt certs are available (including wildcards), I think of this as a viable model for any application that has a central service that can exchange IDs (in the form of DNS names) and wants to allow secure communication between clients and servers on a LAN, with the exception of browser cases on networks with these hostile DNS responders, and loss-of-WAN scenarios (which we handle in other apps, but not in browsers).
The intersection of "no public address we can map ports on (or at least, no working hairpin NAT on the router)" and "router blocks DNS responses pointing at LAN" is surprisingly substantial and has been by far the largest headache in this deployment for years (though even if that case didn't exist, we'd have to implement most of the same workarounds in order to support loss-of-WAN cases).
Thanks for the detail, @rcombs! I understand that the cert itself isn't really the issue, but the potentially hostile posture of the user's DNS provider. As I noted above, however, what's "hostile" from your perspective is potentially "reasonably defensive" from other perspectives. Administrators use DNS to manage their network, and it's not at all clear to me that allowing web origins the ability to bypass DNS resolution is a great idea.
Regarding the flow you laid out above: one thing I'm not clear on is the relationship between the client app and the server. The server lives inside someone's local network: where does the web-based client app live? I'm asking because there's currently (see WICG/cors-rfc1918#1, however) no restriction on private->private communication, nor public->public. Do users connect to a centralized cloud service in order to view media from their local network? Or do they host a local version of the client app on their local server?
Solutions that apply only to Fetch would work for most of our use-cases, but still leave out media element loads (which are a major component for us) unless the same extension was provided for those as well.
I'm imagining here that it could be possible to take the stream resulting from something like Web Transport's datagram API, and feed it into the `<video>` element's `srcObject` for rendering. That would require someone to do some work to convert the `ReadableStream` into some kind of `MediaStream` (and I don't know those APIs nearly well enough to know if that's possible from userland; I'll defer to someone like @henbos).
As I noted above, however, what's "hostile" from your perspective is potentially "reasonably defensive" from other perspectives.
Is this an issue if you're only allowed to override-resolve (or whatever you want to call it) something to a LAN address? I've only ever seen those blocked to protect against DNS rebinding, which this spec solves more cleanly. Theoretically these same cases can be bypassed by WebRTC or WebTransport (it's "just" massively more complicated), so it's hard to see how preventing regular HTTPS from working in those cases serves a network-security purpose.
Where does the web-based client app live?
It can be in a few places, depending on the user setup. The app users open in browsers is primarily loaded from plex.tv, which is convenient and secure. Some users also load the app from the media server itself (it hosts a copy), sometimes expecting to be able to communicate with other media servers on the same LAN from there. It looks like the latter case would continue working in the affected cases (but only in plaintext, which is undesirable).
The other case is the TV app, which can be loaded from plex.tv, but is usually shipped as an installed app bundle on smart TVs and set-top boxes; I think that means the client address space is "local" and the context is "secure", so this spec probably doesn't affect it?
There's currently (see #1, however) no restriction on private->private communication, nor public->public.
Ah, the distinction between "private" and "local" is easy to trip over here. So that case would continue to work, but be insecure despite everything involved being technically capable of communicating securely, if only the client knew how to talk to the server.
So, there are some cases that would continue to work with these changes, but some that would break (most notably falling back on loading the browser app from plex.tv insecurely, which is an awful awful hack that I wish we didn't have to do), and several cases that are undesirable that would be great to have solutions for.
I'm imagining here that it could be possible to take the stream resulting from something like Web Transport's datagram API, and feed it into the `<video>` element's `srcObject` for rendering.
Keep in mind that media files aren't simply streamed continuously; it's very common for the player to have to seek to read different parts of the file during normal playback (particularly during startup), and of course whenever the user seeks within the media. That'd be quite a bit of additional functionality on top of Web Transport (the bulk of an HTTP/3 stack, really).
Is this an issue if you're only allowed to override-resolve (or whatever you want to call it) something to a LAN address? I've only ever seen those blocked to protect against DNS rebinding, which this spec solves more cleanly.
The examples I provided are certainly external in nature (school admins protecting their users from inappropriate results, etc). I don't know enough about the enterprise case to understand whether similar DNS-based partitioning for internal services is a thing. I suspect it might be (and that @sleevi and @ericorth will know more about this world than I do).
Theoretically these same cases can be bypassed by WebRTC or WebTransport (it's "just" massively more complicated), so it's hard to see how preventing regular HTTPS from working in those cases serves a network-security purpose.
Bypassed iff both the client and server cooperate to establish a communication channel outside of DNS, yes. Presumably the services the administrator is pushing you away from wouldn't be incredibly enthusiastic about collaborating in unintended cases?
Also, this is kinda why I don't like either RTC or WebTransport's layer-piercing attributes. There's quite a reasonable argument to be made that we shouldn't ship these capabilities in those APIs either. :) I mentioned that in https://groups.google.com/a/chromium.org/forum/#!msg/blink-dev/mHV_ZALf07Q/d7J9W0a1CQAJ.
Where does the web-based client app live?
It can be in a few places, depending on the user setup. The app users open in browsers is primarily loaded from plex.tv, which is convenient and secure.
Got it, thanks. This would indeed be affected by the proposal we're discussing here, insofar as http://plex.tv/ would no longer be able to reach into local networks.
Some users also load the app from the media server itself (it hosts a copy), sometimes expecting to be able to communicate with other media servers on the same LAN from there. It looks like the latter case would continue working in the affected cases (but only in plaintext, which is undesirable).
The proposal we're discussing here would not block this use case, either in secure or non-secure modes. You'd be in the same boat as the status quo.
The other case is the TV app, which can be loaded from plex.tv, but is usually shipped as an installed app bundle on smart TVs and set-top boxes; I think that means the client address space is "local" and the context is "secure", so this spec probably doesn't affect it?
Assuming that the set-top box is loading the app from itself (e.g. http://localhost/plex/app/ or similar), then it would be considered "local", and would be able to request "private" and "public" resources. http://localhost/ is also considered a "secure context".
So, there are some cases that would continue to work with these changes, but some that would break (most notably falling back on loading the browser app from plex.tv insecurely, which is an awful awful hack that I wish we didn't have to do), and several cases that are undesirable that would be great to have solutions for.
I agree. I'd like to find reasonable solutions here that maintain the guarantees we want to provide to users, and reasonably isolate their local networks from the web.
Some users also load the app from the media server itself (it hosts a copy), sometimes expecting to be able to communicate with other media servers on the same LAN from there. It looks like the latter case would continue working in the affected cases (but only in plaintext, which is undesirable).
The proposal we're discussing here would not block this use case, either in secure or non-secure modes. You'd be in the same boat as the status quo.
Indeed, if the webapp is served from the media server on a private IP address, regardless of whether it is served securely or not, it can make requests to other servers with private IP addresses. Do note that if served insecurely, the webapp could not make requests to localhost - but it could make requests to the private IP corresponding to the UA host.
Suggestion: given the above, maybe the right move when PMS (or https://plex.tv, not sure which component is responsible for this) detects that it is in the unfortunate intersection set is for https://plex.tv to redirect the browser not to http://plex.tv but to http://<media server's host:port>?
As Mike points out, we are considering extending the spec to forbid all cross-origin requests initiated by insecure contexts to private/local IP addresses: see https://github.com/WICG/cors-rfc1918/pull/1. If we did this, then the above suggestion would not work.
This makes me think that maybe we should waive the secure context requirement for http://<literal private/local IP address>:port origins? For an on-path attacker to impersonate such an origin, they would need to have breached the local network and used something like ARP cache poisoning, a substantially taller order than intercepting a request to http://example.org. The target would still have to respond OK to pre-flight requests.
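For reference, a rough sketch of the address-space check such a waiver would hinge on, restricted to literal IPv4 addresses (the spec's full algorithm also covers IPv6 and link-local ranges, which are omitted here for brevity):

```javascript
// Sketch: classify an origin's host as "local" (loopback), "private"
// (RFC 1918), or "public", for literal IPv4 addresses only. This is a
// simplified stand-in for the proposal's address-space algorithm.
function ipv4AddressSpace(host) {
  const m = host.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (!m) return null; // not a literal IPv4 address
  const [a, b] = [Number(m[1]), Number(m[2])];
  if (a === 127) return "local";                         // 127.0.0.0/8
  if (a === 10) return "private";                        // 10.0.0.0/8
  if (a === 172 && b >= 16 && b <= 31) return "private"; // 172.16.0.0/12
  if (a === 192 && b === 168) return "private";          // 192.168.0.0/16
  return "public";
}
```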
As for generically solving the problem with authenticated connections to private network endpoints:
I just read through some of RFC 6762. I don't see how it would help the problem at hand, since IIUC CAs will not hand out certificates for .local domains? Seems to me the same fate would befall .home domains. Also did the .home TLD proposal progress any further than a draft?
This makes me think that maybe we should waive the secure context requirement for http://<literal private/local IP address>:port origins?
No, I don't think this is viable, and the least favorable of all options. The complexity calculation is inverted (it's easier, not harder), it's worse for users ("secure means secure, except when it doesn't"), and it further promulgates the notion of a security boundary that largely doesn't hold; that is, the local network is less secure, which is part of why this exists in the first place. This is the same reason extending fetch() to allow overriding DNS or TLS validation is also a non-starter.
To the problem at hand, as a Plex user and lover, I definitely want to make sure to understand the scenario. Having local devices block public DNS resolving to local names is, unfortunately, nothing new here, so I totally understand and appreciate the problem space. I also appreciate that you're not really a fan of solutions other vendors have used, such as using WebRTC (or the to-be-explored WebTransport), since like Mike, I'm also concerned about the security implications of that.
What's not obvious to me is whether the remaining problem statement, namely a server "that cannot use a public DNS name that points to a local IP (due to asinine DNS rebinding implementations)", is that much different than what the Network Discovery API tried to solve. Which is to say, it's not an easy problem, but also one with a lot of prior art in exploring the implications and tradeoffs involved.
reasonably isolate their local networks from the web
To be clear, I'm happy to have any solution require a very explicit opt-in from the server (I think exposing a CA-signed cert does a lot for this, but signaling via some sort of HTTP header or TLS extension or what-have-you would also be fine by me).
RFC 6762
Yeah, if we could get trusted certs for mDNS addresses that'd be excellent (at least in most cases; now and then we run into cases where multicast doesn't properly traverse a LAN…), but as far as I'm aware there aren't any plans for such a thing.
maybe the right move when PMS (or https://plex.tv, not sure which component is responsible for this) detects that it is in the unfortunate intersection set is for https://plex.tv to redirect the browser not to http://plex.tv but to http://<media server's host:port>?
I suppose that might be the only option available to us if nothing else is done. Authentication when the origin is http:// is pretty fraught with peril and we've been trying to move away from encouraging it, though.
This is the same reason extending fetch() to allow overriding DNS or TLS validation is also a non-starter.
The arguments against overriding addresses we've discussed have all been around network admin control, though; not user security. I don't think we've established any clear reason why overriding for LAN addresses in particular is unacceptable.
That cannot use a public DNS name that points to a local IP (due to asinine DNS rebinding implementations)
Well, asinine rebind-protection implementations are one case, but another is the no-WAN case, which can come up with a public-address origin in the case of a cached PWA, or with private-address cases regardless.
the Network Discovery API
We do have a multicast-based LAN discovery protocol, though it's been used less and less for years since cloud-based auth became available. Did that API ever allow for secure, authenticated communication?
I'd also like to emphasize that even for cases that currently are allowed, and would continue to be allowed with this spec, we currently have to use plaintext connections on LAN for no particularly good reason. I know a lot of people who don't care about this, and assume that LANs can be thought of as secure (and it's not easy to convince people that's not true); many argue that even requiring users to explicitly opt-in to HTTP fallback on LAN is an excessive UX burden. I've tried to argue for years that using TLS shouldn't be considered an excessive requirement, but as long as these cases exist, LAN TLS in browsers is always going to be flakier than plaintext, and we'll continue to have users and support staff complain about TLS requirements. This is where a lot of my frustration with the status quo comes from.
@sleevi:
This makes me think that maybe we should waive the secure context requirement for http://<literal private/local IP address>:port origins?
No, I don't think this is viable, and the least favorable of all options. The complexity calculation is inverted (it's easier, not harder), it's worse for users ("secure means secure, except when it doesn't"), and it further promulgates the notion of a security boundary that largely doesn't hold; that is, the local network is less secure, which is part of why this exists in the first place. This is the same reason extending fetch() to allow overriding DNS or TLS validation is also a non-starter.
Could you explain why it is easier for an attacker to impersonate a local and/or private IP address, and more generally why the local network is less secure? Not disagreeing, just curious.
As for "secure means secure, except when it doesn't": this waiver would not result in the Chrome omnibox displaying a "secure" badge, it would only allow those websites to make requests to other private IPs upon a successful pre-flight request. I do not think users would notice, so I don't think it would confuse them.
@rcombs:
maybe the right move when PMS (or https://plex.tv, not sure which component is responsible for this) detects that it is in the unfortunate intersection set is for https://plex.tv to redirect the browser not to http://plex.tv but to http://<media server's host:port>?
I suppose that might be the only option available to us if nothing else is done. Authentication when the origin is http:// is pretty fraught with peril and we've been trying to move away from encouraging it, though.
Glad to hear that there is a workaround to your problem! That being said, I agree that being forced to use naked HTTP in 2020 is sad. In my eyes it seems that authenticating in cleartext to a box on the local network is less dangerous than sending those credentials in cleartext halfway across the world, but that may stem from my above misunderstanding of the security properties of the local network.
Cleartext on LAN is almost certainly safer than cleartext over the internet (any attacker that can intercept your LAN traffic can also almost always intercept your WAN traffic), but it's still an added risk that we shouldn't have to take today.
Ok, so I've taken a deeper look at WebTransport, and I believe the idea @mikewest floated in https://github.com/WICG/cors-rfc1918/issues/23#issuecomment-689331056 is possible in Chrome today.
One can wire a WebTransport stream to an HTMLMediaElement using MediaSource. There is even sample code :partying_face:
It would require some work to support seeking. It seems to me that a fairly simple protocol and server could handle that. Especially so since when loading media from the local network, throughput and latency should allow for a naive implementation to perform well.
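A minimal sketch of what such a protocol's request framing could look like. The fixed 16-byte offset/length frame below is invented here for illustration; it roughly mirrors what an HTTP Range request expresses, and is not part of WebTransport itself:

```javascript
// Sketch of a "fairly simple protocol" for seekable media reads over a
// WebTransport bidirectional stream: the client sends a fixed 16-byte
// frame naming a byte offset and length; the server replies with that
// slice of the file. Framing layout is an assumption for this example.
function encodeReadRequest(offset, length) {
  const buf = new ArrayBuffer(16);
  const view = new DataView(buf);
  view.setBigUint64(0, BigInt(offset)); // bytes 0-7: file offset
  view.setBigUint64(8, BigInt(length)); // bytes 8-15: read length
  return new Uint8Array(buf);
}

function decodeReadRequest(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return {
    offset: Number(view.getBigUint64(0)),
    length: Number(view.getBigUint64(8)),
  };
}
```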
So I'd need a QUIC server (a task unto itself, though one I was already intending to do eventually, but not immediately under time pressure), with a custom protocol on top of that? Or, I guess I could implement HTTP/3 on top of QUIC within JavaScript, and shim that into a wrapper around Fetch? And then I'd need to have a certificate valid for no more than 2 weeks (vs my current certs, which are CA-signed and valid for 3 to 12 months), meaning I'd have to locally generate self-signed certs and build out infrastructure to store and distribute fingerprints to clients. Plus, I'd need to rotate them weekly, generate each new cert well in advance of the current one's expiration with the notbefore/notafter timestamps front-dated with some overlap (1 week?), and make sure clients always accept fingerprints for at least 2 certs at any given time. And then if we wanted to support cases where users potentially have no internet connection for an extended period of time (we call this the "nuclear submarine case" after the real support request we've had about it), we'd have to either generate a long period's worth of certs well in advance, or give clients the ability to fetch the next valid fingerprint from the server over a connection using a current one… and advise users to make sure that every client connects at least once a week.
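To make the rotation arithmetic concrete, a sketch using the numbers from this comment (14-day maximum validity, roughly 1-week overlap); the figures come from the discussion, not from any spec:

```javascript
// Sketch of the rotation schedule described above: each cert is valid
// for 14 days, and its successor is issued (front-dated) 7 days before
// the current one expires, so two fingerprints are always valid at
// once. A client must refresh its fingerprint list at least once per
// overlap window (7 days here) to avoid going stale.
const DAY = 24 * 60 * 60 * 1000;
const VALIDITY = 14 * DAY; // assumed cap on cert lifetime
const OVERLAP = 7 * DAY;   // assumed front-dating window

function rotationSchedule(startMs, count) {
  const certs = [];
  for (let i = 0; i < count; i++) {
    const notBefore = startMs + i * (VALIDITY - OVERLAP);
    certs.push({ notBefore, notAfter: notBefore + VALIDITY });
  }
  return certs;
}
```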
Like, there are parts of this that I'd be happy to do, but in practice I don't think this entire setup is realistic to ask all the relevant teams to build out, and it certainly isn't going to be ready in a short period of time. What it would end up coming down to is just falling back to plaintext in all of these cases, which means giving users a button that ultimately says "make this problem go away", which they're going to click whether they're really in a case that needs it or not, and whether it's really safe or not. As long as I have to have that button available, there are going to be users who click it on a coffee shop wifi network and have their auth token immediately stolen over plaintext HTTP, and preventing these attacks was supposed to be the whole point of these requirements to begin with.
I understand your position - it is certainly no small amount of work to work around the restriction in this way.
What about the aforementioned workaround of redirecting your users to http://<media server's IP> instead of http://plex.tv? Insecure local websites are not subject to any more restrictions under CORS-RFC1918 than they are now, and a locally-served page can embed content fetched from secure public websites.
Hi @rcombs, any thoughts on my previous comment?
That'll probably have to be our solution if we don't come up with anything else, it's just also a really bad route to have to take, since it means giving people options (options that they really do need to take in supported cases, so they can't be behind too many layers of "here be dragons" warnings!) that send auth tokens over plaintext, even if "only" over LAN. It's not difficult to imagine situations where an attacker on a public network could arrange for the web app to believe it's in this situation and offer a fallback (or even situations where this happens completely legitimately while on a public network), which the user would, as we're all aware, most likely click through. So it's a "solution" in much the same way that early browsers allowing easy click-through on self-signed TLS certs was: sure, the user's immediate issue is solved in the legitimate case, but it also defeats the security measure to a substantial extent.
I see. Compared to the status quo I view the change as security-positive. Whereas before the user would send authentication material in plaintext over the public internet, now that material is sent over the local network. This seems like a strict reduction in attack surface. Is the issue that the local server requires the user to enter username and password again, instead of accepting the cookie obtained from https://plex.tv?
I'm not sure I understand the scenario you refer to. A user is in a malicious coffee shop and has their media server with them, then connects to https://plex.tv. The coffee shop owner, i.e. the attacker, has configured their router to drop incoming DNS responses mapping to private IP addresses. The webapp determines it is in such a situation and falls back to http. The coffee shop owner then serves a fake media server phishing website at the target IP address (gleaned from the DNS response)?
There are a few tricky things here that overall probably make it a lateral move security-wise, making some things very marginally better and some things distinctly worse, while being unambiguously a UX downgrade.
We currently don't actually send any auth information over the public internet in plaintext in these cases; the only thing loaded in plaintext is the webpage itself (solely for the purpose of not triggering mixed-content restrictions, which here force us to be less secure!), while all actual private communications with the service are over HTTPS; auth data is stored in localStorage rather than cookies. This means we're protected against passive MITM on WAN, but not active WAN MITM that, say, injects modified code that uploads the contents of localStorage somewhere nefarious. Switching to pointing the user's browser directly at the LAN server they're trying to connect to would solve that particular attack vector.
However, it would also mean that the user would have to authenticate to the server. Previously this meant actually entering their Plex username and password, which we've moved away from because it meant encouraging users to enter their passwords on sites other than our own (and obviously wasn't password-manager-friendly). Now, we instead use an SSO mechanism, where the user clicks "sign in" on the server's copy of the web app, and it directs them to a page on plex.tv where they're prompted for whether they want to sign the server in question into their account. However, once again we're met with a UX challenge: when you're connecting to a server like this on LAN, this prompt can't be too alarming, nor too difficult to click through, since it happens during normal usage! So we're ultimately forced to make it easy for users to effectively give access to their account to any device with the same IP address they're connecting from, which puts us in yet another position of having to (whether we want to or not) train users to take an action that's potentially dangerous, because we can't distinguish the safe case from the attack case.
There's also a significant UX problem with this model: we'd have no way of knowing whether the server is actually accessible over plaintext HTTP before navigating the user's browser to it. This means that if the network situation turns out not to be exactly what we expected, we'd end up sending users to browser "This site can't be reached" messages, and have no way of knowing when that's occurred.
The coffee shop owner, i.e. the attacker, has configured their router to drop incoming DNS responses mapping to private IP addresses.
This can also be anyone else in the coffee shop, or even a device left there connecting to the network; you don't need to own the network to perform a basic ARP poisoning attack and take an active-MITM position on any other user's traffic.
The webapp determines it is in such a situation and falls back to http. The coffee shop owner then serves a fake media server phishing website at the target IP address (gleaned from the DNS response)?
The attacker in an active-MITM position can simply handle any SYN to any address on port 32400 (the TCP port the media server runs on) by connecting it to a malicious HTTP server. We have some mitigations that attempt to prevent the client from prompting to fallback on plaintext HTTP when it's not actually on the same network as the user's server (based on public IP address), but they can't be completely robust against these cases; apparent IP addresses are not an authenticator.
This doesn't even get into how easy MITM can be on home LANs (where fallback is intended under this model!), between open wifi networks, ones with default passwords, and easily-infected IoT devices. Honestly, I would consider a system that can be compromised by an attacker on the local network to be overall more concerning than one that can be compromised by an attacker at the ISP (though nearly all cases of the latter we're describing here are also cases of the former).
So yes, navigating the browser to the local server by IP address is a solution, but I maintain that it's a dangerous one that provides little if any overall security benefit over the status quo, worsens the user experience, and continues to be vulnerable to a variety of the kinds of attacks that browser security restrictions attempt to address in the first place, so a proper solution for these use-cases is still needed if our goal is to protect users from attackers.
I'd like to reiterate that the primary objection (at least, the primary one that still applies even in the most restricted designs) that I've seen to the solutions I've proposed to these problems has been around how they would allow a user or a website to bypass controls put in place deliberately by a network administrator… but all of the workarounds people have suggested (WebRTC, WebTransport, navigation by IP address) allow those exact same controls to be bypassed. Is there any reason why plain HTTPS, with its well-developed tooling and support in browser APIs and media stacks, can't have the same local-network capabilities that all of these other methods enjoy? If so, I can't think of it, and haven't seen it articulated by anyone else.
Is there any reason why plain HTTPS, with its well-developed tooling and support in browser APIs and media stacks, can't have the same local-network capabilities that all of these other methods enjoy? If so, I can't think of it, and haven't seen it articulated by anyone else.
The difference here, vs say WebTransport or WebRTC, is that those methods have explicit opt-in models for the target, which this proposal is actually aligning with, and unlike your plaintext example, the connections are encrypted and authenticated in a way that can prevent both the passive and the active MITM scenario (to some extent; there are other issues with those methods).
The difference here, vs say WebTransport or WebRTC, is that those methods have explicit opt-in models for the target
I suggested providing a TLS extension earlier; you could even piggyback it on ALPN or the like. Or it could even be an HTTP header much like the ones this proposal adds.
unlike your plaintext example, the connections are encrypted and authenticated in a way that can prevent both the passive and the active MITM scenario
To be clear, I don't want to use plaintext under any circumstances; I want to provide robust encryption and authentication to all users in all cases. I'm just currently left with no other practicable choice under these conditions.
> I suggested providing a TLS extension earlier; you could even piggyback it on ALPN or the like
OK, so just to make sure here: you’re talking about the server fingerprint annotation approach, combined with the above, right?
If that’s the case, then WebRTC and WebTransport both have their own issues there, which are being dealt with separately. The annotation approach rapidly defeats the authenticity/confidentiality guarantees (e.g. by allowing shared keys among all users), which can then defeat mixed content protections or other important protections. There’s ongoing work with those to try to address some of the security issues caused by allowing the application to decide how to authenticate the origin / bypass the origin authentication provided by the browser.
Hmmmm, so to go over what that would actually look like real quick:
`https://[IP]:32400/[path]`
This should be doable. My main concern is around renewal; I see 3 possible solutions here:
Given the choice, I'd pick the first option, as it'd be by far the easiest for me to implement. None of these options is quite as simple as telling the browser "I expect the target to have a valid cert for [domain] under the browser's trust system, and the server can affirm that it's okay with this DNS bypass by opting-in with a TLS extension", but they would all be massively better than anything available right now, and I'd be thrilled to implement any of them. (Plus, this would probably be easier to roll out for anyone who doesn't already have centralized issuance of browser-trusted TLS certs to all their devices.)
My only other concern is that this only provides support within Fetch, and not in the native media APIs (`<video>`, `<audio>`).
I've just realized that the rotation situation can be simplified a bit for implementers by actually using SNI as-is rather than a new extension that would require new infrastructure. For instance, `88cf03426602b5010fb0ad6963ef984ea382b906603cf0c6fac242c72bcfde20.sha256.print*` (using the illegal `print*` TLD, as `*` is guaranteed to never appear in actual DNS usage, but can be conveyed by SNI just fine); alternately, a reserved TLD could be used. There is no need for the server to actually present a certificate with a CN or SAN that covers that domain (and it'd be impossible to produce one), only one with the specified fingerprint.
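A minimal sketch of that encoding scheme (the `print*` pseudo-TLD is the proposal above, not any standard; note that a later reply observes `*` is arguably invalid in SNI as well):

```javascript
// Encode/decode a SHA-256 certificate fingerprint as an SNI hostname under
// the proposed (non-standard) "print*" pseudo-TLD.
function fingerprintToSni(hexDigest) {
  return `${hexDigest.toLowerCase()}.sha256.print*`;
}

function sniToFingerprint(sni) {
  const m = /^([0-9a-f]{64})\.sha256\.print\*$/.exec(sni);
  return m ? m[1] : null; // null => not a fingerprint-encoded name
}
```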
Does anyone have any thoughts on the fingerprint-annotation Fetch extension outlined above? I'd really like to get this case addressed.
It’s one of the options we’d previously explored, but unfortunately ruled out. There are technical issues with your specific proposal (e.g. `*` is arguably invalid in SNI as well), but the general idea of the fingerprint approach runs into the same issues I alluded to.
I don’t have much to add beyond that, unfortunately. It’s definitely an area of active interest and exploration.
> The annotation approach rapidly defeats the authenticity/confidentiality guarantees (e.g. by allowing shared keys among all users), which can then defeat mixed content protections or other important protections.
If the alternative is forcing people to fall back on loading the entire application, JS included, over plaintext HTTP, is that really any better? At some point, trying to enforce rules against secure apps doing insecure things does more harm than good, when the only way to opt out of the restrictions is to completely abandon any semblance of security whatsoever. We're never going to web-policy our way out of people sometimes writing vulnerable apps, and right now all this is doing is preventing people who do want to write secure apps from having the tools required to do so.
@rcombs a quick clarification -- I'm having trouble understanding why a Fetch-with-opt-in-cert-fingerprint is workable for Plex but WebRTC/WebTransport certificate fingerprints aren't. I understand that the latter introduces substantially more engineering work for you but in a world of infinite time/resources would the two options be equivalent to you? (asking for the sake of understanding your requirements)
There'd likely be a few issues with WebRTC/WebTransport around embedded browsers not supporting them, possibly around local discovery for WebRTC, and neither has an obvious way to apply to the native media player infrastructure (which means all playback must be over MSE, which adds substantial latency and other limitations when playing on-disk content), but otherwise, yes, they're technically equivalent. So, theoretically those options might be workable (with limitations) in a world where I can successfully argue for infinite dev time on the project (which, to be clear, very few devs will do as long as "just load over HTTP" remains on the table)… I just haven't seen any convincing policy reason for those to be available, but a plug-and-play HTTPS API not to be.
Also worth pointing out, in re: the "could be used to get around mixed-content restrictions": half the contexts where this comes up are ones where those restrictions are disabled altogether. The current solution to this problem is to use plaintext HTTP connections, regardless of origin, because on embedded systems there is no web policy stopping anyone from doing that. You cannot prevent people in these situations from doing insecure things via web policy; you can only give them the tools required to build secure apps, and hope that they do so. If there were secure ways of handling these cases, maybe embedded browser engines would restrict mixed content, but that's not going to happen as long as these issues still exist.
I've recently learned more about Mixed Content, and discovered that mixed content is not blocked outright (at least in the spec). Carveouts are laid out for `<video>`, `<img>` and `<audio>`. I wonder if your use case would fall within the `<video>` and `<audio>` carveouts? In which case, you would not run afoul of mixed content?
I feel like I'm missing something here.
@letitz Those carveouts aren’t considered permanent or goals, AIUI. @estark37 and @mikewest would know more, but my understanding is that these were always seen as incremental steps, particularly `<video>`.
The challenge here, AIUI, is that the HTTPS app may be served from a public IP or a private IP; in the event of a public IP, HTTP requests to the local network will be blocked by this spec.
The only reason Plex sometimes serves their central webapp over HTTP is in order to make requests to `http://<private IP>` without triggering mixed content restrictions, when silly routers block DNS responses mapping `<something>.plex.direct` to said private IP. This workaround is now what's being restricted by Private Network Access.
I guess that kind of answers my question: mixed content must already be a problem, otherwise Plex could have just downgraded the target of the fetches to HTTP without downgrading the initiator to HTTP too.
Still, I'd love to confirm this has been considered. If the webapp could be served over HTTPS without breaking due to mixed content, Private Network Access would not bother it further.
@letitz correct, if that was the case then at least this wouldn't make my case any worse. Still wouldn't be a great position to be in, but wouldn't be outright regressing.
Mixed `<video>`, `<img>`, and `<audio>` are blocked if they can't be autoupgraded to HTTPS (in Chrome, at least).
Thanks @estark37 for confirming. @rcombs I'm currently reaching out to other affected websites and will know more about their requirements soon. I'll circle back here once those are clearer.
@rcombs found me via Twitter and asked for help finding a way forward on this issue. Let me summarize what I see here:

- WebTransport and WebRTC are ok on this front because they require the IP-named service to opt in by implementing a complex protocol. A `fetch()`-based system could also require opt-in, but it would be more likely to be an easy option in the server, rather than a costly protocol implementation.
- … `fetch()`. Is it just that we want websites and their servers to pay more development effort in order to decide what's trustworthy? If someone publishes a re-usable library to do it cheaply, would we then be ok with shipping it in the browser?
- … if `fetch()` had a mechanism similar to WebTransport's `serverCertificateFingerprints`, and if `<video>` and `<audio>` had attributes to set the fingerprints too.

Did I get all that right? It might be worth the Chrome-side experts (@mikewest, @sleevi, @letitz, and maybe @estark37) having a VC with @rcombs to be able to go through the issues faster than a long github thread.
Yup, I think that about covers it, thanks much for the summary.
> WebTransport and WebRTC are ok on this front because they require the IP-named service to opt in by implementing a complex protocol.
Unfortunately, not quite 😞 The complexity is almost entirely orthogonal. I think the question we’ve been grappling with is whether or not they are OK in the first place.
WebRTC was developed and emerged during a time when mixed content, or even just HTTP, was the norm. If you were a browser security engineer reviewing that spec, you wouldn’t be wrong for thinking WebRTC was a net-positive, because it was encrypted by default. The work on restricting powerful features was still an area of active discussion. So the feature went in, and the full impact wasn’t really realized.
Fast forward to today, and we’ve made huge progress across the ecosystem; what was once a net positive is now a net negative. WebTransport comes along, wanting to do great things and address some of the gaps in WebRTC for these use cases (in part due to WebRTC’s complexity). So it’s trying to fill an ecosystem hole similar to WebRTC, but is equally problematic. Yet to further the goal of improving the use cases, we accepted continuing that hole for now, while WebTransport goes through incubation and validation of its approach.
I don’t know that we’ve fully accepted these as just the status quo, versus an area to invest resources in improving, for better user security. Or I could just be the only one who hasn’t fully accepted it, and we just haven’t documented that yet 😅
That’s what makes this decision tricky. Given that this is filling a completely different need, how we decide here effectively answers the above question. The use cases here are different enough that, combined with WebRTC/WebTransport, if we shipped it now, it would take a significantly greater investment to fix after the fact, and be significantly more challenging. That would make it the de facto status quo.
Alternatively, use cases like this are distinct enough that they can show that even if we accepted the status quo of WebTransport, there are still problems it doesn’t address. Having examples of those use cases, and what efforts they might block or get broken by, can then justify the investment and effort to try and solve these problems, whether through a `fetch()` change or some other changes (like, say, MSE).
With WebTransport, we’re trying to do better than WebRTC, knowing what we know now and where the Web is at now regarding mixed content. That’s why there are things like the lifetime tweak in WebTransport, as we try to explore further restrictions. We would likely want any `fetch()` API to provide those same restrictions, and then work to retrofit them into WebRTC so that they’re harmonized. If that approach doesn’t work, it’s good to know now, so that we can make sure WebTransport doesn’t unnecessarily diverge, at least given how fluid things are with that spec at the moment.
That’s some of the context for really wanting to make sure to understand the use cases more, as well as how different changes or efforts may affect a use case, or which may (inadvertently) break it while leaving no possible alternative.
Thanks for facilitating this discussion, @jyasskin. ISTM, as @sleevi is saying, one of the core issues we are stuck on is that we're not sure if we want to allow WebTransport to bypass the Web PKI (and if we don't, that implies we probably want to deprecate or rethink WebRTC's mechanism too). If we are okay with WebTransport's fingerprint mechanism, then I am not so opposed (at least philosophically) to extending it to `fetch()` or whatever else. It's not clear to me what the next steps to decide on WebTransport are -- maybe we are kicking that can down the road until the WebTransport origin trial concludes.
While we (myself, @sleevi, @mikewest) all seem to think that bypassing the Web PKI is bad, it's also kind of an inherent design goal. Short-lived certificates help because it would ease the process of layering Web-PKI-like policies on top of the fingerprint mechanism, if we had a hook to do so in the relevant specs. For example, we could theoretically say that we're going to use CRLSets to "revoke" WebTransport fingerprints that are found to correspond to shared keys. As we introduce new such policies, it is easier to enforce them if developers are already in the habit of renewing their certificates regularly and thus can adapt to new policies. OTOH, the inherently private nature of what we're trying to do somewhat limits the efficacy of such policies because we don't, e.g., have Certificate Transparency to discover violations.
Another route we could take is to use this as impetus to pursue a "ceremony" approach that we've been tossing around for a long time but never really acted on. In this approach, we would bootstrap the connection from user interaction rather than from the publicly accessible Plex webapp. For example, the media server might display a password that the user enters into a browser dialog to establish a secure connection authenticated via a PAKE. This user interaction approach wouldn't work so well in a subresource context, I think (i.e., plex.tv loading a subresource from the media server) because it still has all the mixed content problems associated with busting out of the Web PKI. However, I think it would work reasonably well in a setup where the user navigates to the media server at the top level, since the ceremony UX (e.g., browser password prompt) could potentially communicate that the user is moving into a different security context than the normal web. I'm curious what you think of this direction, @rcombs. Of course supporting such a protocol would entail some complexity on the Plex side, and I'm not sure how that compares to everything you'd need to do to work with WebTransport.
My understanding is that the key problem with `fetch()` is the way it's integrated with the rest of the Web infrastructure (i.e. if you fetch a resource with a pre-set key, you don't want it to go into the same cache as Web PKI resources). I am not sure there is a way to remove it from WebRTC, since, on some fundamental level, the point of WebRTC is for two laptops to be able to do video chat with each other, and I don't think there is a way to get either of them a Web PKI certificate (even if there were, I am not sure we would want every user device in the world to be in the public CT log).
The PAKE idea is interesting, and I think we already do something similar in the context of the Open Screen Protocol. @mfoltzgoogle knows more about that.
Top-level navigation to something hosted in the local server still isn't really a good solution here, since we'd be navigating "blind", with no way of knowing whether the page would actually load until we'd already lost our ability to do anything; besides that, it still leaves the user open to phishing (by encouraging them to authenticate at non-plex-owned origin addresses). It'd probably be a security improvement for a few specific cases where users already load the app from a LAN server, but we're trying to move away from that anyway. It definitely wouldn't fly on embedded systems (no sensible UX path), so we'd end up having to keep disabling mixed-content blocking there.
If there was a way to trigger a browser-provided PAKE UI from a JS UI, I'd imagine that'd be workable in desktop browsers, though the flow would be substantially more complex for users (requiring a pairing action for every new browser with every server for no clear purpose), and I'd be surprised if it ended up implemented on embedded systems at all. It definitely seems worthwhile for a lot of other use-cases, but I fail to see how it helps anyone in a situation where a centralized auth service is available at sign-in-time.
There's also the difficulty of "displaying" the PAKE passcode at all: in many cases, the only mechanism Plex Media Server has to "display" anything is to expose it on an HTTP API accessible via plex.tv, which of course is of very limited use to us here, particularly in the no-WAN case. It might be available from a non-Web client the user has on-hand? Or we might be able to write it to a log file viewable by out-of-band access? None of these are as user-friendly as the simple "display the code on an LCD" you'd expect on, say, a modern printer, or a software application that only runs on desktop computers with displays connected.
This keeps coming back to whether or not WebTransport should have a key-pinning mechanism, and I haven't seen a reasonable argument for why WebRTC should but WebTransport shouldn't… and if WebRTC shouldn't, then it's very difficult to imagine how WebRTC can exist as a useful technology at all (and I don't think anyone here is seriously arguing to get rid of it). If we want to come up with some way to tighten the restrictions on WebRTC (whatever that might look like) and then extend something similar to WebTransport/Fetch, I'd be happy to discuss that and comply with whatever's landed on, but I don't think it's too unreasonable to ask that the existing mechanisms not be broken before we have a new solution.
Hi all, the PAKE approach you are describing is exactly what I'm currently pursuing with the Local Devices API. It takes the protocol work of the Open Screen Protocol and tries to make it available as building blocks in the JavaScript realm.
The spec is designed to operate without cloud services by default. This indeed means a user has to copy a passcode as part of the "ceremony". However, we can look into making this more user friendly by allowing you to hook into the system wherever the security model allows. I'd like to explore two ways that may work for the purposes of this discussion:
General references:
> I am not sure there is a way to remove it from WebRTC, since, on some fundamental level, the point of WebRTC is for two laptops to be able to do video chat with each other, and I don't think there is a way to get either a Web PKI certificate (even if there were, I am not sure we would want every user device in the world to be in the public CT log).
There have been several promising explorations here, in the context of the browser having a more active role in session establishment (e.g. as folks explore conveying stable identity information for video conferencing). These are variations of the theme @estark37 was referring to, in which the browser is an integral part in ensuring the effective security policies are met.
But to that latter point, "every user device in the world to be in the public CT log" - that's not inherently a problem, and already devices like Plex, Western Digital, or even Google Cloud Shell ephemeral sessions (8 hour certs) are all part of public CT logs. So I wouldn't use CT as a justification in opposition to this :)
> Top-level navigation to something hosted in the local server still isn't really a good solution here, since we'd be navigating "blind", with no way of knowing whether the page would actually load until we'd already lost our ability to do anything;
Hmm, yes, though you could probably partially work around this by opening it in a new tab, `postMessage`ing to the opened page, and closing it if you don't get a response after some amount of time. Not a great UX, though, because the user would see an error page.
> besides that, it still leaves the user open to phishing (by encouraging them to authenticate at non-plex-owned origin addresses).
Could you pass an auth token or do an oauth flow between the public web app and the media server?
> It definitely wouldn't fly on embedded systems (no sensible UX path), so we'd end up having to keep disabling mixed-content blocking there.
Can you clarify what you mean by embedded systems here? Do you mean e.g. an Android WebView in an Android app? I'm not sure that embedded browsers like that are realistically going to implement any of the things that we're talking about here.
> I fail to see how it helps anyone in a situation where a centralized auth service is available at sign-in-time.
IMO the value in this setting comes not so much from the PAKE/ceremony itself but rather from removing the mixed-content aspects. And to some extent from introducing a browser UX mediating the navigation that better explains the security properties, though that's a tricky proposition.
> Could you pass an auth token or do an oauth flow between the public web app and the media server?
We already do something along these lines, but an attacker could just as easily mimic a server instance and send the user through the SSO path, and we'd have very little way to tell that anything's wrong (we do some alerting when the IP address differs or in other suspicious-looking cases, but that isn't foolproof).
> Do you mean e.g. an Android WebView in an Android app? I'm not sure that embedded browsers like that are realistically going to implement any of the things that we're talking about here.
I'm more referring to platforms where either all apps are essentially web pages (possibly with thin wrappers around them), or where that's the only way to deploy a cross-platform app. Smart TVs, game consoles, set-top boxes, Chromecast, that sort of thing. I wouldn't expect 100% deployment (nor would I expect it to roll out quickly), but if cert-fingerprint-annotation landed in the Fetch spec I wouldn't be surprised to see it make its way into some of those contexts. Even if the spec nominally called for a user prompt, there'd be a decent chance we'd be able to disable it there (much like how we currently disable mixed-content constraints in most cases). In theory, those platforms could provide us with platform-specific APIs for this, but having a standard would make it much easier to argue for. Amusingly, an Android WebView wouldn't be as much of an issue, since there we'd be able to bypass the browser networking stack altogether and call into a Java-provided fetch function (though we don't use web views for LAN interfacing on mobile platforms currently, and don't plan to).
> an attacker could just as easily mimic a server instance and send the user through the SSO path
I don't think I understand this attack. Wouldn't the user have to authenticate the attacker's server via the PAKE?
I'm talking about a case where the attacker has directed the user to the SSO page for their server without going through a PAKE at all. The SSO service has no way of knowing that hasn't occurred.
> Hmm, yes, though you could probably partially work around this by opening it in a new tab and `postMessage`ing to the opened page and closing it if you don't get a response after some amount of time. Not a great UX, though, because the user would see an error page.
Naive question, does it have to be a new tab?
Although not yet available, I'm wondering if this could be a use-case for Portals, which allow another top-level document to live inside the same tab. The Portal would initially be off-screen / hidden until it responds to postMessage. Note: potential caveats with regards to privacy-motivated restrictions to communication channels (cross-origin).
We on the Chrome team (@estark37, @mikewest, @sleevi and I) have met to discuss this. We stand by the recommendations we made before, and can now offer a clearer reasoning for them.
Our suggested workaround is to navigate from https://plex.tv to `http://$PLEX_SERVER_IP` when the router blocks DNS responses pointing to `$PLEX_SERVER_IP`.
We understand that this makes for a sub-optimal UX in case of failure, but in the common case it should work fine. We also understand that this results in a degradation of the user’s security for the following two reasons. First, it encourages users to initiate Plex SSO flows from untrustworthy origins, desensitizing them to the risk of being phished. Second, it puts https://plex.tv in the slightly awkward position of having to decide whether or not to trust `http://$RANDOM_IP` with auth tokens.
Another possibility is building on top of WebRTC, but that would likely represent a very large engineering effort, and a simpler solution is coming soon to the Web Platform. This brings me to my next point...
Our suggested solution is WebTransport and its certificate pinning mechanism.
We acknowledge that this represents a fair amount of work, but it should be significantly easier than building on top of WebRTC; our hope also is that some of the necessary investment gets implemented as reusable libraries (whether by Plex or someone else). We also believe it especially worthwhile considering that both http://plex.tv and `http://$PLEX_SERVER_IP` are likely to lose access to more and more web platform features as the platform moves toward encouraging HTTPS use in stronger ways over time. Absent Private Network Access, this would likely be a wise investment anyway.
We expect WebTransport over HTTP/3 to ship in the medium term (it has begun an Origin Trial) with mitigations to protect against key sharing and other substandard security practices, including a) a short maximum expiration time for pinned certificates, and b) a browser-specific mechanism for revoking certain keys that have been subject to abuse. In the long term, we plan to try to port these restrictions back to WebRTC to align security models.
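For illustration, using WebTransport's pinning mechanism looks roughly like this. `serverCertificateHashes` is the option name from the WebTransport spec; the URL and digest below are illustrative, and the constructor call is guarded since `WebTransport` only exists in browsers:

```javascript
// Build a serverCertificateHashes entry from a hex-encoded SHA-256 digest.
function hashEntry(hexDigest) {
  const bytes = hexDigest.match(/../g).map((b) => parseInt(b, 16));
  return { algorithm: "sha-256", value: new Uint8Array(bytes) };
}

// Browser-only; guarded so the sketch can load outside a browser.
if (typeof WebTransport !== "undefined") {
  const wt = new WebTransport("https://192.168.1.10:32400/wt", {
    serverCertificateHashes: [
      hashEntry(
        "88cf03426602b5010fb0ad6963ef984ea382b906603cf0c6fac242c72bcfde20"
      ),
    ],
  });
}
```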
That leaves the question posed by @rcombs a few times in this discussion: why not add the certificate pinning capability to the `fetch()` API, if it is fine for WebTransport and WebRTC?
We believe that the distinction lies in the fact that WebTransport and WebRTC operate outside the origin model of web security, whereas `fetch()` is firmly inside it.
Browser features built on top of secure contexts and the same-origin policy assume that information sharing between two documents with the same `https://foo.example` origin is acceptable, because `foo.example`’s owner can share the information through the backend anyway. The browser uses DNS and the Web PKI to authenticate the origin’s owner, and then relies exclusively on the origin for security checks on cookies, the disk cache, socket pools, service workers, and more.
Allowing a different method of authenticating, either to override DNS resolutions or to use alternatives to the Web PKI, effectively requires creating suborigins for certificate identities. This is because state should not be shared between two (sub)origins with different owners, and in the absence of a global DNS or PKI the only proxy for ownership is certificate identity.
Creating per-certificate suborigins is not fundamentally impossible; indeed, we previously had a project called - you guessed it - suborigins that did something similar but never got off the ground. Nevertheless, it is quite complex, and we would much prefer to support Plex’s current use cases with WebTransport, which is simpler to reason about and doesn't require a complex, risky implementation effort in the browser.
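To make the suborigin point above concrete, here is a toy illustration (my sketch, not any browser's actual data structure): every piece of origin-keyed state would need the certificate identity folded into its key, so that two servers reachable at the same address but holding different certificates never share state.

```javascript
// Toy storage key: ordinary origins have certFingerprint === null; a
// fingerprint-authenticated connection gets its own "suborigin" key.
function storageKey(scheme, host, port, certFingerprint = null) {
  return JSON.stringify([scheme, host, port, certFingerprint]);
}

const a = storageKey("https", "192.168.1.10", 32400, "88cf...");
const b = storageKey("https", "192.168.1.10", 32400, "aa11...");
// Same scheme/host/port but different owners: they must not share cookies,
// caches, socket pools, or service workers.
```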
I've just been informed about this spec proposal, and I noticed that you're citing Plex as a mechanism for doing TLS on LAN servers that public sites communicate with.
I'm an engineer at Plex working on the media server, and it seems like you're missing a key caveat that our current implementation runs into, which in my opinion needs to be addressed before this feature can land in browsers in its current state, or else our case will break for a number of users.
I've explained the problem in this Chromium issue; here's a quick summary:
CORS-RFC1918 isn't the first attempted solution to DNS rebinding attacks. A number of consumer and commercial routers have a feature (often on by default, sometimes non-user-configurable) that blocks DNS responses that give A records pointing to LAN addresses. This completely breaks our HTTPS setup, and forces us to downgrade to plaintext connections to raw IP addresses, which (thanks to mixed-content blocking) also forces us to load the web app itself over plaintext HTTP.
This situation is extremely regrettable as-is, but with the current CORS-RFC1918 proposal this case would break completely, meaning users in that situation wouldn't be able to access their LAN servers via the public web app at all. Specifically, this change would prevent any plaintext-HTTP loads of LAN resources from public origins.
This wouldn't be an issue for us if we didn't need to make plaintext requests in the first place, and I'd much prefer to get to a world where we no longer have to rather than coming up with some exception to allow them. As detailed in the linked Chromium issue, we handle this in most non-browser clients (all those on platforms where we have full control over the TLS and network stacks) by skipping DNS resolution entirely. That Chromium issue asks for a way to do the same in-browser. This or some other mechanism to allow TLS connections to LAN hosts with known IP addresses without public DNS is necessary for our product to continue working on affected networks if plaintext LAN requests from public hosts are blocked.
I've wondered whether this might be solvable using multicast DNS as well, though I'm not sure if that really provides any advantage over just letting JS short-circuit name resolution.