WICG / webpackage

Web packaging format

Should a SXG document be considered SecureContext or not? #388

Open · youennf opened 5 years ago

youennf commented 5 years ago

Say that liveness checks, as described in https://github.com/WICG/webpackage/issues/376, are implemented and pass for a given SXG. It seems that the resulting document could be granted SecureContext.

Now let's say that the liveness checks do not pass. The level of security does not seem as high, which would mean that SecureContext should not be granted. Such variation may actually break content, so it might be better not to render the SXG content to the user at all, and instead render the content fetched from the actual web site.

A consequence is that while the liveness checks can be done in parallel with processing of the SXG (subresource loading, parsing...), they should complete successfully before the first page rendering and before any JavaScript execution.

For privacy/security purposes, even subresource loading should probably be postponed until these checks are done.
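Sketching the proposed ordering (purely illustrative; checkPublisherLiveness, parseAndPrepareDocument, navigateTo and renderAndRunScripts are hypothetical placeholders for UA internals, not anything defined by the spec): start the liveness check and SXG processing in parallel, but gate first paint and script execution on the check's outcome.

```js
// Hypothetical sketch of the ordering described above; none of these helper
// functions exist in the spec or in any UA API.
async function loadSignedExchange(sxg) {
  const livenessPromise = checkPublisherLiveness(sxg.publisherOrigin); // starts immediately
  const documentPromise = parseAndPrepareDocument(sxg);                // proceeds in parallel
  const [alive, doc] = await Promise.all([livenessPromise, documentPromise]);
  if (!alive) {
    // Don't show the packaged copy; fall back to the publisher's live site.
    return navigateTo(sxg.publisherUrl);
  }
  // Only now do first paint and JavaScript execution happen.
  return renderAndRunScripts(doc);
}
```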

sleevi commented 5 years ago

Could you explain why you don't believe that the level of security is as high? What are the properties that you think aren't met?

youennf commented 5 years ago

As discussed in https://github.com/WICG/webpackage/issues/376, if there is a buggy SXG, chances are high that attackers will use it as much as possible. The chance of getting buggy SXG content is therefore higher if you get the SXG from somebody other than the publisher. The liveness checks hopefully put the level of security on par.

sleevi commented 5 years ago

Thanks! I was expecting you to say something about the key protection (which is similar to TLS session resumption, delegated credentials, secondary certs, and a host of other network-level things which may be invisible to the client), so that's a very different direction!

Would this same logic apply to resources a client caches on disk, unrelated to SXG? That is, I'm trying to unpack what property the HTTP disk cache provides (since we seem to be comfortable with SecureContext there) that SXGs do not.

Similarly, I'm trying to understand a bit more about where "SecureContext" stops being reflective of the integrity of the transport and starts being about the 'security' of the content. For example, whether we'd deny SecureContext for certain CSP policies. I had always imagined SecureContext to be about transport-level security properties and integrity.

youennf commented 5 years ago

Would this same logic apply to resources a client caches on disk, unrelated to SXG? That is, I'm trying to unpack what property the HTTP disk cache provides (since we seem to be comfortable with SecureContext there) that SXGs do not.

I think there is a difference between the two. A buggy HTTP disk cache entry is under the control of the client and the content publisher, whereas a buggy SXG entry is not under the control of the content publisher.

sleevi commented 5 years ago

A cache entry isn’t necessarily under the control of the publisher, is it? Especially once the buggy entry is cached, only a Clear-Site-Data activity would flush it, correct? Or is there some other element of control?

I’m trying to understand this concern more by trying to map it to the world we have, because I’m not sure it’s clear which property is both missing and critical enough to SecureContext that we should deny it for SXG, when we have much stronger signals of origin authenticity than TLS itself.

For example, in the world we have, we allow a site served over TLS to be treated as SecureContext, even though it may have been served by a stale CDN that doesn’t support Flushing. A site that “could” flush their CDN, but doesn’t, doesn’t seem fundamentally different from an SXG that “could” have a JS flush check, but doesn’t. Should we deny SecureContext to known CDN ASes, since we don’t know whether the content is fresh?

Understandably, there’s a tradeoff for the priority of the constituencies, but it’s not clear why the freshness would impact SecureContext, or where that threshold is.

Alternatively, it may be that “freshness” isn’t the essential property, but “voluntary distribution agreement,” since only CDNs with a private key associated with that domain can serve that content. If that’s the case, it seems alternative designs may exist to address the “relationship” property.

youennf commented 5 years ago

only CDNs with a private key associated with that domain can serve that content

Right, in such a case, there is a clear understanding that both parties are working together. Probably there are contracts between them, mechanisms for hot fixes...

With SXG, anybody can distribute the content. The spec can probably add a way for a content provider to 'whitelist' some distributors so that liveness checks are optional for these 'trusted' distributors. I am not sure we should go there.

sleevi commented 5 years ago

@youennf Thanks! I'm wanting to make sure I've got the problem well-framed enough to explore solutions. I mention this, because a liveness check is so critically disruptive to the privacy and performance properties of SXG that it seems like it would be a significant step back for a number of use cases.

It sounds like your primary concern is around "A bug could be introduced in the content shipped in an SXG", is that a fair (although grossly oversimplified) summary? If it is, could you help me unpack a bit more the type of "bugs" that would be concerning? Naively, I would get the impression that this is only concerned about scriptable content, but perhaps it's being seen as generalizing to other types; for example, would an SXG of a CSS file be problematic? What about for a PNG file?

The concern - of a bug - sounds very different than the properties that SecureContext is meant to assert or guarantee. The client doesn't know about the relationship between the CDN and the Origin, for example, so at best, it merely seems to be an assumption that bugs "could" be fixed, not necessarily that they're prevented, purged, or otherwise managed. Given that SXGs (for scriptable content) have the ability to hotfix 'bugs', it seems similar to the status quo.

I think it might be useful to compare with OCSP for checking certificate revocation. No UA has ever denied SecureContext or deferred processing content if an OCSP check (which is, effectively, a "liveness check" for the TLS certificate) does not succeed. The closest we got was Opera rendering the content but degrading the UI, and they moved away from that. A liveness check for SXG would seem functionally identical to an OCSP request. Is there some property different between SXGs and TLS certs worth also capturing here?

jyasskin commented 5 years ago
  1. I'm curious about the distinction @youennf's making between liveness checks on signed exchanges (which he wants) vs liveness checks on certificate validity via OCSP (which @sleevi's saying Safari doesn't do). Is the important distinction that we trust people to protect their keys better than they avoid writing XSS bugs, or something else?
  2. The discussion of "SecureContext" would make more sense to me if it talked about whether signed exchanges should be considered same-origin with the same content retrieved over TLS. For example, an XSS exploit is likely to do more damage by leaking localStorage or IndexedDB information than by calling one of the many fewer SecureContext APIs. SecureContext APIs also often involve a permission prompt, and it'd be more plausible to delay those until we've fetched validity information, than to delay rendering or access to localStorage. So, did you mean to only talk about SecureContext, or to address the Origin of the signed content as a whole?
youennf commented 5 years ago

for example, would an SXG of a CSS file be problematic? What about for a PNG file?

I am mostly concerned about navigation loads. Once you have a document, other mechanisms like SRI can be used if need be.

  1. I'm curious about the distinction @youennf's making between liveness checks on signed exchanges (which he wants) vs liveness checks on certificate validity via OCSP (which @sleevi's saying Safari doesn't do)

IIUIC, there were performance issues that made things difficult with OCSP checks. Liveness checks, though, target the content author's web site. This web site has an incentive to deliver the content fast. Chances are high that liveness checks will be fast as well.

2. The discussion of "SecureContext" would make more sense to me if it talked about whether signed exchanges should be considered same-origin

In the case where signed exchanges are fetched same-origin, this is business as usual and there is probably no need for additional checks. When they are fetched from a cross-origin distributor, it seems the goal is that the generated document will have the origin of the content author, not the distributor origin.

2. many fewer SecureContext APIs.

Consequences with APIs like the Payment Request API might be bad. Ditto for getUserMedia: for some websites, the user might not be prompted at all, or microphone output might be sent to someone unauthorized during a call.

2. it'd be more plausible to delay those until we've fetched validity information, than to delay rendering or access to localStorage.

That might be feasible for some APIs but would add quite a bit of complexity. For other APIs, like service worker, that seems pretty difficult.

a liveness check is so critically disruptive to the privacy and performance properties of SXG that it seems like it would be a significant step back for a number of use cases.

That is something that would be good to better understand. Maybe the use cases require different or complementary solutions. I agree performance is important, but there might be tradeoffs with security; HTTPS is slower than HTTP for instance. Safari is implementing cache partitioning, which has a theoretical perf hit. As for privacy, if a page gets loaded, it will be easy for this page to trigger a load to its server so as to identify the user. Of course, there could be workarounds, like disabling any networking for such pages, but that might prevent any opportunity for the web page to implement hot fixes.

sleevi commented 5 years ago

I am mostly concerned about navigation loads. Once you have a document, other mechanisms like SRI can be used if need be.

Sorry, now I'm even more confused in trying to understand the principle or goal you're trying to capture by not treating these as SecureContext.

If it was about the transport properties, such as TLS, as it's used today, it would seem like it would matter equally regardless of content. You mentioned being concerned about bugs in the content, which seem possible there as well.

However, the later parts of your reply leave me wondering whether the goal is to restrict access to certain APIs, using SecureContext as a way of avoiding enumerating them individually / playing whack-a-mole. Going back to the original remarks, you mentioned that it seems that the level of security is not as high, but I'm not sure we've quite nailed down what that property is.

This web site has an incentive to deliver the content fast.

I don't think the data supports this conclusion, as practiced today. But I also have concern that it's a bit at odds with the goals of improving distribution for Web developers and for users in emerging markets.

HTTPS is slower than HTTP for instance

While not the issue for this, it would be useful if data could be shared on that. In many cases, we've seen HTTPS faster for users; whether through enabling new protocols (H/2 or QUIC) or through avoiding network (mis)management.

As for privacy, if a page gets loaded, it will be easy for this page to trigger a load to its server so as to identify the user

I fear that may be overlooking significant use cases. One which we've heard from a number of developers is the idea of effectively prefetching or preloading content in the background, to enable quick and efficient rendering. Using prefetch or preload, as they are today, reveals the user just as much as liveness checks would. If this were aggregated across several domains, the act of that prefetching may reveal information about the content the user is viewing on the Distributor - for example, observing liveness fetches to a.example, b.example, and c.example may indicate you're viewing content related to those three domains.

Specifying a liveness check would introduce that same privacy risk, making it unlikely for Distributors to serve that content (versus, say, self-hosting it, as some may do today). Even restricting SecureContext can have impact - for example, in causing loads to be treated as mixed content and blocked (in the Distributor case) or allowing features that should be blocked (e.g. HTTP loads or WS).

This is why I'm so keen to understand those security properties we're talking about, and how we quantify them, in order to see if there are alternative, less disruptive solutions, which both help developers and help keep the platform consistent.

kinu commented 5 years ago

it'd be more plausible to delay those until we've fetched validity information, than to delay rendering or access to localStorage.

FWIW, I would like to avoid this too unless we find it's really the only desirable path. As far as I know (at least in most cases), security state is determined upon navigation / document creation; introducing a new intermediate state between non-SecureContext and SecureContext and allowing transitions between them seems to open up another complex problem space, possibly too complex. I agree with @sleevi that we should nail down the security properties first. I also agree that the primary concern, i.e. bugs, seems to be something different from the property that SecureContext is meant to guarantee. Let me also /cc @mikewest regarding SecureContext.

youennf commented 5 years ago

HTTPS is slower than HTTP for instance

While not the issue for this, it would be useful if data could be shared on that. In many cases, we've seen HTTPS faster for users; whether through enabling new protocols (H/2 or QUIC) or through avoiding network (mis)management.

Sure. The point is that the initial objective of HTTPS was probably to get things right in terms of security while remaining reasonably efficient; follow-up efforts made it even better. I would tend to do the same here.

you mentioned that it seems that the level of security is not as high, but I'm not sure we've quite nailed down what that property is.

Let's give it a try.

A website has a security issue related to a particular resource. The web site uses things like proxy-revalidate so that it only needs to care about client caches. The web site decides to wipe client caches conditionally on some client-side cookie parameter: the client loads a web page, the server checks the cookie and includes in the response a Clear-Site-Data HTTP response header to wipe the whole client cache, and the web site also updates the client cookie. If needed, the page also pings the server to trigger Clear-Site-Data and a page reload.
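For concreteness, a minimal sketch of that cookie-gated wipe, assuming an Express-style server; the cookie name and the CURRENT_EPOCH constant are made up for illustration.

```js
const express = require('express');
const cookieParser = require('cookie-parser');

// Hypothetical "epoch" value, bumped whenever the buggy resource must be
// purged from client caches.
const CURRENT_EPOCH = '2019-04-02';

const app = express();
app.use(cookieParser());

app.use((req, res, next) => {
  if (req.cookies['cache-epoch'] !== CURRENT_EPOCH) {
    // This client predates the fix: wipe its HTTP cache and mark it as updated.
    res.set('Clear-Site-Data', '"cache"');
    res.cookie('cache-epoch', CURRENT_EPOCH);
  }
  next();
});
```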

In a world without signed content, this might be good enough. In a world with signed content, this might be broken.

After the client cache is empty, an attacker makes the client download signed content containing the faulty resource. The ping to the server will not trigger any cache clearing/reload.

According to https://wicg.github.io/webpackage/loading.html, the signed content is not added to the HTTP cache (as some kind of protection?). Now let's say the above web site uses a service worker and the Cache API. The faulty resource will be stored opportunistically by the service worker (its cache was cleared) and will poison the web site persistently.
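A minimal service worker sketch of that opportunistic caching pattern (the cache name is illustrative, and whether an SXG-delivered response actually flows through the service worker like this depends on the loading spec):

```js
const CACHE_NAME = 'site-static-v1'; // illustrative name

self.addEventListener('fetch', (event) => {
  event.respondWith((async () => {
    const cache = await caches.open(CACHE_NAME);
    const cached = await cache.match(event.request);
    if (cached) return cached;
    // Cache miss (e.g. right after Clear-Site-Data wiped everything):
    // whatever satisfies this request, including a resource replayed from an
    // attacker-supplied SXG, gets stored and served from now on.
    const response = await fetch(event.request);
    if (response.ok && event.request.method === 'GET') {
      await cache.put(event.request, response.clone());
    }
    return response;
  })());
});
```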

I also agree that the primary concern, i.e. bugs, seems to be something different from the property that SecureContext is meant to guarantee

SecureContext is about the trust you can have in a given document. As stated above, the spec does not seem to have enough trust in signed content to update the HTTP cache with it.

We can also look at the HTTPS state of the resource. If a liveness check is done and succeeds, should the HTTPS state of the signed exchange be set to the HTTPS state of the liveness check? I would tend to say yes. If a liveness check is not done, or fails for some networking reason, what should the HTTPS state of the signed exchange be? IIUIC, the spec says to use the HTTPS state of the signed content load. This seems OK if the provider and distributor are using the same connection; I am not sure it is OK if they do not share the same connection.

Let's say a content provider web site has a bad TLS setup. For the sake of argument, the user agent will set the HTTPS state to "deprecated" for any load from the content provider server, and to "modern" for any load from the distributor server. In a world without liveness checks, the user agent will initially load from the signed content and mark the document as secure; when reloading the page, the user agent will go to the content provider and mark the document as insecure. In a world with liveness checks, the user agent offers consistent behavior before and after reload.
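In code form, the policy being argued for here might look like the sketch below; the state names mirror the example above, and the fallback branch reflects my reading of the current spec text rather than any normative wording.

```js
// Illustrative only: livenessCheck and sxgLoad are hypothetical records a UA
// might keep about the two network interactions involved.
function httpsStateForSignedExchange({ livenessCheck, sxgLoad }) {
  if (livenessCheck && livenessCheck.succeeded) {
    // A liveness check was performed: inherit the state of the connection to
    // the content provider, keeping behavior consistent before and after reload.
    return livenessCheck.httpsState; // e.g. 'modern' or 'deprecated'
  }
  // No (successful) liveness check: fall back to the state of the SXG load
  // itself, i.e. the distributor's connection.
  return sxgLoad.httpsState;
}
```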

Specifying a liveness check would introduce that same privacy risk, making it unlikely for Distributors to serve that content (versus, say, self-hosting it, as some may do today).

Note that self-hosting also causes some privacy issues, since the distributor might know all the newspaper articles a user is reading, and not only the first one, as would be the case if the user navigated on to the content provider.

Agreed on the principle that the privacy implications need to be carefully evaluated. Prefetch/preload/liveness checks should probably all be done without credentials. If the liveness check is done at load/navigation time, nothing is lost compared to regular loads in terms of privacy.

Some flexibility can also be left to user agents in how they implement liveness checks. A user agent may preconnect to the provider web site as soon as possible; if the check is done at navigation time, a service worker may come to the rescue. Bundling might also be used to pool liveness checks across exchanges. Another user agent may send the liveness check earlier, based on some user-specific knowledge or interaction heuristics.
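To make the "without credentials" point concrete, here is a rough sketch of the shape such a check could take. A real UA would implement this below the fetch API (and would not be constrained by CORS), so the options here are purely illustrative.

```js
async function livenessCheck(publisherUrl) {
  try {
    await fetch(publisherUrl, {
      method: 'HEAD',
      mode: 'no-cors',      // reachability only; the body is not needed
      credentials: 'omit',  // no cookies or other ambient authority
      cache: 'no-store',    // must actually reach the publisher's server
    });
    // A cross-origin no-cors response is opaque, but resolving at all means
    // the publisher answered.
    return true;
  } catch (e) {
    return false;           // network error: treat the exchange as not "live"
  }
}
```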

sleevi commented 5 years ago

According to https://wicg.github.io/webpackage/loading.html, the signed content is not added to the HTTP cache (as some kind of protection?)

The reasoning for this isn't a security protection; it's about the privacy aspects. Those privacy aspects may be addressed/addressable in the context of double-keyed caches; however, since such work is not normatively specified right now, and there are UAs that don't double-key, we took an approach that maximizes privacy (in many of the design elements).

This same focus on privacy is why liveness checks are deeply concerning; as shown with OCSP, liveness checks fundamentally harm efforts to protect user privacy. The design goal has been to try to ensure that SXG not only does not introduce any privacy issues from the status quo, but to also take opportunities to improve it, where they exist.

@jyasskin Would it make sense to capture this in the draft privacy considerations or as an explainer, perhaps? Namely, to capture some of the explicit design goals for (privacy and security) that contributed to the current design? It doesn't quite feel right in the spec, but it seems like it'd be useful context for folks reading to understand "Why X, not Y?"

Note that self-hosting also causes some privacy issues, since the distributor might know all the newspaper articles a user is reading, and not only the first one, as would be the case if the user navigated on to the content provider.

Can you please explain how this would be? This only seems like it would be possible if the Publisher actively collaborated with the Distributor, by providing Distributor-specific SXGs in which all outbound links (e.g. to other articles of the Publisher) instead explicitly specify Distributor SXGs. In such an 'active collaboration' model, it's unclear whether this is a change from the status quo - the Publisher could do this via Pings or back channels, right?

Prefetch/preload/liveness checks should probably all be done without credentials.

I suspect we may have differing understandings of the processing model for SXGs as proposed, and the privacy properties we're trying to achieve.

It sounds as if your focus is on UA-provided information to the Distributor, such as credentials or other headers. However, a big concern on our end has been both the Publisher learning about the Distributor, and those on the network learning about activities on the Distributor.

The problem with liveness checks is that they undermine privacy in ways similar to XS-Search, even in a credential-less fetch. For example, consider if distributor.example/page1 were to preload publisher-1.example/, publisher-2.example/, publisher-3.example/, while distributor.example/page2 were to preload publisher-a.example/, publisher-b.example/, and publisher-c.example.

With a liveness check, credentialed or not, a network observer would be able to determine whether or not a user is on distributor.example/page1 vs distributor.example/page2 by observing whether or not requests were made to publisher-1.example vs publisher-a.example.

Similarly, on the publisher side, a publisher that received a liveness check to publisher-1.example would be able to know - at the time of preload/prefetch - that the user was looking at distributor.example/page1 if they knew about that association.
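For concreteness, the kind of distributor-side preloading described above might look like this (domains reused from the example; purely illustrative). The point is that a mandated liveness check would make each entry in this list observable by the network and by the publishers.

```js
// Illustrative only: the set of publishers differs per distributor page,
// which is exactly what an observer could use to distinguish page1 from page2.
const publishersForThisPage = [
  'https://publisher-1.example/',
  'https://publisher-2.example/',
  'https://publisher-3.example/', // page2 would list publisher-a/b/c instead
];

for (const href of publishersForThisPage) {
  const link = document.createElement('link');
  link.rel = 'prefetch';
  link.href = href;
  document.head.appendChild(link);
}
```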

As we've seen across the Web ecosystem, privacy-conscious distributors are concerned about these sorts of side-channels - to both network observers and to less privacy-conscious publishers - and so take steps, such as rehosting content same-origin to prevent these sorts of side channels. SXGs are a means of achieving those privacy-preserving properties, while allowing meaningful and accurate attribution to users.

Hopefully this captures more clearly why liveness checks are fundamentally hostile to user privacy, and why they've been an important design consideration throughout. One would expect that the first steps a privacy-conscious browser or extension would take would be to disable them (or disable SXGs), both of which would then result in even worse consequences for the ecosystem, especially around privacy and authenticity of content.

I'm hoping we can find alternative solutions that achieve that same goal, which is a critical use case here.

SecureContext is about the trust you can have in a given document.

We've tried very carefully to avoid overloaded terms like 'trust', which can mean varying things for recipients. In the context of the objectives that the SecureContexts spec sets out, our view has been that it affords an appropriate level of confidentiality, integrity, and authenticity. Much of the threat model and considerations address this in the context of whether or not an origin has been authenticated.

As it relates to SXGs, I hope we've got agreement that the integrity and authenticity properties have been sufficiently maintained and are equivalent to those of TLS. The trade-offs with respect to confidentiality/privacy are noted in the SXG spec's privacy considerations.

I think we want to be careful about 'trust', because a definition that also includes "trust that there were no bugs", or "trust that the content is honest or accurate", or more broadly, "users can trust this", requires a lot of unpacking and carries differing expectations. Much like TLS doesn't and shouldn't guarantee that the content is 'trustworthy' - merely that it was delivered over a connection with C/I/A properties - we have tried to avoid introducing those more subjective and problematic elements of trust.

youennf commented 5 years ago

Below are some more thoughts related to discussions we had with Jeffrey, Kouhei and Yoav during IETF 104.

With a liveness check, credentialed or not, a network observer would be able to determine whether or not a user is on distributor.example/page1 vs distributor.example/page2 by observing whether or not requests were made to publisher-1.example vs publisher-a.example.

True. That said, web packaging on its own is not sufficient to prevent this from happening in a browser: reloading the page, clicking a link, or the page loading some resources (high-resolution images/videos) would reveal the same information...

Currently, I think the benefits of a liveness check outweigh its drawbacks in the context of a browser. Other contexts, additional technologies, or other ways to do these kinds of validations might change things in the future.

sebastiannielsen commented 5 years ago

I think SecureContext should be granted provided a liveness check was done on the domain and passed at most 30 days ago.

The reason is that the minimum RGP ("Redemption Grace Period") according to ICANN, called a "quarantine" period by most registrars, is 30 days. This means that we can be absolutely 100% sure that a specific domain is in the possession of the domain holder (or someone related to him - note that voluntary transfers count as "related") for at least 30 days counted from the last liveness check.

Only past the 30 days is there a possibility that the domain has been acquired by an unrelated third party. (For example, if a liveness check was done the second before the domain missed its renewal payment, after 30 days the domain would be free to be registered by someone else, who might be affected by a signed SXG with regards to cross-origin.)
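In code form, the proposed rule is just a 30-day freshness window on the last successful liveness check (purely illustrative; no spec defines such a function).

```js
const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000; // minimum ICANN RGP, per the argument above

function grantSecureContext(lastPassedLivenessCheckMs, nowMs = Date.now()) {
  return nowMs - lastPassedLivenessCheckMs <= THIRTY_DAYS_MS;
}
```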

sleevi commented 5 years ago

I’m not sure it’s clear why the ICANN policies would relate at all to the liveness check. That seems largely orthogonal? In the event of a domain registration change, the SXG certificate will have been revoked, or can be revoked by the new holder, by virtue of the existing rules for certificates (e.g. the Baseline Requirements, Section 4.9.1.1).

sebastiannielsen commented 5 years ago

The reason the liveness check should relate to the ICANN policies is that, within the grace period, we can, security-wise, know 100% that the content is trusted and comes from the domain owner.

Beyond that, the domain could potentially belong to a new owner, and since automated CAs don't check WHOIS for the domain expiration date, there will be no revocation unless the new domain owner explicitly requests it. He might not even be aware of the existence of the old cert.

A TLS certificate for a domain that has changed owners is of limited use to the old owner, as he would need to somehow capture or redirect traffic to his TLS server, and the 60 days left on the certificate, in the worst case, also help mitigate this issue.

An SXG certificate, however, is worse, as the signed content could be distributed, which creates a security hazard: CDNs probably will not do any further validation of content signed by the original domain owner while the SXG signature is still valid. That's why I propose tying the liveness check to the ICANN policies; since they are a minimum, no registrar is ever allowed to go below a 30-day quarantine time.

sleevi commented 5 years ago

I don’t believe we can state “100%”, unless we’re limiting the threat model to ONLY this specific attack. Note also that the ICANN policies only apply to a subset of TLDs (gTLDs), so they do not provide a margin there either.

However, it does seem as if the concern is a misalignment between the lifetime of the assertion and the lifetime of the domain registration, which seems to be a new concern (compared to those raised earlier in the thread). I am curious whether a reduction in the certificate lifetime itself (e.g. from 90 days to 30) would be sufficient to mitigate that concern, as an alternative to liveness checks.