domenic / get-originals

A web platform API that allows access to the "original" versions of the global built-in objects' properties and methods

getOriginalConstructor, callOriginalMethod, and lots of others mean significant (and at least sometimes insuperable) trouble for Caja #12

Open jswalden opened 6 years ago

jswalden commented 6 years ago

If you want one of the variants of secure JS and are unwilling to rewrite source text to sanitize it of access to a certain set of global names, you need to prevent access to Function. (And maybe eval, tho maybe that's turn-offable with CSP now, depending whether you care about supporting only CSP-supporting browsers, the market share of which I truly don't know now.)

But if getOriginalConstructor is a global non-writable non-configurable property, any hostile "secured" code can just get the original Function constructor (or for that matter, any of myriad DOM constructors). And with callOriginalMethod, such code can do any of the active-operation operations this proposal would provide using that constructor and the fruits of it, and you have effectively escaped the restricted-operation set that a JS sandbox would want to provide.
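A minimal sketch of the escape described above. The `getOriginalConstructor` helper here is a hypothetical stand-in (the thread does not give the proposal's actual signature), and the sandbox is modelled as a plain `env` object of names handed to guest code rather than as a real Caja deployment.

```javascript
// Hypothetical stand-in for the proposed API (signature assumed for this
// sketch). It hands back constructors captured before any sandbox runs.
const captured = new Map([["Function", Function]]);
function getOriginalConstructor(name) {
  return captured.get(name);
}

// The sandbox exposes a censored Function to guest code.
function censoredFunction() {
  throw new TypeError("Function is not available in the sandbox");
}

// Guest code sees only the names the host passes in as `env`.
function runGuest(guest, env) {
  return guest(env);
}

// Without the new API, the guest is stuck with the censored constructor.
let blocked = false;
try {
  runGuest((env) => env.Function("return 1 + 1")(), {
    Function: censoredFunction,
  });
} catch (e) {
  blocked = e instanceof TypeError;
}

// If the API is reachable, the guest recovers the real constructor and can
// compile arbitrary source text again.
const escaped = runGuest(
  (env) => env.getOriginalConstructor("Function")("return 1 + 1")(),
  { Function: censoredFunction, getOriginalConstructor }
);
```

In this model the censoring only holds as long as no name in `env` leads back to the real `Function`; a non-writable, non-configurable global would be exactly such a name.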

I'm not sure what state of the art is for JS sandboxing these days. But from my recollections of having observed some of these things in the very distant past (~2007-2008), and vaguely dealt with modern Caja-style things in passing since then, significant portions of what this proposal wishes to offer are, to put it mildly, Problematic.

jswalden commented 6 years ago

I imagine @erights will have a thing or two to say about this proposal. :-)

ljharb commented 6 years ago

Similarly, this allows polyfills to be bypassed by JS code that runs after it, even if the polyfill is correcting bugs or filling gaps in the implementation.
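As a rough illustration of that concern, the sketch below uses a made-up hardening patch on `JSON.parse` (not taken from any real polyfill) and a saved reference that stands in for whatever "original" the proposed API would return.

```javascript
// Stand-in for "the original": a reference saved before the patch is applied.
const originalParse = JSON.parse;

// Hypothetical patch: drop any "__proto__" key from parsed data.
JSON.parse = function patchedParse(text, reviver) {
  return originalParse.call(JSON, text, function (key, value) {
    if (key === "__proto__") return undefined; // removes the property
    return typeof reviver === "function"
      ? reviver.call(this, key, value)
      : value;
  });
};

const payload = '{"name": "a", "__proto__": {"admin": true}}';
const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Code that goes through the public name gets the patched behaviour.
const polyfillKept = hasOwn(JSON.parse(payload), "__proto__");

// Code that runs later but asks for the original skips the patch entirely.
const originalKept = hasOwn(originalParse(payload), "__proto__");

// Restore the built-in so the sketch has no lasting side effects.
JSON.parse = originalParse;
```

Here `polyfillKept` and `originalKept` differ, which is the bypass being described: the patch only protects callers that look the method up by its public name.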

benjamingr commented 6 years ago

@jswalden secure code can run in a frozen realm - see confinement there.

At least, having read Mark's papers about security-critical APIs and the previous work, that is my understanding. I'm sure @erights can weigh in on it.

That is, the whole approach of patching globals so that escape analysis can prove the code is secure will (hopefully) be superseded by a proper subset of JavaScript that the proof can be applied to directly.


Similarly, this allows polyfills to be bypassed by JS code that runs after it, even if the polyfill is correcting bugs or filling gaps in the implementation.

I think that there is value in giving proper code the ability to "escape" such polyfills. For example, if someone patches Object.defineProperty for a WeakMap shim, that can have unmeasured impact on how we handle things internally in Node.js: we might be relying heavily on optimizations or specific semantics that the patched version doesn't have.
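A small sketch of the platform-side pattern being argued for here, assuming the platform code runs before any user code. The helper name `internalDefineHidden` and the shim are hypothetical; this is not how Node.js internals are actually written.

```javascript
// Captured once, before any user code has a chance to patch the global.
const { defineProperty } = Object;

// Hypothetical internal helper that relies on the built-in semantics.
function internalDefineHidden(obj, key, value) {
  return defineProperty(obj, key, {
    value,
    enumerable: false,
    writable: false,
    configurable: false,
  });
}

// Later, a user-land shim replaces the public method.
const shimCalls = [];
const builtinDefine = Object.defineProperty;
Object.defineProperty = function shimmedDefine(obj, key, desc) {
  shimCalls.push(key); // the shim observes every call made through it
  return builtinDefine(obj, key, desc);
};

// Internal code still reaches the captured original, not the shim.
const target = internalDefineHidden({}, "secret", 42);

// Restore the built-in so the sketch has no lasting side effects.
Object.defineProperty = builtinDefine;
```

Capturing at startup only works for code that is guaranteed to run first; the proposal is aimed at code that cannot make that guarantee.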

domenic commented 6 years ago

Yes, this would cause issues for environments that have different security models than the web, such as ones that encourage running untrusted code inside your realm (or even process, these days). The Chrome team is OK with that. I suppose we should explicitly add that to the readme.

bzbarsky commented 6 years ago

To be clear, some of these environments are deployed on the web, right?

jswalden commented 6 years ago

secure code can run in a frozen realm - see confinement there

@benjamingr Okay -- so should I interpret this proposal as intrinsically tied to the success of that one, which is to say this one depends upon/assumes/requires it and this change won't happen without that one? 'cause if so I have more reading to do -- I just got roped into this from a JS engine implementer point of view (but had different questions to ask when I started thinking), didn't know of anything more to be reading.

domenic commented 6 years ago

No, we have no interest in tying this to any realms proposals. (Remember this is a web API proposal.)

FUDCo commented 6 years ago

Modern web sites incorporate large quantities of third-party code, and having these sites' security depend on each web site operator doing deep security audits of all this code is wildly impractical. Security researchers have already found third party libraries exfiltrating sensitive data to undetermined outside parties unbeknownst to the site operators hosting those libraries.

Thus, @domenic I don't understand your phrase "different security models than the web, such as ones that encourage running untrusted code inside your realm". The frozen realms proposal absolutely is concerned with the web. Nobody is encouraging running untrusted code, but as a practical matter nearly everyone is running untrusted code. The challenge is how to make this less dangerous. A web API that actively obstructs or prevents site operators from implementing additional mitigations and protections seems ill advised.

domenic commented 6 years ago

We understand some people have different threat models. But, we still plan to ship powerful features and APIs to the web. If some sites are not able to audit or control the code they run to see what APIs that code is using, that's unfortunate, but they are already exposing themselves and their users to massive risk, and we won't hold the progress of the web hostage to such sites. As you've noticed, such things already occur, and we continue to work on important mitigations such as CSP, site isolation, and the like.

benjamingr commented 6 years ago

@domenic Chip makes a good point: this allows bypassing the one way we have to secure code now (by overriding things, as discussed here) without shipping another way to secure sites.

Moreover, this potentially breaks sites that already run tools that create a sound sandbox. I totally want to be able to use the idea and code in this proposal - I think we need to resolve the issue Jeff and Chip raised though.

Given that we need to grab the originals (as a platform) in order to provide tamper-proof APIs and it is desirable for us to prove that the APIs are tamper proof (which this proposal makes easy) - what sort of mitigation do you see to the issue raised here @FUDCo?

Would an opt-out/opt-in be acceptable? Can you elaborate more on how people are using Caja and other sound solutions "in the wild"? (I've only used it once or twice and only in an academic research setting)

domenic commented 6 years ago

Sorry, I thought I made it clear: we don't believe we "need" to support such alternate security models on the web. Sites that are running untrusted code in the same realm cannot be "broken" by adding new features to the web, because they are already fatally insecure.

bzbarsky commented 6 years ago

I believe that last statement is just false, fwiw...

domenic commented 6 years ago

Can you elaborate? Does Mozilla have a different threat model in the post-Spectre world?

benjamingr commented 6 years ago

@domenic

Sites that are running untrusted code in the same realm cannot be "broken" by adding new features to the web, because they are already fatally insecure.

I think there is a misunderstanding on my part - I was under the impression that code running in Caja for example is proven to be secure (as in mathematically, by doing induction over formulas, like @erights does here).

I thought this was a large motivation for adding strict mode in the first place.

That is, I thought the whole point of tools like Caja was that you can run untrusted code in the same realm and prove that it is secure. One of the things that are done in order to do this is override some global properties.

All this might be wrong and if it is I apologize :) I would potentially be interested in using this API in Node.js for our internal code (with automatic rewriting) anyway, as we don't have a Caja (we run insecure code in isolates).

benjamingr commented 6 years ago

Can you elaborate? Does Mozilla have a different threat model in the post-Spectre world?

That's a fair point, since you can now observe things that were not previously considered observable. Tools like Caja do escape analysis, if I understand them correctly (you can't escape your sandbox), but being able to observe things like branch-prediction faults means you are potentially exposed to extra information through the APIs you have access to.

Maybe @FUDCo @erights or @bzbarsky can elaborate on what work has been done on Caja since Spectre, whether formal analysis was performed to address (or identify) "problematic" APIs, and whether they can be sure that pages that meet certain requirements for the escape analysis can also mitigate things like Spectre.

bzbarsky commented 6 years ago

@domenic Ah, good point about spectre. With my limited understanding of Caja, with spectre, the untrusted code may have read access to things via spectre but:

1) It may still not be able to exfiltrate the information it gets, depending on what APIs are exposed to it. In which case the read access does not matter.
2) It does not get arbitrary modification capabilities.

The proposal here would give it both.

bzbarsky commented 6 years ago

Or to put this another way, security analysis of spectre is premised on the fact that read means remote read because exfiltration is trivial. Which it is in the context of websites whose code you don't control, but not necessarily in a sandboxed environment.

domenic commented 6 years ago

Does Mozilla have APIs that make it possible to sandbox all JavaScript code running in your page's current process?

bzbarsky commented 6 years ago

Mozilla has no code in this space. I'm just talking about Caja.

domenic commented 6 years ago

Sure, but it would require some browser-level API that allows sandboxing all JavaScript code running in your page's current process for Caja to achieve either of the "but"s listed in https://github.com/domenic/get-originals/issues/12#issuecomment-395070109. If Firefox doesn't have such an API, it seems the threat model I've been saying that we hold for Chrome also holds in Firefox... Did I miss something?

bzbarsky commented 6 years ago

I'm not sure I follow. I believe (and maybe I'm wrong?) Caja is implemented by removing access to all browser built-ins by default (e.g. removing everything off all default prototypes, removing all constructors from the global, etc) and then passing the needed ones selectively to specific pieces of code as needed. If that's correct, why does it need browser engine support to prevent exfiltration?

bzbarsky commented 6 years ago

(I believe Caja also removes access to the global by default; this is why in strict mode JS you can't get access to the global via the (function() { return this; })() trick.)
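A short sketch of the strict-mode behaviour mentioned above. The sloppy case is built with the `Function` constructor, whose body is non-strict unless it opts in, so the contrast holds even if the surrounding file is itself strict.

```javascript
// Sloppy-mode code: an unqualified call gets the global object as `this`.
const sloppyThis = Function("return this")();

// Strict-mode code: the same trick yields undefined instead of the global.
const strictThis = (function () {
  "use strict";
  return this;
})();

const sloppyLeaksGlobal = sloppyThis === globalThis;
const strictLeaksGlobal = strictThis === globalThis;
```

Note that the sloppy case above itself depends on reaching `Function`, which is why a sandbox that relies on strict mode also has to censor that constructor.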

erights commented 6 years ago

An Update on Frozen Realms in light of Meltdown and Spectre given at March 2018 tc39 meeting.

Explains that Meltdown and Spectre are attacks only on confidentiality, not integrity. Explains that the Caja, SES, Realms, Frozen Realms work etc, as well as historical work on E and Joe-E, never claimed to prevent side channels to those that can measure the duration of time. We do provide real security on the dimensions of integrity, and on denying dynamic non-determinism to libraries that do not need it.

Stopping Exfiltration given at May 2018 tc39 meeting

Further explains that the alleged "web security model" is not one that any major website (including Google) can or does practice. No one has the resources to trust even their own code, much less third party libraries. Cites real attacks that result, which the alleged "web security model" could not have helped with. Explains that a very large class of libraries, those doing transformational jobs, can be run in a deterministic manner, and therefore reduce the burden of vetting library code.

Extremely Modular Distributed JavaScript given at July 2017 tc39 meeting

Explains that the "security" vs "software engineering" distinction is largely a false dichotomy. Security, done well, is an extreme form of modularity. Both seek to enable the benefits of composition while minimizing the risks from destructive interference. If you engineer a system to be robust against intentional interference (i.e., attacks), you are also more likely to have built a system robust against accidental interference (i.e., bugs).

Verify What? Navigating the Attack Surface given at the workshop "Formal Methods meets JavaScript", Imperial College March 2018

Explains how defense in depth, done well, can bring about a multiplicative reduction in the attack surface as a measure of overall expected risk. We are not faced with a choice between process separation vs object separation as security models. Rather, we need both. Process separation can enable defense along confidentiality and availability lines in addition to integrity. Object separation does not help on availability at all, and provides only limited help of confidentiality (see Stopping Exfiltration above). However, it adds strong defense in depth on integrity.

domenic commented 6 years ago

@bzbarsky Unless Caja is running on everything in the process, other untrusted code in the process can read and exfiltrate the data held by supposedly "sandboxed" code in the same process.

bzbarsky commented 6 years ago

If you have "unsandboxed" untrusted code you lose, sure. But who's proposing doing that? The premise here must be that browsers do their multiprocess thing as needed so that you know if there's any untrusted code in the process then you have chosen to run it yourself (this is a basic premise of site isolation, obviously, because without that there is all sorts of untrusted code you don't control in the process). If you then make sure you run untrusted code only in sandboxes a la Caja, can you explain where the problem is?

I'm also not sure why the comment @erights made was marked off-topic, since it's a direct answer to some of the questions asked in this thread about security models....

domenic commented 6 years ago

I see no way to sandbox all code in a process right now, which is why I was asking if there were browser APIs in Mozilla's security model that would give access to that ability.

bzbarsky commented 6 years ago

I'm really not sure what you're asking.

If such an API existed (and it does not), how would it get used in practice?

domenic commented 6 years ago

I think we're getting off in the weeds now, exploring the use of a second hypothetical API to implement a specific library's security model in the face of the original hypothetical API being shipped. Personally speaking, I'd like to take a step back here and disengage a bit.

If this issue becomes the last blocker for Mozilla in implementing the get-originals API, I'm happy to put our security folks in touch so they can explain the thinking in more detail. In the meantime, the proposal needs a lot more work in other areas you and others have identified, and I'll focus on those. Sorry if this is disappointing for anyone.

benjamingr commented 6 years ago

Thanks for the discussion, everyone. I've learned a lot.

@erights would be great to put those updates in a place interested parties would have an easier time following them :) I’m definitely going to enjoy this.

bzbarsky commented 6 years ago

I don't see why a second hypothetical API is needed, is my point.

I think the real question here is this: are there currently pages using Caja that would be secure in a process-isolation world with the existing set of web APIs but would become insecure in a world in which the getOriginals API is added? This is not actually clear to me, given that Caja attempts to censor the global and the global is where the new APIs live. But this is the fundamental question that needs to be answered.

Put another way, if there are currently sites that are secure (assuming process-isolation) but we add an API that makes them insecure, that seems bad to me. That's something to avoid in API design. I am not making the claim that this specific API falls into this bucket; I am trying to understand whether it does or not. @erights, can you comment on that please?

bzbarsky commented 6 years ago

And to be clear, I think the assumption that your security folks are infallible and just need to explain how the world works to everyone else is somewhat questionable. I also think it's worth understanding the constraints on a solution before designing it, which is why I want to pin down whether we have a problem here or not. But obviously I'm not going to tell you how you should spend your time and I don't have a problem with proceeding on the assumption that there isn't a problem for the time being, as long as we don't then use the effort spent as justification for pressing on if there is a problem...

benjamingr commented 6 years ago

which is why I want to pin down whether we have a problem here or not.

I think Domenic's point is that this proposal is still very young and early, and that there may be other blockers before this one needs addressing. It's worth investing time into seeing whether this proposal meets the goals it exists for before considering edge cases like the effect on Caja.

From a solutions point of view there are multiple ways to address the criticism here: the API could be blocked via the content security policy, for example, or placed under another limitation, or get-originals itself could be "locked down" in certain modes.

The use case I personally care about (as a platform doing a layered API in JS) isn't relevant to the one Caja is solving (where the JS is the opposite of the platform that needs security).

Now - from a proposal-author point of view - a bunch of people from the same clique (with a history) came to the proposal and started saying why it won't work. From a proposal author point of view that's not really a great way to interact. I do think Domenic could have been more charitable in this discussion but this sort of thing where a bunch of people gang up can really easily feel like an attack (even when it's not meant as one).

bzbarsky commented 6 years ago

@benjamingr Thank you, that is a good point. This was definitely not meant to be an attack, but a suggestion that the interactions be explored so we don't end up with surprises later. I'm sorry it felt like the former.

jswalden commented 6 years ago

from a proposal-author point of view - a bunch of people from the same clique (with a history) came to the proposal and started saying why it won't work

FWIW I don't think I would consider myself part of this clique, and to some degree I'm mildly dubious of the importance of "secure JS". For the pretty-esoteric use case of running untrusted code, I think source-rewriting a well-defined subset of the language, in concert with site-side code to censor things like Function and so on, is adequately effective. (This is roughly what Facebook did for this, way back in the day, and -- to my surprise -- I found myself concluding that the approach seemed actually workable if done correctly. Even if Facebook wasn't doing it correctly at the time.) It's more work for these sites, but they're already doing something a bit "out there". Forcing them to rewrite a little before running is IMO not that great a burden.

But as of ES5 making a whole bunch of builtins throw if passed null or undefined, it seems like we have chosen "allow embedding unrewritten source code safely" as a design goal. In light of that, I will at least get the criticisms that point of view takes into discussion here. Even if I don't seriously care one way or the other about those aims.
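A brief sketch of the ES5 behaviour referred to above: built-ins that need an object receiver throw a `TypeError` when handed `null` or `undefined`, instead of silently falling back to the global object. The `throwsTypeError` helper is just scaffolding for this example.

```javascript
// Helper for this sketch: reports whether calling fn raises a TypeError.
function throwsTypeError(fn) {
  try {
    fn();
    return false;
  } catch (e) {
    return e instanceof TypeError;
  }
}

// Each built-in is invoked with a missing receiver.
const sliceOnUndefined = throwsTypeError(() =>
  Array.prototype.slice.call(undefined)
);
const hasOwnOnNull = throwsTypeError(() =>
  Object.prototype.hasOwnProperty.call(null, "x")
);
const trimOnUndefined = throwsTypeError(() =>
  String.prototype.trim.call(undefined)
);
```

Because the receiver is never coerced to the global, code that extracts a built-in and calls it unqualified cannot use it as a route to the global object.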

domenic commented 6 years ago

Hey all, I wanted to apologize for my terseness in this discussion. As Benjamin alludes to, there is a long history of this sort of thing being discussed in other venues and I wasn't enthusiastic about getting into it again here. But I apologize for letting that color my responses. And I want to reiterate that I look forward to working with folks to explore any concerns as this API gets further along in its development.

littledan commented 6 years ago

Does the Stage 2 Realms proposal give us a way through here? I believe a newly created realm will have only JS builtins, not web platform builtins. If this proposal is implemented, then a system like Caja could execute code in a new Realm and avoid seeing this new API (unless it decides to somehow polyfill it).

In the short term, I wonder if Caja security could be ensured by direct eval'ing the sandboxed code within an inner scope which shadows these properties of the global object. I have a vague recollection that Caja may use this technique for similar reasons in other cases, though I may have misunderstood.
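A rough sketch of the shadowing idea, under the assumption that the new names would be plain properties of the global object. The global defined here is a hypothetical placeholder, and `runShadowed` is an illustrative helper, not Caja's actual mechanism.

```javascript
// Hypothetical placeholder for the proposed global, for this sketch only.
globalThis.getOriginalConstructor = () => Function;

function runShadowed(source) {
  // The parameters shadow the global names for anything evaluated inside.
  return (function (getOriginalConstructor, callOriginalMethod) {
    // Direct eval: the evaluated code resolves names in this function scope.
    return eval(source);
  })(undefined, undefined);
}

// A bare reference resolves to the shadowing parameter.
const bareLookup = runShadowed("typeof getOriginalConstructor");

// A lookup through the global object is not affected by the shadowing.
const globalLookup = runShadowed("typeof globalThis.getOriginalConstructor");

// Clean up the placeholder global.
delete globalThis.getOriginalConstructor;
```

The second lookup shows the limit of the technique: shadowing only helps if the sandboxed code is also denied every path to the global object itself.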

mikesamuel commented 6 years ago

@domenic This not only affects proposals and practices that involve running untrusted code.

If trusted library code written by web security specialists can't mitigate or prevent access to sources of authority that are known to often be used sloppily, then less code written in good faith can be considered trustworthy.

Specifically, this would affect things like the trusted types polyfill which seeks to expand the set of modules which can be considered trustworthy w.r.t. XSS and some other vectors by limiting the damage when untrusted inputs reach trusted sinks.


When consulting on application security, I often have to identify chokepoints that payloads for a particular class of attack would have to pass through, and see if I can mitigate enough there to allow the bulk of unreviewed code to be considered irrelevant to that class of attack.

This proposal seems to enable end-runs around such chokepoints, making it harder to bring both static analysis and dynamic enforcement to bear to limit the downside when modules in good faith have bugs or mismatched security assumptions.