Clarify security story.

letitz commented 1 year ago

cc @arturjanc @arthursonzogni @syg @ajklein

Hi there! Chrome web platform security reviewer here.

We (Chrome web platform security folks) just took another look at this proposal and were concerned with the light treatment given to security concerns, as well as the explicit endorsement of ShadowRealms for untrusted code evaluation (such as plugins). Figma for example has backtracked and stopped using realms after discovering security issues with them.

We understand that the champions focus on "integrity" instead of "security", which is fine with us. We simply wish for security concerns to be made explicit, to avoid others falling into the trap of believing ShadowRealms are a meaningful isolation boundary for untrusted code on par with cross-origin iframes.

mhofman commented 1 year ago

Figma for example has backtracked and stopped using realms after discovering security issues with them.

To be clear, none of the issues were with the Realm API itself, but with the implementation of the shim back then.

Also all these issues were rooted in the leakiness of object graphs between realms, which by design is not possible with ShadowRealm.

meaningful isolation boundary for untrusted code on par with cross-origin iframes.

I believe this is also misleading. By themselves cross origin iframe don't provide any more isolation. The isolation comes from an implementation detail of most modern browsers to run cross origin iframes in separate processes. There is actually no API I'm aware of a developer can use to require the cross origin iframe to run in a separate process if they're worried about timing or memory corruption attacks ( I'm not sure if there is a good way to feature test the COOP/COEP header, but if there was, simply not defining them / opting out would probably help at least disable the high performance timers some of these attacks use).

leobalter commented 1 year ago

Thanks for the PR, @letitz! I'm reviewing the contents internally - including the comments in this PR - to get back soon with a response

letitz commented 1 year ago

To be clear, none of the issues were with the Realm API itself, but with the implementation of the shim back then.

Also all these issues were rooted in the leakiness of object graphs between realms, which by design is not possible with ShadowRealm.

Fair enough. That said, that shim and this spec are both vulnerable to side channel attacks such as Spectre. Figma plugins implemented using any version of Realms could likely read the contents of the mock they were attached to using Spectre attacks. Given a JS engine exploit, they would even have write access to those contents, as well as the user's figma cookies and more.

I believe this is also misleading. By themselves cross origin iframe don't provide any more isolation. The isolation comes from an implementation detail of most modern browsers to run cross origin iframes in separate processes.

While the HTML and TC39 specs do not speak of process allocation, they do speak of agent clusters (TC39, HTML) which are only defined to allow implementers the freedom to introduce process boundaries. There is a spec difference between ShadowRealms and cross-origin iframes in that the latter are placed in a different agent cluster. You are right that by itself and in a vaccum, this does not provide more isolation. In practice, however, this is an important distinction for security.

There is actually no API I'm aware of a developer can use to require the cross origin iframe to run in a separate process if they're worried about timing or memory corruption attacks

True, though there is a budding API in the shape of Origin-Agent-Cluster that goes in this direction.

( I'm not sure if there is a good way to feature test the COOP/COEP header, but if there was, simply not defining them / opting out would probably help at least disable the high performance timers some of these attacks use).

Unfortunately disabling the best clocks helps but does not fully fix the issue. Attacks simply get slower, they are not entirely prevented. For plugins, for example, that are intended to run alongside a document the user is actively editing for long-ish strectches of time, this might not be an obstacle.

mhofman commented 1 year ago

I'm using the standard definition of a security boundary here: a line between two parts of a system that have different trust and security requirements.

Other browsers may disagree, but because of Spectre Chrome has settled on the position that the only defensible security boundary against untrusted scripts is the process boundary. See more details here.

Given a JS engine exploit, they would even have write access to those contents

Security is simply too vague of a term. ShadowRealm like processes both provide an integrity boundary. If there is a bug in the environment creating this boundary, whether v8 or the OS, then of course code executing within the boundary may be able to trample it.

ShadowRealm definitely does not provide any availability protection, and it can block the whole JS Agent. This is unlike most OS which can preempt the processes.

Unfortunately disabling the best clocks helps but does not fully fix the issue. Attacks simply get slower, they are not entirely prevented. For plugins, for example, that are intended to run alongside a document the user is actively editing for long-ish strectches of time, this might not be an obstacle.

Finally we have confidentiality, which from what I understand is the sticking point and mainly being debated here. On its own, ShadowRealm does not provide any confidentiality boundary given the existence of time measurements powers. The side channel threat model doc linked above rules out attenuating clocks, however that is something the ShadowRealm API user is empowered to do (and why we have a requirement that all APIs available on in a ShadowRealm be configurable / deniable). Lowering the precision of clocks is not the only approach in these cases, and virtualizing them has worked successfully for some use cases (e.g. Cloudflare workers).

It's unfortunate that by default these timing powers remain in newly created ShadowRealm. In future proposals, we plan to introduce a way to "lockdown" the JS environment, which would amongst other modifications remove all clocks.

cross-origin iframes in that the latter are placed in a different agent cluster

True, though there is a budding API in the shape of Origin-Agent-Cluster that goes in this direction.

I have to admit I haven't followed the latest HTML spec developments, and hadn't realized cross-origin iframes where heading in the direction of having to run in different Agent Clusters. I seem to remember there were some observable synchronous interactions between cross origin documents, including I thought it used to be possible to pass SharedArrayBuffer between and a cross origin iframe and the parent document. I also don't quite remember the behavior of an active loop in a cross origin iframe, and if it'd block the parent run loop (aka does it provide an availability boundary).

mhofman commented 1 year ago

Also browsers are not the only JS hosts, and the same origin security model may not be applicable for those other embedders. By providing a building block allowing for better isolation at the JS layer, we can start building security models that are applicable to these other environments.

bakkot commented 1 year ago

Fair enough. That said, that shim and this spec are both vulnerable to side channel attacks such as Spectre. Figma plugins implemented using any version of Realms could likely read the contents of the mock they were attached to using Spectre attacks. Given a JS engine exploit, they would even have write access to those contents, as well as the user's figma cookies and more.

Figma's new implementation is exactly as vulnerable to side channel attacks, as I understand it - they're now executing JS by running it inside of quickjs compiled to wasm, which is still same-process and still provides access to timers.

To the extent that Figma is illustrative, I think it's an argument in favor of built-in ShadowRealms, not against. The approach Figma is now using is slower and no more secure than a built-in Realms mechanism, and their previous approach (the Realms shim) was more likely to be buggy than a built-in Realms mechanism. They would not have had a security incident at all (from their perspective) if built-in Realms were available.

bathos commented 1 year ago

They would not have had a security incident at all (from their perspective) if built-in Realms were available.

This had me a bit confused reading the first post in the thread. The “them” in “after discovering security issues with them” points at “ShadowRealms” in the prior sentence, but the issue which was linked to concerned a flaw in a JS-layer solution which ... wasn’t a ShadowRealm polyfill? Its model doesn’t even seem to resemble that of ShadowRealm? And they moved to a solution which is still same-process...?

I’ve got no stakes in how the ShadowRealm API is characterized on this front, but the example seems pretty misleading regardless of whether the must-not-be-considered-any-sort-of-security-boundary position is correct.

mhofman commented 1 year ago

The original Realms proposal provided a building block for creating an integrity boundary, but wasn't an integrity boundary by itself. Solutions providing an integrity boundary which leveraged Realms had to be extremely cautious and prevent user code running in the realm from getting a hold of objects in other realms. This is where the security issues mentioned above were.

Based on all the experience gathered with the Realms approach and the footguns it contained, including the experience from Figma mentioned above, the proposal morphed towards ShadowRealm and introduced a callable boundary. Unlike Realms, the callable boundary prevents any kind of object sharing between realms, and thanks to it, ShadowRealm does create an integrity boundary. With ShadowRealm, the incident Figma experienced would not have happened.

leobalter commented 1 year ago

@letitz @arturjanc @ArthurSonzogni @syg @ajklein can you please review the recent updates on #380 ~from 3306f0f~? Thanks!

caridy commented 1 year ago

@leobalter I'm going to create a new PR with the overlapping changes.

caridy commented 1 year ago

Closing this in favor of https://github.com/tc39/proposal-shadowrealm/pull/380, which is a more broader change in terms of the security story, incorporating these changes and a lot more.

tc39 / proposal-shadowrealm

Clarify security story. #378