Closed domenic closed 3 years ago
My thoughts are that this is a bit weak by itself due to the lack of cross-realm references. However it does make me wonder if something like a sync version of what Puppeteer/Playwright do with JSHandles + ability to structure clone a handle would work.
For example:
const realm = new Realm();
const arrayHandle = realm.evaluateHandle(`[1,2,3]`);
// Push an item into the array in the realm, non-handles are structured cloned
realm.evaluate(`array.push(value)`, { array: arrayHandle, value: 4 });
// We can also ask for a handle to be structured cloned back to us
const arrayClone = arrayHandle.cloneIntoThisRealm();
console.log(arrayClone); // [1,2,3,4]
For things like callback patterns, the ability to have handles works naturally. For example, suppose we wanted to expose something like setTimeout in the realm:
const realm = new Realm();
const realmSetTimeout = realm.createFunction(
function setTimeout(delayHandle, callbackHandle, ...callbackArgumentHandles) {
const delay = delayHandle.cloneIntoThisRealm();
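// Use the outer realm's global timer: inside this named function expression,
// a bare `setTimeout` would refer to the function itself. The callback comes
// first and the delay second.
globalThis.setTimeout(() => {
realm.evaluate(`callback(...args)`, {
callback: callbackHandle,
args: callbackArgumentHandles,
});
}, delay);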
}
);
realm.globalThisHandle.setPropertyDescriptor('setTimeout', {
value: realmSetTimeout,
enumerable: true,
configurable: true,
writable: true,
});
A rough API would be something like this:
type StructuredClonable = ...;
class RealmHandle {
// Clone the value from the realm into this Realm using structured clone
cloneIntoThisRealm(): StructuredClonable;
// Object meta operations, same as Reflect.* except
// accept RealmHandles and return RealmHandles
apply(
target: RealmHandle,
thisArgument: RealmHandle | StructuredClonable,
argumentsList: RealmHandle | Array<RealmHandle | StructuredClonable> | StructuredClonable,
): RealmHandle;
construct(...) ...
...
}
class Realm {
// A RealmHandle for the global object
get globalThisHandle(): RealmHandle;
// Evaluate and clone the return value, this is basically a shortcut for
// .evaluateHandle().cloneIntoThisRealm();
evaluate(
code: string,
scope: Record<string, RealmHandle | StructuredClonable>,
): StructuredClonable;
evaluateHandle(
code: string,
scope: Record<string, RealmHandle | StructuredClonable>,
): RealmHandle;
// Creates a function in the Realm that calls the given function with RealmHandles
// for all arguments passed to it
createFunction(
func: (...args: any[]) => StructuredClonable,
scope: Record<string, RealmHandle | StructuredClonable>,
): RealmHandle;
}
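A toy, single-realm sketch of the handle idea above may make the data flow easier to follow. It is not the proposed API: `ToyRealm` and `ToyHandle` are hypothetical names, everything runs in one realm, `new Function` stands in for evaluation inside the realm, and `structuredClone` is assumed to be available as a global (as in recent browsers and Node.js).

```javascript
// Hypothetical stand-ins; nothing here creates a real realm.
class ToyHandle {
  #value;
  constructor(value) { this.#value = value; }
  // Copy the referenced value out by structured clone.
  cloneIntoThisRealm() { return structuredClone(this.#value); }
  // Used only by ToyRealm to read the live value behind a handle.
  static unwrap(handle) { return handle.#value; }
}

class ToyRealm {
  // Evaluate `code` with the `scope` names bound; return a handle to the result.
  evaluateHandle(code, scope = {}) {
    const names = Object.keys(scope);
    const values = names.map((name) => {
      const v = scope[name];
      // Handles pass the live object through; everything else is cloned.
      return v instanceof ToyHandle ? ToyHandle.unwrap(v) : structuredClone(v);
    });
    const fn = new Function(...names, `return (${code});`);
    return new ToyHandle(fn(...values));
  }
  // Shortcut: evaluate, then clone the result back.
  evaluate(code, scope = {}) {
    return this.evaluateHandle(code, scope).cloneIntoThisRealm();
  }
}

const realm = new ToyRealm();
const arrayHandle = realm.evaluateHandle(`[1, 2, 3]`);
realm.evaluate(`array.push(value)`, { array: arrayHandle, value: 4 });
const arrayClone = arrayHandle.cloneIntoThisRealm();
console.log(arrayClone); // the array 1, 2, 3, 4
```

The point of the sketch is that only handles refer to live objects; every other value entering or leaving is a copy, so mutating `arrayClone` does not affect the array behind `arrayHandle`.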
@domenic I want us to meet at a middle-ground for sure. I asked my teammates to take a look.
In particular, it is not able to create reference cycles between cross-realm objects. Because all communication is via cloned messages, there's no way to communicate to the garbage collector that an object in the outer realm depends on an object in the inner realm, and vice versa, so that the cycle can take part in liveness detection.
This is one of the concerns that requires specific review, I appreciate that you're weighing the options here.
One of the problems is that we don't have structured cloning in the language. I have a proposal for that https://github.com/Jack-Works/proposal-serializer if you're interested.
@domenic thanks for putting in the time to do this write-up. We have been debating this with @syg for a few weeks; in fact, we did some prototyping around it to measure perf (I believe that's not a deal breaker, which is good news). Also, we did some homework to see if some of the existing membranes will work with this proposal, and here is where things become complicated (as @leobalter mentioned above).
One thing that occurs to me is that maybe we can provide other internal mechanism to help overcome this issue. I honestly don't even know if this is possible, but here is an idea:
realm.eval(`globalThis.foo = { x: 1 }`);
realm.eval(`Array.prototype`) === realm.eval(`Array.prototype`); // to yield `true`
realm.eval(`foo`) === realm.eval(`foo`); // to yield `true`
Basically, what I'm asking is if the UA can do some ref-tracking across this boundary, so the incubator realm's ref (in this case empty plain object based on your example) can continue to be linked to the corresponding ref from the realm, and release that memory when the realm doesn't need access to that object anymore.
This is clearly quite different from the already well-established structured cloning algo, but a variation of it that reuses some references when possible. Since both realms are in the same process, it might be possible. Similarly, we will have to define how that works for Realm.set/call/etc. But the bottom line is that such a mechanism will eliminate the need for any user-land bookkeeping for references that need to be tracked across realms, clearing the way for a membrane to support any kind of virtualization.
I would be interested in seeing the explainer expanded to cover use cases where cross-realm cycles are important. I don't think "making existing membranes work" is a use case. I'd be interested to see things on the same level as the current explainer's "DOM virtualization" or "test frameworks". From what I can tell, no use case mentioned so far requires such cycles.
I'm writing an expansion of the sandbox use case and some thoughts on a possible workaround to overcome @caridy's concerns. I'll post it back here when I have a proper review.
We have the TC39 plenary and a TAG Review meeting next week, they are too close for us to bring any good conclusion of potential next steps. I can say we are trying to understand better this new proposal and identify what might not work and how it would work for us. Our goal is to find an agreement point.
Meanwhile, I'm still going to expand the use cases in the explainer.
@domenic can you expand a little bit about whether or not the realm will be running in the same process / same agent? or if it must run in a separate process? Or is it that you have intentionally left that part out of the proposal to let the UA to decide about that? I'm asking because that might be another differentiation aspect between the two of them. Can you clarify?
This proposal has them running in the same process, so that the communication is synchronous. I would of course prefer them to run in separate processes, and for the communication to be asynchronous, per #238, but such a modification does break some of the use cases mentioned in the explainer. This proposal is meant to cover all of the explainer's use cases and as such is synchronous/same process.
Basically, what I'm asking is if the UA can do some ref-tracking across this boundary, so the incubator realm's ref (in this case empty plain object based on your example) can continue to be linked to the corresponding ref from the realm, and release that memory when the realm doesn't need access to that object anymore.
I'm reasonably certain we can reuse the Membrane concept even with IsolatedRealms. Whenever an object crosses the boundary, instead pass a UUID (or anything unique, really). This will require wrapping the IsolatedRealm's methods calling into the realm, and the callParent calling outside the child.
For instance, say we had an object stored in the foo variable inside the realm:
realm.eval(`foo`) === realm.eval(`foo`);
Here, eval would return a Proxy, and running it twice would have to return the same Proxy instance. So, we'll store a Map<uuid, WeakRef<Proxy>>. To get around the lifetime issues, we'll need the child realm to track foo with a FinalizationRegistry, and when foo is finally reclaimed, we can just tell the parent realm to delete the uuid.
Using a Membrane, we'd additionally be able to express a cycle without any issues. If foo.bar === foo, then performing const yFoo = realm.eval('foo'); yFoo.bar === yFoo will be true, too.
To ensure the child's foo isn't reclaimed while the parent's yFoo is still alive, the child realm will need to store a Map<Object, uuid> to strongly hold onto foo. And, the parent realm will need another FinalizationRegistry to notify it when yFoo is reclaimed. If so, tell the child to remove its key for foo, and we clean up the memory cycle.
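The bookkeeping described above can be sketched roughly as follows. This is a single-realm simulation under stated assumptions, not an implementation against any Realm API: `childExport`, `childGet`, `childRelease` and `parentImport` are hypothetical names, plain counters stand in for UUIDs, and only property reads are proxied. It covers the parent-side FinalizationRegistry and the child's strong Map; the child-side registry is omitted.

```javascript
// --- "child" side: strongly maps exported objects to ids and back ---
const childIds = new Map();     // object -> id (keeps the object alive)
const childObjects = new Map(); // id -> object
let nextId = 1;

function childExport(obj) {
  if (!childIds.has(obj)) {
    childIds.set(obj, nextId);
    childObjects.set(nextId, obj);
    nextId += 1;
  }
  return childIds.get(obj);
}

// Replies are small messages carrying either a primitive or an id.
function childGet(id, key) {
  const value = childObjects.get(id)[key];
  const isObject =
    (typeof value === "object" && value !== null) || typeof value === "function";
  return isObject ? { id: childExport(value) } : { primitive: value };
}

function childRelease(id) {
  const obj = childObjects.get(id);
  childObjects.delete(id);
  childIds.delete(obj);
}

// --- "parent" side: one Proxy per id, held weakly ---
const parentProxies = new Map(); // id -> WeakRef<Proxy>
const registry = new FinalizationRegistry((id) => {
  // Release only if no newer live proxy has been created for this id.
  const ref = parentProxies.get(id);
  if (ref && ref.deref() === undefined) {
    parentProxies.delete(id);
    childRelease(id);
  }
});

function parentImport(id) {
  const existing = parentProxies.get(id)?.deref();
  if (existing) return existing;
  const proxy = new Proxy({}, {
    get(_target, key) {
      const result = childGet(id, key);
      return "id" in result ? parentImport(result.id) : result.primitive;
    },
  });
  parentProxies.set(id, new WeakRef(proxy));
  registry.register(proxy, id);
  return proxy;
}

const foo = { x: 1 };
foo.bar = foo; // a cycle inside the child
const yFoo = parentImport(childExport(foo));
console.log(yFoo === parentImport(childExport(foo))); // true
console.log(yFoo.bar === yFoo); // true
console.log(yFoo.x); // 1
```

Identity is preserved because every lookup for the same id returns the same live Proxy, which is also what makes the `yFoo.bar === yFoo` cycle hold.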
Using a Membrane, we'd additionally be able to express a cycle without any issues. If foo.bar === foo, then performing const yFoo = realm.eval('foo'); yFoo.bar === yFoo will be true, too.
Unfortunately, cross-system cycles using UUIDs can leak memory, as this issue on the WeakRef proposal demonstrates. Returning an object (or a symbol, if they're allowed as weakmap keys) would be better, as the engine itself can track cycles properly.
EDIT: Actually, it's unclear whether the proxies returned by eval suffer from the same problem as the linked thread; this would need some investigation.
WeakMaps enable cross-membrane cycles to be collected. WeakRefs do not. That was one of the first motivations for splitting the concepts.
Why not both? It looks like with the proposal accepted the kind of realms proposed in this issue could be implemented in userland. So maybe just do it and make programmers widely aware of its advantages. If deemed desirable, they could be included in the language as well as the full version, e.g. as a subclass.
I had a long conversation with @syg about the features needed for membranes to function. It seems to me and @syg that there are other things that we can do to solve it, and we can consider them orthogonal and complementary to what it is being proposed here. That, I believe, it is a good thing. We will try to put together some material to explore those other options as a separate proposal. For now, my focus is to try to understand all the details of what is being proposed, and the implications of such in the context of the current realm proposal.
@domenic, I finally got a chance to look at this in detail, and I can say that I'm on board. This, IMO, is a good compromise. I will continue discussing it with other folks; for now, I have a few notes:
- eval and handler are good in principle, but I would like to explore other names to avoid confusion. I will put some time into this next week.
- In the original proposal, error propagation was supposed to happen at the agent level (root realm with DOM semantics). What will be the proper way for the owner of the realm to deal with errors, including synchronous errors (when calling eval) and errors triggered on the next turn? Similarly, what can the realm do if the handler throws?
- […] __proto__ is powerless, and if we can mark them accordingly, we might be able to share them between the two realms. I'm thinking of something like Object.createSharableIdentity(), which produces something equivalent to Object.freeze(Object.create(null)), and marks it with an internal slot that can be used by the structured cloning algo to simply share those objects between realms just like any other primitive value, allowing live references to be kept between realms in a weakmap. Something like that might be enough to solve all virtualization cases that we have discussed. Clearly, this can be a separate proposal, which is orthogonal and complementary to this.
For the record, we had a sync about this sharableIdentity API at the SES meeting on Wednesday, and it got generally positive feedback.
I'm looking forward for @domenic's feedback and hopefully we can set a working path forward.
Hi @leobalter, the only email address I have for you no longer works. Assuming you have one for me, please send me yours as well as the recording of the last SES session. Thanks.
Could Symbols as WeakMap keys provide this shareable identity concept?
Assuming they can pass through and be returned by realm.eval(), they should be of identical power, as one could always simulate Object.createSharableIdentity() with them or vice versa.
The main advantage I see for Object.createSharableIdentity() is that it would be a little easier to discriminate them from other symbols that pass through .eval().
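As a rough illustration of how symbols could play the role of sharable identities, here is a minimal single-realm sketch. `makeSide`, `tokenFor`, `resolve` and `annotate` are hypothetical names. The sketch uses a strong `Map` keyed by symbol, which leaks; with symbols permitted as WeakMap keys, that table could presumably be a `WeakMap` instead.

```javascript
// Hypothetical helper: one side of the boundary with its own private tables.
function makeSide() {
  const bySymbol = new Map();     // symbol -> local object (strong; a sketch only)
  const byObject = new WeakMap(); // local object -> symbol
  return {
    // Return the token (a symbol) that stands for a local object.
    tokenFor(obj) {
      let token = byObject.get(obj);
      if (token === undefined) {
        token = Symbol("sharable identity");
        byObject.set(obj, token);
        bySymbol.set(token, obj);
      }
      return token;
    },
    // Look up whatever this side has associated with a token.
    resolve(token) {
      return bySymbol.get(token);
    },
    // Attach local data to a token minted on the other side.
    annotate(token, data) {
      bySymbol.set(token, data);
    },
  };
}

const inner = makeSide();
const outer = makeSide();
const foo = { x: 1 };

const token = inner.tokenFor(foo); // only `token` would cross the boundary
outer.annotate(token, { note: "outer-side shadow of foo" });

console.log(inner.tokenFor(foo) === token); // true
console.log(inner.resolve(token) === foo);  // true
console.log(outer.resolve(token).note);     // "outer-side shadow of foo"
```

Neither side ever sees the other's objects; both only agree on the symbol, which is the property a dedicated Object.createSharableIdentity() value would also provide.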
I'm looking forward for @domenic's feedback and hopefully we can set a working path forward.
I'm glad to hear that this proposal is being taken seriously, and could work for at least some of the champions! As I said, it could work for me and the constituencies I represent. (Major issues remain, such as #261, but it at least is a large step in the right direction.)
I'll caution that @syg and I are still working to understand whether Chrome security is comfortable shipping sync/in-process realms in any form. (It turns out, they do not agree with my statement "This encapsulated-by-default proposal would bring realms onto the same footing as other encapsulation proposals such as trusted types or private fields, and thus make it more congruent with web platform goals.") This may be ameliorated with a name change, e.g. InsecureRealm or SideChannelAttackableRealm, but it's too soon to make any concrete statements. We're continuing to push for resolution internally.
Regarding shared identity things of the sort you discuss, I think exploring such things makes sense as a follow-on. @syg would be the best point of contact there, especially given his work on disjoint object graphs for cross-realm concurrency, which seems pretty related.
in the original proposal, error propagation was supposed to happen at the agent level (root realm with DOM semantics), what will be the proper way for the owner of the realm to deal with errors? including synchronous error (when calling eval), or errors triggered on the next turn? similarly, what can the realm do if the handler throws?
I don't have any concrete ideas or opinions here, but I'll point out that Error objects and subclasses are structured-cloneable, so I think most any semantics could work, as long as the error values get structured cloned on the way in or out of the realms.
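A small sketch of what cloning errors at the boundary could look like, assuming a global `structuredClone` (recent browsers and Node.js). `callAcrossBoundary` is a hypothetical helper that only simulates the boundary within a single realm.

```javascript
// Hypothetical helper: run `fn` "inside" and clone whatever comes out,
// whether it is returned or thrown.
function callAcrossBoundary(fn, ...args) {
  let result;
  try {
    result = fn(...args);
  } catch (err) {
    // The caller receives a copy, not the inner error object itself.
    throw structuredClone(err);
  }
  return structuredClone(result);
}

const original = new RangeError("index out of bounds");
let received;
try {
  callAcrossBoundary(() => { throw original; });
} catch (err) {
  received = err;
}

console.log(received !== original);          // true
console.log(received instanceof Error);      // true
console.log(received.name, received.message);
```

Custom properties added to an error are not guaranteed to survive cloning, so a protocol built on this would want to rely only on `name` and `message`.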
A few more notes:
- […] Object.createSharableIdentity), and I plan to work with @littledan on this topic, considering that he is the champion of the existing proposal to support this.
@annevk how does this alternative proposal look to you? I believe the existing Realms proposal has some ongoing issues that might be mitigated by @domenic's proposal.
If it goes down a positive path, including for @caridy's additional work, I'd quickly start updating the explainer and spec text here and preparing a new one for createSharableIdentity.
I see that investing in the symbols as weakmap keys would be a better solution and investment for this. I'll get this sync'ed with @littledan and @caridy.
It removes all the cross-realm concerns I had. I think saying it's comparable to Trusted Types is overselling it since that would ensure that the code that is run is actually vetted, which is not the case here. It's not a security mechanism, it's a way to run code encapsulated from global state, and if you don't trust the code you're still putting yourself at risk.
It's not a security mechanism
I see this slogan causing a lot of confusion. We need better distinctions. See https://agoric.com/taxonomy-of-security-issues/
Realms are an integrity mechanism. They are certainly not a confidentiality or availability mechanism.
if you don't trust the code you're still putting yourself at risk
I don't trust the code I wrote yesterday. We manage and mitigate risks. We need to do better at that. We never eliminate risks.
It removes all the cross-realm concerns I had. I think saying it's comparable to Trusted Types is overselling it since that would ensure that the code that is run is actually vetted, which is not the case here. It's not a security mechanism, it's a way to run code encapsulated from global state, and if you don't trust the code you're still putting yourself at risk.
Realms are an integrity mechanism. They are certainly not a confidentiality or availability mechanism.
I would point out that Realms, by themselves, are not a confidentiality mechanism. But they are still a practically† necessary part of any JS confidentiality mechanism. I believe, but @erights would need to confirm, that the SES proposal is what is necessary for confidentiality.
Availability is covered by neither this Realms proposal nor the SES proposal, but rather would need to be covered by an Agent proposal (which has often been mentioned in passing in the various realms/compartments/ses proposals, but nothing concrete has been proposed thus far). Most hosts already give a way to create agents (e.g. Worker) so this is presumably lower priority, as one can just use SES realms inside of a host-provided agent.
† Strictly speaking SES could be implemented purely with the root realm and compartments, but lockdown permanently changes the realm in ways incompatible with a lot of existing code (for example frozen intrinsics, removal of all stateful/non-deterministic APIs such as Date.now() or Math.random(), etc).
Hi @Jamesernator thanks for the comments!
I would point out that Realms, by themselves, are not a confidentiality mechanism. But they are still a practically† necessary part of any JS confidentiality mechanism. I believe, but @erights would need to confirm, that the SES proposal is what is necessary for confidentiality.
Yes, the full SES proposal enables the defense of confidentiality explained in the security taxonomy document. Realms help create a multi-realm world in which the SES defense of confidentiality has a place, but your footnote is also correct.
† Strictly speaking SES could be implemented purely with the root realm and compartments, but lockdown permanently changes the realm
And thus SES by itself does not enable safe co-existence with code not running under SES. SES+Realms enables both. It turns out this mixed use case has become important for the MetaMask LavaMoat project. Attn @danfinlay @kumavis
in ways incompatible with a lot of existing code (for example frozen intrinsics, removal of all stateful/non-deterministic APIs such as Date.now() or Math.random(), etc).
Experience at Google, Salesforce, MetaMask, Node, and Agoric shows that SES is also compatible with a huge amount of existing code.
Date and Math in the SES start compartment after lockdown are the fully powered ones with Date.now() and Math.random(). The shared Date and Math objects given to all other realms by default are, by contrast, powerless. However, code in the start compartment can explicitly endow its powerful ones forward if it chooses by
const c = new Compartment({ Math, Date });
This is consistent with the start compartment being where the virtual (or actual) host objects reside, the objects that provide I/O access to the outside world. Code running in the start compartment has these powers. Code running in created compartments by default does not. But code with powers can create compartments endowed with these powers or powers derived from these.
This puts the initial SES code in control of the tradeoff between how much I/O or mutation power to risk on code in other compartments vs the compat cost of not allowing it.
Availability is covered by neither this Realms proposal nor the SES proposal, but rather would need to be covered by an Agent proposal (which has often been mentioned in passing in the various realms/compartments/ses proposals, but nothing concrete has been proposed thus far). Most hosts already give a way to create agents (e.g. Worker) so this is presumably lower priority, as one can just use SES realms inside of a host-provided agent.
Exactly. Well put. I keep hoping to write down a concrete first Agent proposal. A lot of what's needed was bundled in https://web.archive.org/web/20120725121957/http://wiki.ecmascript.org/doku.php?id=strawman:concurrency already in 2012, much of which became standard promises. Agents would complete the "Vat" part of that proposal. But, of course, a lot has happened since then ;)
It is also worth noting that the Moddable, TC53, and Agoric use cases are for mutually suspicious code coexisting safely within a single Realm. Because there is only one Realm and it is a SES Realm, the Realm API is not directly relevant. It is only relevant for SES for the larger ecosystem of uses like the LavaMoat one MetaMask is building.
I like the idea of a synchronous interface which is restricted to passing primitives across Realm boundaries. Unlike structured clone, if we permit all primitives, then we have a mechanism for sharing identities (through Symbols as WeakMap keys) which is critical to solving the cycle problem (inherent whenever you have code running across boundaries, e.g., see the Wasm GC proposal for another case where this need comes up).
A minimal API for this kind of Realm would be a two-way, synchronous communication channel. This basically means that each side should have a function that they can call, passing in a primitive, that causes a callback to be called, synchronously, on the other side.
Here's an idea for how to set up such a channel, with a method Realm.prototype.connect, to create a Realm where you can pass in a Number, and it will call back with that number + 1:
let n, send;
let r = new Realm;
// Store the "send to the realm" callback in `send` (the original had the assignment reversed).
await r.connect(s => { send = s; return val => { n = val; }; },
module { export default s => n => s(n+1); });
send(4);
console.log(n); // 5
The first argument to connect, as well as the default-exported function from the module specified by the second argument, are expected to be functions which take a callback as a parameter (where this callback sends a message to the other side) and return a callback which is called when the other side calls them. The outer function on both sides is called to set the system up (returning a promise, so we have time to load the module specifier which is the second parameter within the Realm), and then the inner function is called later, synchronously, as needed.
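The handshake described above can be simulated within a single realm to show how the callbacks get wired together. `toyConnect` is a hypothetical, synchronous stand-in (no module loading, no real Realm); the primitive-only guard reflects the restriction discussed in this thread.

```javascript
// Hypothetical, synchronous stand-in for Realm.prototype.connect.
function toyConnect(outerSetup, innerSetup) {
  let outerReceive;
  let innerReceive;

  // Wrap a receiver lookup so that only primitives may cross the "boundary".
  const guard = (getReceiver) => (value) => {
    const isPrimitive =
      value === null || (typeof value !== "object" && typeof value !== "function");
    if (!isPrimitive) throw new TypeError("only primitives may cross the boundary");
    getReceiver()(value);
  };

  // Each setup function gets a "send to the other side" callback ...
  const sendToInner = guard(() => innerReceive);
  const sendToOuter = guard(() => outerReceive);

  // ... and returns the callback to run when the other side sends.
  outerReceive = outerSetup(sendToInner);
  innerReceive = innerSetup(sendToOuter);
}

let n, send;
toyConnect(
  (s) => { send = s; return (val) => { n = val; }; }, // outer side
  (s) => (val) => s(val + 1),                         // inner side
);

send(4);
console.log(n); // 5
```

Calling `send(4)` runs the inner callback synchronously, which in turn calls back into the outer side with 5 before `send` returns.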
I believe that this would be enough to implement membrane frameworks, to "locally" virtualize objects in different Realms. I suspect such Proxy-based virtualization is the most ergonomic way to make use of this sense of Realms, and it would follow ecosystem experience using Proxies in popular projects like Immer.
About ergonomics: I don't see this restricted version of Realms really being usable without a framework being built around it. I think that's OK, as the use cases I know of do correspond to infrequently-defined, frequently used boundaries. The important things, then, are safety, correctness, and efficiency, which I think this sort of API meets. This means that the audience for good ergonomics are the people who write these frameworks, so simplicity and clarity may be more important than including more complicated conveniences around using properties of global objects.
To confirm, the Symbol wrappers would not be shared, just the underlying immutable symbol primitive? In particular it's important this wouldn't allow each side to access the others' Symbol.prototype.
I'll emphasize I don't have opinions on the channel API, just on what goes across the channels. Yours seems nice and small and elegant compared to mine, although it takes a dependency on the yet-unresolved question of whether realms can have separate module maps, i.e. #261, and it encourages the IMO-bad module-that's-a-function-wrapper pattern discussed in https://github.com/tc39/proposal-js-module-blocks/issues/21. Using a string of code might be a good way to sidestep those issues.
@domenic Right, just Symbol primitives would be permitted, not wrappers (which are not primitives). And yes, I'm assuming that Realms have their own module map (which is necessary if Realms can use ES modules at all, otherwise it'd be a way to communicate objects across Realms).
My example used module blocks, but this was just for convenience. It would be possible to write it in a separate file in exactly the same way. (TC39 accepted Google's proposal to move module blocks to Stage 2, so I thought it would be a reasonable thing to build off of in this post; I can't really picture any changes to module blocks that would make it unsuitable for a case like this.)
I'd like to avoid APIs which are based on strings of code, as they are quite unfriendly to Web features like CSP. The API I described above would permit a simple framework around Realm.prototype.connect to implement string-based eval if anyone really wants it.
I'll emphasize I don't have opinions on the channel API, just on what goes across the channels.
@domenic do you have any opinion on restricting the proposal to only allow primitive values rather than something more complicated like a structured-cloning algo? This seems like a good way to simplify things while still allowing serialization in user-land (with their own protocol on top of primitive values), while keeping the door open for more complex structures like records and tuples in the future.
No opinion. We may want to expand structured clone to support symbols, since it's pretty weird that this would be the first way of cloning them across realms, and that would give a nice property that this synthetic-realm-boundary-cloning is a subset of web-platform-realm-boundary-cloning (i.e. structured cloning).
I'd be all for expanding structured clone to Symbols if implementers are up for it (and I have some ideas about how that could work), but I don't understand the connection: the idea here is to simply check if the value is a primitive; no cloning is needed.
This comment describes a possible solution for the API of the Realm to work with the new isolation model described in this issue.
declare class Realm {
constructor();
eval(sourceText: string): any;
Function(...args: string[]): Function;
AsyncFunction(...args: string[]): AsyncFunction;
import(specifier: string): Promise<???>;
}
A bridge function is just a function that is created via Realm.prototype.Function, which returns a function in the incubator realm whose body is evaluated in the Realm itself. E.g.:
const r = new Realm();
const doSomething = r.Function('a', 'b', `return a + b;`);
doSomething(1, 2); // yields 3
A good analogy here is a cross realm bound function, which is a function in a realm that is available in the incubator realm, this function's job is to call another function, this time, a function inside the realm, that might or might not return a completion value.
This mechanism allows the incubator realm to define logic inside the realm without relying on populating the global object with global names just for the purpose of communication between the two realms.
Additionally, this allows the incubator to easily pass identities when invoking a function inside the realm, e.g.: a symbol, which is not possible via Realm.prototype.eval because the Symbol is not something that you can point to from source text. This feature provides a synchronous communication mechanism that allows passing and returning primitive values across the two realms.
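To make the "primitive values only" contract of a bridge function concrete, here is a minimal single-realm sketch. `toyBridge` is a hypothetical helper that wraps an ordinary function; it does not evaluate source text in another realm, it only enforces the argument and return-value checks.

```javascript
function isPrimitive(value) {
  return value === null || (typeof value !== "object" && typeof value !== "function");
}

// Hypothetical stand-in for a bridge function: checks what goes in and out.
function toyBridge(fn) {
  return function bridge(...args) {
    if (!args.every(isPrimitive)) {
      throw new TypeError("bridge arguments must be primitives");
    }
    const result = fn(...args);
    if (!isPrimitive(result)) {
      throw new TypeError("bridge return value must be a primitive");
    }
    return result;
  };
}

const doSomething = toyBridge((a, b) => a + b);
console.log(doSomething(1, 2)); // 3

// A symbol is a primitive, so it passes through with its identity intact.
const tag = Symbol("identity");
const echo = toyBridge((s) => s);
console.log(echo(tag) === tag); // true
```

Passing an object to `doSomething` throws a TypeError in this sketch, which is the behaviour a membrane library would build its own serialization protocol around.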
[…] f.x = 1 will throw in strict mode).
To add support for native promises when communicating between realms, it seems that by adding Realm.prototype.AsyncFunction, we might be able to provide extra capabilities that can be used to define an async protocol between the two realms.
const r = new Realm();
const asyncFunctionInsideRealm = r.AsyncFunction('x', `return await (x * 2);`);
asyncFunctionInsideRealm(1); // yields a Promise instance in incubator that eventually resolves to 2
Of course, the promise instance received by the incubator realm is not the one produced by the body of the function, but a wrapping one with the identity associated to the incubator realm.
[…] f.x = 1 will throw in strict mode).
This works great for the incubator realm since it provides all the tools to create a pull system from the incubator call, but still doesn't provide an easy mechanism to implement a push system from the realm itself. In my opinion this is not a deal breaker, and can probably be implemented in user-land using something like an async iterator protocol.
arguments.callee in Bridge Functions?
Since the function itself is sloppy, what should happen here? Should Realm.prototype.Function create strict functions only? Probably.
Is Realm.prototype.eval really needed?
Probably yes, two main reasons:
- […] Realm.prototype.Function, but that's subject to a global eval lookup inside the realm, e.g.:
const r = new Realm();
const directEvalInsideRealm = r.Function('s', `return eval(s);`);
directEvalInsideRealm(`1 + 1`); // yields 2
Specifically, if the code inside the realm removes globalThis.eval, the incubator realm will have no way to eval anything.
- […] Function evaluator, it seems reasonable to have eval as well.
Those two reasons seem strong enough IMO.
Is Realm.prototype.import really needed?
Probably yes, one main reason: convenience.
It is possible to do the exact same thing using Realm.prototype.AsyncFunction:
const r = new Realm();
const dynamicImportInsideRealm = r.AsyncFunction('u', `return import(u).then((ns) => true);`);
dynamicImportInsideRealm(`/path/to/module.js`); // yields a promise that resolves to true when the module is evaluated
Note: since one of the invariants for Realm.prototype.AsyncFunction is to resolve to a primitive value, we can't just return the promise to the namespace.
What Realm.prototype.import resolves to?
This is an open question.
What should happen if CSP is preventing evaluation?
This is an open question. Should these 3 evaluation mechanisms (eval, Function and AsyncFunction) be subject to that? I suspect they should, but then how useful is a Realm that only allows import without a feedback loop to the incubator realm?
The API is cumbersome, and it requires definition of global names, which is always tricky.
It seems that an API like that will:
a) force async mechanism to be in place.
b) imposes a protocol that relies on the export names defined in the module, which is new.
c) it requires module blocks to be available to do anything useful with it.
Why Function and AsyncFunction, but not GeneratorFunction and AsyncGeneratorFunction?
Max/min, @ljharb. For now we already need a wrapper for the returned promise, and I'd like to see how it develops, eventually getting traction on the returned iterators if Generator and AsyncGenerator become necessary.
For now just the Function and AsyncFunction are enough to resolve and unblock the problem for this proposal use cases. Ofc, we would further explore needs for expansions.
What should happen if CSP is preventing evaluation?
Assuming the module blocks proposal moves ahead in something like its current form, this seems like something that could be solved with those: the CSP of a module block should just be the same as that of the module it's lexically created in.
Ideally we'd have a way to do scripts as well with similar machinery, I imagine asset references would be able to be treated the same by CSP as they're also statically declared just like imports.
Example:
const mod = module {
import plugin from "./plugin.js";
plugin({ foo: "bar" });
}
asset scriptContent from "./classicScript.js";
const ??? = await realm.import(mod);
// Needs to be async as script needs to fetch in some environments, this would
// need to be host provided similar to import resolution, the return value would just
// be some token object that the host would recognize as safe
const script = await realm.prepareScript(scriptContent);
realm.eval(script);
I think it's important that we have a mechanism which permits two-way message-passing without eval'ing strings. I think two-way synchronous callbacks would be a simpler, more general primitive to build this off of than Function and AsyncFunction constructors, so I'm confused about the motivation for @caridy 's proposal. I'm not opposed to having that functionality for convenience, though.
To respond to @caridy 's feedback:
a) force async mechanism to be in place.
Sorry, what does this mean? I have a couple ideas
On one hand, when you have a message to send to the other side of the Realm, you can receive a response synchronously, as in my example above.
On the other hand, it is an async mechanism to set up a communication channel. This is inherent whenever you will want to be able to load modules which have code that the function in the other Realm calls. I think it's important that we don't force people to use a big inline script just because they're using Realms, and that code can be used from modules. This means assuming the cost of async operations for code loading.
Even if you're not importing modules, it is good to be able to have support code outside of the function, in the same module. It is awkward to do this if all you can pass is a function body, with no common bindings to share, and nothing that runs at set-up time. However, in principle, this could be done synchronously.
b) imposes a protocol that relies on the export names defined in the module, which is new.
This is true, it uses the default export of modules which are passed to Realm.prototype.connect. Is this a problem? I was imagining that such modules would be purpose-made.
c) it requires module blocks to be available to do anything useful with it.
The mechanism I proposed above does not depend on module blocks: any module specifier which is passed into Realm.prototype.connect works--so, it could be a separate file.
One advantage to @littledan's model is that it doesn't rely on Realm.eval and Realm.*Function, which makes it possible to use Realms even if CSP disables eval. If eval-ing strings is the only possible way to set up Realms, then it pretty much locks web applications into unsafe-eval (or requires setting up other CSP machinery and host hooks for Realms' evaluation).
@leobalter max/min is a philosophy i think many of us consider was a mistake in ES6. It doesn't make sense to me to add 2 out of the 4 Function globals, that's just creating inconsistency.
Like @littledan, I also think it's critical we have mechanisms that are not just evalling strings.
I believe we should avoid unnecessary complexity.
@ljharb
The only reason I see for adding [Async]Generator functions is that they are other function formats. We are not discussing what is available in the new realms, but the channels we need to operate with this Realm. Is there any use case that needs iterators through these channels?
@littledan
As I mentioned somewhere in the past, I'm in full support for module blocks and I still believe this example should not use them.
We already have a long time baggage of challenges for this proposal, and I'd rather go through a solution that is compelling enough without another proposal that has its own challenges ahead.
As @caridy has mentioned, connect forces an async mechanism, which is a complicated trade-off over CSP handling.
b) imposes a protocol that relies on the export names defined in the module, which is new. c) it requires module blocks to be available to do anything useful with it.
I believe these concerns can be mitigated if we use the specifier instead of a module block in the example.
I simplified the names and arrow functions to support my own reading of the example.
// connector.js
export default function(fn) {
return v => fn(v + 1);
};
// main.js
let n, send;
const r = new Realm;
await r.connect(
function(sender) {
send = sender; // The original example had the other way around, sender = send
return val => { n = val; };
},
'./connector.js'
);
send(4);
console.log(n); // 5
This example looks better, and the names should be distinct enough to avoid confusion. It still seems to be only one function per connector, along with one new async tick.
Can you help me with the steps here, like where the sender function is created, and where the returned val => { n = val; } goes? Same for the steps when I call send(4).
I'm assuming we have internals connecting these functions but I'm getting lost every time I try to create a step by step.
GeneratorFunction and AsyncGeneratorFunction
@ljharb I think you're right... if those can be created from syntax, and can work as bridge functions, then I don't see why we wouldn't expose them to have a complete API on the Realm.
FWIW, I'm happy to add these extra bridge functions if the concern is a dealbreaker.
Yeah, I can see how it is a problem that creating a realm and a communication channel is an async operation in my suggestion. At the moment, I can't think of a solution which both resolves the no-eval issue and is synchronous. I will keep thinking on this issue.
Perhaps we could have a "script block" that creates unevaluated scripts similar to module blocks e.g.:
const someScript = script {
console.log("Hello");
}
Also thinking about your .connect idea, it could be simplified to simply returning the realm-wrapped function (or a Promise for it in the case of modules), rather than doing the awkward callback thing:
// Synchronous for scripts, this would be like .eval, but wraps the returned
// function for the current realm, so addOne !== the lambda inside the script
const addOne = realm.connectScript(script {
(n) => n + 1;
});
// Asynchronous for modules, this would act like .import, but the default export
// is treated as a function to wrap, the addTwo in this realm is not the same as
// the addTwo within the realm
const addTwo = await realm.connectModule(module {
export default function addTwo(n) {
return n + 2;
}
});
This comment describes a possible solution for the API of the Realm to work with the new isolation model described in this issue.
declare class Realm {
constructor();
eval(sourceText: string): PrimitiveValueOrCallable;
importBinding(specifier: string, bindingName: string): Promise<PrimitiveValueOrCallable>;
}
In this API review, the import method becomes importBinding (name open for bikeshed). This allows injecting code into the constructed realm while getting a binding value. The value is restricted to primitives, while it would also auto-wrap a connecting function. There isn't any need to provide any argument to the Realm constructor, and importBinding would have a meaningful promise resolution.
const realm = new Realm();
await realm.importBinding('console-shim', 'default');
const redTrySample = await realm.importBinding('sampler', 'trySample')
// redTrySample can still receive primitives such as symbols, etc
const result = redTrySample(2, 3);
The wrapped functions can receive functions as arguments. This allows the constructed realm to trigger a callback in the incubator realm, without knowing about the incubator realm.
const realm = new Realm();
const redRunTests = await realm.importBinding('testFramework', 'runTests');
function reportResults(...args) {
/* ... manages results from args ... */
}
reportResults.noop = 1;
// The constructed realm receives a new function that would chain to
// reportResults when called with its given arguments.
redRunTests(reportResults);
// The connecting function created inside the realm won't have any access to
// the property 'noop'. It does not receive a structured clone of that function.
This API explores a modification of the Isolated Realms proposal while trying to preserve some of its goals. It has a similar level of expressivity and isolation. It still disallows direct access between the parent and child realms, but it does not use structured cloning. This is possible through auto-wrapping connected functions.
In this API, any action hitting a disallowed completion would throw an exception. The only values that can be transferred are primitives (string, number, boolean, symbol, undefined, null, BigInt, etc). There is a special behavior to wrap callable objects, generally functions. By callable objects, we mean any object with a [[Call]] capability. When the API evaluates a callable object completion, it should create an internal reference to it and a new function in the other realm that would receive this completion. When that new function is called, it chains the call to the reference in the different realm, transferring the given arguments. These arguments share the same restriction to only allow primitives and callable objects.
const realm = new Realm();
try {
realm.eval("[1, { foo: 'bar' }]");
} catch {
// Throws a TypeError
}
// If you try to get access to a constructed realm's constructor,
// you get a wrapped function, which isn't very useful as it can't return
// object values:
const redArray = realm.eval("Array");
// redArray is only another function that would eventually chain the call to the
// Array constructor in the other realm.
try {
redArray();
} catch (err) {
assert(err.constructor === TypeError);
}
// The wrapped functions are always frozen and do not share properties!
assert(Object.isFrozen(redArray));
assert(redArray.__proto__ === Function.prototype); // not from the other realm's prototype
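The wrapping rule described above can be approximated inside a single realm to make the behavior concrete. This is only a rough sketch: the helper names isPrimitive, crossBoundary and wrapCallable are made up for illustration and are not part of the proposed API, and a real implementation would create the wrapper in the other realm rather than in the same one.

```javascript
// Hypothetical single-realm approximation of the boundary rule.
// None of these helper names come from the proposal.
function isPrimitive(value) {
  return value === null ||
    (typeof value !== "object" && typeof value !== "function");
}

// Decide what happens to a value that tries to cross the boundary.
function crossBoundary(value) {
  if (isPrimitive(value)) return value;                     // passes unchanged
  if (typeof value === "function") return wrapCallable(value);
  throw new TypeError("only primitives and callables may cross the boundary");
}

// Create a fresh, frozen function that chains calls to `target`.
// Arguments and the return value are subject to the same restriction.
function wrapCallable(target) {
  const wrapped = function (...args) {
    const result = target(...args.map((arg) => crossBoundary(arg)));
    return crossBoundary(result);
  };
  return Object.freeze(wrapped);
}

// A wrapped function returning a primitive works as expected.
const add = wrapCallable((x, y) => x + y);
console.log(add(2, 3)); // 5

// A wrapped function returning an object throws a TypeError.
const makeArray = wrapCallable(() => [1, 2, 3]);
try {
  makeArray();
} catch (err) {
  console.log(err instanceof TypeError); // true
}

// Properties on the original function are not visible on the wrapper.
function reportResults() {}
reportResults.noop = 1;
const wrappedReport = crossBoundary(reportResults);
console.log("noop" in wrappedReport);        // false
console.log(Object.isFrozen(wrappedReport)); // true
```

The sketch ignores this-binding and exception wrapping, both of which a real boundary would also have to handle.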
The wrapped functions allow setting values in the other realm including Symbols while providing more API flexibility.
const realm = new Realm();
const mySymbol = Symbol();
const fn = realm.eval(`(function(x) { globalThis.foo = x; })`);
fn(mySymbol); // equivalent to the previous realm.set("foo", "bar");
const result = realm.eval('globalThis.foo');
assert(result === mySymbol);
The wrapped functions act as sugar for the previous realm.call, while avoiding fingerprints in the inner realm globals:
const realm = new Realm();
const add = realm.eval(`(x, y) => x + y`);
const result = add(2, 3); // equivalent to the previous realm.call("add", 2, 3);
assert(result === 5);
The avoided fingerprints also remove a need for such things as globalThis.callParent() in the constructed realms.
There are more details in this README file.
From the developer's perspective, I think the best way is to make the membrane the default and allow opt-out.
(await new Realm().import("./val")).array instanceof Array
// true
Membranes by default can avoid mysterious behaviors for simple use cases.
Opt-out if they actually don't need a membrane
(await new Realm({ membrane: false }).import("./val")).array instanceof Array
// false
@Jack-Works let's keep the membrane separate, that's not a direct goal of the Realms proposal. At some point we might propose some membranes specific proposal that can complement this proposal.
Hi realms champions,
@syg and I have been considering a modification to the current realms proposal which trades some expressivity, to give better isolation guarantees. Essentially, instead of allowing direct bidirectional access between the parent realm and the constructed realm, all such communication would go through structured cloning. This ensures that the child realm never gets access to objects from the parent realm, thus making "sandbox escapes" such as those in #277 or https://github.com/nodejs/node/issues/15673 impossible by construction.
Sample code and API
We don't have strong feelings on the API for this; we'd like it to be as ergonomic as possible. But here is an initial idea.
For pulling values out of the constructed realm, into the parent realm: introduce realm.eval().
For getting values into the constructed realm, from the parent realm, a bare minimum might look like this:
but you could imagine something slightly more complicated, and more useful, such as
(Compared to async realm boundaries on the web, this solves similar use cases to webRealm.postMessage().)
Finally, for pushing values out of the constructed realm into the parent realm, you'd need something like this:
(Compared to async realm boundaries on the web, this solves similar use cases to webRealm.onmessage.)
And, of course, we'd remove realm.globalThis.
Use case analysis
This proposal is arguably better than the current one for many sandboxing use cases. In particular, for cases such as templating or computation where the goal is to have a (conceptually) pure function execute inside the realm, this architecture is ideal, especially in how it automatically prevents "impurities" from cross-realm contamination. In such cases, the values passed are often primitives, or if not, they're within the realm of structured clone: plain objects, arrays, maybe some Maps and Sets and Errors and Dates and typed arrays/ArrayBuffers.
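To illustrate the "automatic purity" point, here is a minimal sketch that emulates a clone-on-every-crossing boundary within a single realm, using the standard structuredClone() global. The helper callAcrossClone and the render function are hypothetical names invented for this example; they are not the realm.call() API, only a stand-in for its cloning behavior.

```javascript
// Hypothetical helper: emulates calling a function on the other side of a
// structured-clone boundary. Arguments are cloned on the way in and the
// result is cloned on the way out.
function callAcrossClone(fn, ...args) {
  const clonedArgs = structuredClone(args);
  const result = fn(...clonedArgs);
  return structuredClone(result);
}

// A conceptually pure templating function that is sloppy in practice:
// it mutates its input.
function render(data) {
  data.items.push("injected");
  return { html: data.items.join(", ") };
}

const input = { items: ["a", "b"] };
const output = callAcrossClone(render, input);

console.log(output.html);        // "a, b, injected"
console.log(input.items.length); // 2, the caller's object is untouched

// Values outside the realm of structured clone (e.g. functions) are rejected.
let cloneError = null;
try {
  callAcrossClone(render, { items: [], onDone() {} });
} catch (err) {
  cloneError = err;
}
console.log(cloneError.name); // "DataCloneError"
```

The mutation performed by render stays on its side of the boundary, which is the contamination-prevention property described above.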
For cases such as a virtualized environment, it requires more work, but probably on about the same level as membrane-based approaches. That is, to perform operations inside the realm while interfacing with a same-realm object API, you would have to create proxies (either literal Proxys or just wrappers) which perform the appropriate calls to realm.eval() and realm.call(). And similarly for the reverse: if code inside the realm wants to operate on an inside-realm object while really doing work in the outside realm, the outside realm would need to do some setup, using realm.eval() to inject some proxies which call globalThis.callParent(). (Probably that setup code would then also do realm.eval("delete globalThis.callParent") at the end.) This is equivalent to what is being done today in the AMP WorkerDOM example that the explainer cites, but by using synchronous realm.call() etc. instead of asynchronous worker.postMessage(), it would overcome the challenges you discuss there.
Other use cases like running tests in the realm fall in between. You'd need to inject a small shim into the realm which provides globals that the test library depends on (such as console), proxied to the creator realm. But then you'd just run the test library inside the global. I.e. instead of the explainer's current sample code, you'd write
This also gives you a stronger guarantee that tests don't mess with the test framework, or with the outer realm, or with other tests, all of which are possible in the current explainer's sample code.
This proposal does lose some expressivity though. In particular, it is not able to create reference cycles between cross-realm objects. Because all communication is via cloned messages, there's no way to communicate to the garbage collector that an object in the outer realm depends on an object in the inner realm, and vice versa, so that the cycle can take part in liveness detection. To some extent this is a good thing, as cycles are an easy way to leak an entire realm. But from what I understand it does cut off some use cases that go beyond the ones mentioned in the current realms explainer.
Performance
Adding a structured clone step for all boundary-crossing operations could come at a performance cost. But, less so than you'd imagine.
In particular, since primitives are trivially cloneable, any operation which returns them would suffer virtually no overhead vs. the current realms proposal, when communicating across the boundaries. This can account for a large number of use cases: e.g., most computation use cases, or the test framework use case (where it's just passing console.log strings across), or many of the interesting virtualization cases. Other cases will be covered by small objects or arrays, for which the structured clone overhead is quite small (less than JSON serialization and parsing). It's only the case of needing to return a large, nested object graph where there might be a noticeable performance disadvantage.
It's also worth noting that although this proposal does have a lower theoretical performance ceiling than the current realms proposal, it's probably comparable to the current realms proposal plus the associated membrane machinery that's needed to preserve encapsulation. There might be interesting tradeoffs in the large nested object graph case. There, structured cloning across the boundary means a larger up-front cost, but after that initial cost is paid, subsequent accesses throughout the large object graph are fast and well-optimized. Whereas membrane use means every access throughout the wrapped object graph incurs membrane-related overhead.
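The shape of that tradeoff can be seen in a small sketch. It is not a benchmark and the membrane here is a deliberately simplified, one-way Proxy wrapper (the membrane function and trapHits counter are names invented for this example). It only counts how often the wrapper intercepts an access, compared with a single up-front structuredClone().

```javascript
// A small nested object graph standing in for data from the other realm.
const graph = { a: { b: { c: 42 } } };

// Simplified one-way "membrane": every property read goes through a trap,
// and nested objects are wrapped again on the way out.
let trapHits = 0;
function membrane(target) {
  return new Proxy(target, {
    get(obj, key, receiver) {
      trapHits++;
      const value = Reflect.get(obj, key, receiver);
      return typeof value === "object" && value !== null
        ? membrane(value)
        : value;
    },
  });
}

// Membrane approach: no up-front cost, but each access pays for the traps.
const wrapped = membrane(graph);
let sumViaMembrane = 0;
for (let i = 0; i < 1000; i++) {
  sumViaMembrane += wrapped.a.b.c; // three trap invocations per iteration
}
console.log(trapHits); // 3000

// Clone approach: one traversal up front, then plain property access.
const cloned = structuredClone(graph);
const hitsBeforeCloneReads = trapHits;
let sumViaClone = 0;
for (let i = 0; i < 1000; i++) {
  sumViaClone += cloned.a.b.c;
}
console.log(trapHits - hitsBeforeCloneReads); // 0
```

Real membranes also cache wrappers and handle writes, calls and identity, so the per-access cost in practice depends heavily on the implementation.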
Finally, I haven't thought much about this, but you could probably get ultimate performance™ by passing in a SharedArrayBuffer and doing all communication through that.
Conclusion
I'm optimistic that this proposal removes the most dangerous feature of realms, which is that they advertise themselves as an encapsulation mechanism, but it is extremely easy to shoot oneself in the foot and break encapsulation. This encapsulated-by-default proposal would bring realms onto the same footing as other encapsulation proposals such as trusted types or private fields, and thus make it more congruent with web platform goals.
There still remains a danger with people over-using realms when they need security or performance isolation, beyond just encapsulation. This still weighs heavily on me, and its conflict with the direction the web is going (per #238) makes me still prefer not providing a realms API at all, in order to avoid such abuse. But I recognize there are cases where synchronous access to another computation environment is valuable, and I think if we curtailed the footgun-by-default nature of realms by prohibiting direct cross-realm object access, I could make peace with the proposal.
I look forward to hearing your thoughts, and hope we can meet on this "middle ground" between no realms on the web, and the current proposal.