whatwg / webidl

Web IDL Standard
https://webidl.spec.whatwg.org/
Other
405 stars 162 forks source link

State that by default all objects are created in the relevant realm of `this` #135

Open domenic opened 8 years ago

domenic commented 8 years ago

This has come up recently in https://github.com/w3c/webcrypto/issues/85 and also in some internal work Blink is doing on refactoring their bindings layer.

It appears that in all browsers, document.createElement, innerHTML, and an Event created through button.click() are created in the relevant realm of this. All of these cases are currently unspecified and use spec text equivalent to "create a new X object".

We should specify a few things:

(At first I thought that we should follow the ES spec and use the current realm for all cases. That would better match e.g. Array.prototype.map or Promise.prototype.then. But it seems like that's not what browsers do, so we'll just have to live with the inconsistency.)

annevk commented 8 years ago

This looks like a duplicate of https://www.w3.org/Bugs/Public/show_bug.cgi?id=24652.

annevk commented 8 years ago

See also https://github.com/w3c/web-platform-tests/pull/1381 for tests which I'm about to land.

jyasskin commented 7 years ago

Can we provide a default for in parallel blocks created from IDL methods and constructors too? We'd want to capture the relevant or current realm, respectively, at the point where the in parallel block starts, and use it as the "object-creation realm" within the block. Tasks posted from a context with an "object-creation realm" could inherit the same realm.

We still have some cases without an "object-creation realm", like https://webbluetoothcg.github.io/web-bluetooth/#notification-events. These tasks actually have an obvious realm to use, but it's lost when we pull the event loop out of the realm's settings object. We could extend "queue a task" to take its object-creation realm in order to have a default here too.

domenic commented 7 years ago

That seems not great to me. When you're in a background thread, you should be very explicit about what realm you're creating objects in. Just inheriting a default is not going to work out well in general, I think.

IMO it's better to just state the realm when creating the object, and rely on ambient defaults as little as possible---only for the most obvious of cases. We should not be trying to create more situations that have ambient defaults available in non-obvious ways.

jyasskin commented 7 years ago

Given:

  1. Let |x| be a new object.
  2. In parallel:
    1. Let |y| be a new object.

Do you have any cases in mind where you'd want |x| and |y| to have different globals?

domenic commented 7 years ago

Yes, definitely. For example if |y| was an event that was going to be dispatched to a different global. The various postMessage specs do this, for example.

jyasskin commented 7 years ago

Cool, thanks for the example. https://html.spec.whatwg.org/multipage/comms.html#dom-window-postmessage creates the events inside a task, rather than in parallel, and it'd be straightforward to reset the default when queuing a task...

My main concern is that verbosity and repetition hurt readability. If we insist that every object creation in parallel code specify a realm, and that realm is the same for most calls, it makes it harder to follow the rest of the algorithm and obscures the few cases where the realm is importantly different.

Are we differing in how often we expect parallel code to need to change the realm? Most of the new specs I'm seeing do a lot of work in parallel but don't shift realms much if ever. When you and @bzbarsky go through existing specs to specify their new objects' realms, do you tend to find that you'd want them to shift realms when they go parallel?

bzbarsky commented 7 years ago

In general, staying in the same realm when you go parallel makes sense. The main footguns with object creation after going parallel are:

  1. Your global can get torn down around you (e.g. iframe.remove()). It still exists, but maybe not in the state you expect it to be in.
  2. If the object creation has observable side-effects, or is otherwise synchronously observable, it needs to happen off a task, not directly in the parallel section.

Apart from that, using the same global unless explicitly stated otherwise makes sense to me.

annevk commented 7 years ago

I don't understand why you'd ever create an object in parallel. Object creation should always happen in an environment with a global. You can schedule a task that ends up creating the object. Otherwise you're inventing new JavaScript semantics and we've been told off for that many times.

(FWIW, I think "in parallel" is a bit of an anti-pattern as it often doesn't deal well with races or the constraints of that parallel place. Having explicit task-based isolated-threads would be much clearer.)

bzbarsky commented 7 years ago

I mean, in practice scheduling a task is the right thing to do, and might happen implicitly (e.g. if you do your work off a promise getting resolved). But you then need to define which global to use anyway....

jyasskin commented 7 years ago

It'd be fine with me if we wrote somewhere that objects cannot be created from parallel contexts, and I agree with @annevk's goal to replace "in parallel" with better-specified threads.

I tentatively propose the following resolution to this issue:

  1. WebIDL defines an ambient "object-creation realm" that's available (or null) for each step of an algorithm, and which propagates to the next and contained steps by default.
  2. WebIDL defines "new" or "create" or "new object" to use that realm. The "new" phrasings are likely to work better in "let x be" phrases in prose.
  3. WebIDL initializes the object-creation realm as @domenic said in the original post.
  4. HTML defines "in parallel" to set the object-creation realm for the parallel steps. We could set it to the caller's realm in order to make most existing specs DTRT or to null if we want to start enforcing that objects aren't created from limbo.
  5. HTML defines "queue a task" to take an explicit realm in addition to the task queue, and uses that for the task's object-creation realm. We should mention that specs often forget to pass this, and that their intent tends to be to propagate the caller's realm, but that can be deprecated.
domenic commented 7 years ago

I still strongly object to trying to make object creation realm implicit. I think specs should use an explicit realm whenever possible. Ideally I would prefer not even having the exceptions outlined in the OP, but they are there as a compromise. This is something spec authors should be aware of and making conscious decisions on.

I'd like to get us to a state where things are done explicitly in a widespread way. If we find this leads to a lot of repetition then maybe we can consider setting some defaults. But my suspicion is that it's generally not going to be obvious which realm to create things in when operating in parallel/inside a task.

I'd like us to try this before considering a radical step such as implicit propagation of object creation realms.

jyasskin commented 7 years ago

I think the object creation realm is one of the things that spec authors are just not going to consider, and so if we don't default it, we'll just wind up with underspecification. However, I'm happy to take the consensus subset for now. Does everyone like the idea to define the object-creation realm and "new object" term, and have "in parallel" and "queue a task" kill off the default? I can't send a PR until next week, but I'll try to do it then.

annevk commented 7 years ago

I think that's why we should formalize object creation. So that it's an easy-to-review thing in standards and something implementers can more easily flag. Perhaps even lintable some day. The main problem at the moment is the lack of a formalized approach (for many things) where everyone just copy-and-pastes things that mostly work if you squint hard enough without really thinking through the implications (and without being able to think them through since we don't even say what "new" means or link it).

tabatkins commented 6 years ago

Agreed with at least formalizing object creation - we've got a lot of object creation in the Typed OM spec, for instance, and didn't realize that we had to specify the global for each. (BZ is flagging every instance of it for us. ^_^)

domenic commented 6 years ago

So, on the subject of current vs. relevant. A couple weeks ago, a conversation with @TimothyGu had me somewhat-convinced that current is a better default than relevant, despite our upthread resolutions. Similarly, @bzbarsky and I discussed a different case at https://bugs.chromium.org/p/chromium/issues/detail?id=832273 and he was kind of leaning toward current, by my reading.

The argument is basically that we've over-applied the lesson of navigator.getBattery(). The lesson there was that for persistently created-and-stored objects, relevant is better; see the reasoning in the HTML spec example here (scroll down to "One reason why the relevant...").

Generalizing that to all cases was, perhaps, a mistake. The getBattery() case is particularly weird because it lazily creates and stores the promise in question. A more normal spec with persistent objects (e.g. the persistent navigator object that hangs off of window) would just create the objects ahead of time. And indeed, the getBattery() spec could be phrased that way; the lazy-creation seems like an implementation optimization that made its way into the spec.

Arguments for current instead as the default:

We may still need a caveat about persistent objects, but making current the default seems like a good idea. Thoughts, @bzbarsky?

I'm not sure I'll actually make time to formalize object creation and get this straightened out as something we can point people to. But at the very least we could update this thread's title to "current", and update the HTML spec's advice about relevant-by-default.

bzbarsky commented 6 years ago

There are some complications around this I need to discuss with @bholley before I can commit to a specific option here. Unfortunately, he's on vacation for a bit. I will follow up once he gets back....

yuki3 commented 6 years ago

qq about Domenic's comment, are you suggesting to use the current realm by default but continue to use the relevant realm for existing use cases? For example, document.createElement.call(doc2, 'elem') should create a new object in the relevant realm or the current realm?

domenic commented 6 years ago

Hmm, good question, I'm not sure... I note that the spec for createElement doesn't actually state a realm? Although it has some pretty close ties to the relevant realm via the custom element registry, so maybe it should use relevant?

yuki3 commented 6 years ago

Thanks for the reply. I see. I continue to watch the discussion, and I'm happy to follow the conclusion once experts make it.

bzbarsky commented 6 years ago

So here is my take on this.

1) For created-and-cached objects, we want to use the relevant realm. Whether they're lazily-created or eagerly-created shouldn't matter here; in the latter case the relevant realm is the only one that can be used anyway.

2) For newly-created-and-returned objects, I think it's generally best to do the same thing, for consistency.

3) My previous concerns were basically around newly-created Promise return values. Those are a bit special in the sense that they're really control flow reified as objects. But I thought about this some more, and we'd generally want to create the promise resolution value in the relevant global, and in that case it's probably best to create the Promise there too.

I realize this is not what the ES spec does, not least because ES has no concept of "relevant global". ES doesn't even do this for the "create and cache" case, which is pretty weird. Given that, I really don't think we should aim for aligning with ES here (as in, I consider its behavior to just be wrong in a bunch of cases).

littledan commented 5 years ago

https://github.com/heycam/webidl/issues/135#issuecomment-411109348 is pretty convincing to me. I don't really understand the intuition for using the relevant realm, even for lazily created objects. If you were writing this code in JavaScript in the normal way, closing over the class whose instance you're creating, you'd end up with an instance in the current realm (even if this comes through a callback/after an await).

littledan commented 5 years ago

Oh, I think I understand, if it's lazily cached, it'd be weird if the realm were chosen just based on who randomly asked for it first.

TimothyGu commented 4 years ago

Another issue is static methods. At least with Chrome and Firefox and at least with DOMRect, static factory functions use the current realm for the newly created object:

// $0 is a same-origin synchronously-accessible iframe
rect = $0.contentWindow.DOMRect.fromRect.call(DOMRect);
rect.constructor === $0.contentWindow.DOMRect // true

Firefox also has this weird bug where rect instanceof DOMRect and rect instanceof $0.contentWindow.DOMRect both return true, but the .constructor check seems to work as expected.

bzbarsky commented 4 years ago

Per current Web IDL spec, static methods aren't even passed a "this" at the post-IDL-spec level, so don't have a concept of "relevant global" or "relevant realm".

TimothyGu commented 4 years ago

That's quite fair, and I would be satisfied if this is spec'd as such. However, newer ES primitives like Array.from and Promise.all use the this value to determine what to do, for subclassing support. That sounds like a bit too much to ask from the web platform though.

annevk commented 4 years ago

We did have the ambition to support subclassing of platform objects and make that work nicely in a way that's compatible with JavaScript, but that kind of wavered and in general it does not seem like subclassing has taken of.

bzbarsky commented 4 years ago

We could change static methods to pass along "this", sure. I'm just saying that right now they do not.

domenic commented 3 years ago

@annevk tried tackling this again recently. Some discussion in https://freenode.logbot.info/whatwg/20210203#c6775366 .

Here's one proposal, motivated largely by the fact that we still haven't solved this. (So, my previous appetite for making things explicit is dampened.)

(*): what does "the current realm" and "the relevant realm" mean when you're inside a queued task? We want to formalize its intuitive meaning: whatever the current/relevant realm was when the original main-thread task kicked off.

Proposal:

bzbarsky commented 3 years ago

Goal: all promises, exceptions, dictionaries (Objects), sequences (Arrays), records (Objects), iterator results, ... are created in "the current realm

Does that include promises that are then cached?

domenic commented 3 years ago

Yeah, although I'm not sure how many of those there are (if any)? I think the concern there is not very large, compared to the cost of inconsistency if we continue using current for so many other promises. (Notably, all implementations today use current at least for for synchronously rejected promises, and probably more promises than that to the extent bindings code is involved.)

bzbarsky commented 3 years ago

This is most critically a concern for Promise-valued attributes (e.g. FontFace.prototypeloaded,Document.prototype.documentReadyForIdle`, etc). The consistent thing to do for those is probably to require, per spec, for those Promises to be created at the same time as the object the attribute is on, in that object's relevant global. Implementations that then want to optimize and lazily allocate in the getter can do that, as long as they allocate in the correct global.