Incumbent seems broken for pending promises

domenic commented 4 years ago

I was working on fixing entry (https://github.com/whatwg/html/issues/1426) but @syg pointed out that incumbent is not quite right either. @bzbarsky and I spent a lot of time on this in #1189 so it's possible it works in some non-obvious way, but as of right now it seems broken to me.

The specific problem is that EnqueueJob's choice of incumbent settings is wrong when job is a PromiseReactionJob that is being queued as a result of TriggerPromiseReactions (i.e., because someone did a resolve() on a pending promise). PromiseResolveThenableJobs work fine, as do PromiseReaction jobs triggered by PerformPromiseThen immediately enqueuing jobs (because someone did a .then() on a settled promise.)

In particular, to match Web IDL semantics (and the general expectations around how incumbent behaves) we want to capture the incumbent realm at the time promise.then(handler) is called, and then, when we run the PromiseReactionJob, prepare to call a callback using that settings object. But right now we just grab the incumbent at job-enqueuing time.

Fixing this seems like it needs hooks deeper into the JS spec. (Even more deep than the arguments-introspection of #5212.) In particular we need a chance to store the incumbent in the [[Handler]] data structure in PerformPromiseThen.

@bzbarsky, can you check my analysis? Does Firefox implement this? Maybe we decided that it's OK for incumbent to be broken in these cases?

/cc @littledan as he may be able to help us with this cross-spec integration work.

bzbarsky commented 4 years ago

I will not be able to get to this until Monday next week, unfortunately (have some pre-existing commitments). If you haven't heard from me by end of (US East Coast) day then, please ping me!

domenic commented 4 years ago

For what it's worth I wrote tests in https://github.com/web-platform-tests/wpt/pull/21206, assuming Web IDL-like semantics, which seem to pass in both Chrome and Firefox.

In Chrome's case, this is surprising; based on @syg's code inspection we seem to be doing something rather different. Maybe if I just introduce even more realms (e.g. promise constructor realm is currently the same as entry realm) we'll find issues...

Edit: we found that cases depending on the backup incumbent settings object stack fail in Chrome.

syg commented 4 years ago

(As an aside, is realm 1:1 with "settings object"? Saying "settings object" is rather verbose.)

domenic commented 4 years ago

Yep, see https://html.spec.whatwg.org/#realms-settings-objects-global-objects

syg commented 4 years ago

In Chrome's case, this is surprising; based on @syg's code inspection we seem to be doing something rather different.

For Chrome, what I think is going on is that enqueuing a Promise microtask always saves the isolate's current realm. For pending Promises, the current realm at the time of enqueuing the task is the resolve function's realm, since the resolve function is at the top of the stack. In other words, it'll always be the Promise's creation realm. I believe that explains the Chrome failure in web-platform-tests/wpt#21206: for pending promises, the incorrectly saved incumbent realm is the fulfilled Promises's creation realm.

What @domenic has pointed out, and what I agree with, is that this is super edge casey. A difference is observed only when there is no user code on the JS stack, like when builtins are bound with .bind and directly passed to then. Otherwise, the incumbent will end up being the realm of the JS function at the top of the stack (barring yet another edge case of triggering callbacks directly from script...)

This also raises the point that since Chrome/V8 is buggy here and that is unlikely that the incumbent being used in this way in Promise callbacks is depended upon in the web, and that we still have existing sentiment to simplify the incumbent concept, is there possibility of change here? For instance, just saving the callback's creation realm instead of the incumbent.

domenic commented 4 years ago

that we still have existing sentiment to simplify the incumbent concept

Well, we'd like to remove uses of it. I don't think introducing two slightly-different incumbent concepts, one used by promises and another by all other callbacks on the platform, would be much of a simplification.

littledan commented 4 years ago

My thoughts are tending in the direction of @syg's comment. Applying HTML's concept of "incumbent" to JS here would complicate the specification and implementation in certain ways that I don't understand the motivation for (even if some implementations already do it). How would a web developer run into this sort of issue?

A bit of a tangent, but in addition to Promises and WeakRefs, would the idea of threading incumbent realm through be relevant for iterator helpers (on async iterables) and Observables?

domenic commented 4 years ago

I don't understand the motivation for (even if some implementations already do it). How would a web developer run into this sort of issue?

Web developers run into this issue whenever they use incumbent-dependent APIs. See https://github.com/whatwg/html/issues/1430 for a relatively-complete list.

A lot of the uses there can be considered "legacy", i.e. we don't like that they're using incumbent, and would like to switch to relevant or current if web compatible. However, there are some notable ones where incumbent makes sense, notably the security checks involved in navigating via the location object, and figuring out what the "source" is for cross-document messaging APIs like postMessage(), MessageChannel, or BroadcastChannel.

In particular, consider a case like

someOtherWindow.postMessage.call(someThirdWindow, ...args);

This will eventually fire a MessageEvent inside someThirdWindow, with a source property that indicates where the message comes from. This is used for e.g. communicating back to the sender. This source property is derived from the incumbent. And doing so makes more sense than using someOtherWindow or someThirdWindow; although those mechanisms were used to send the message, it's really the code in the incumbent window that is sending the message.

Currently, if you send postMessage()s from inside any deferred callback, the incumbent (and thus the source) will be correctly determined in Firefox, and in the specs modulo this bug, and in Safari and Chrome modulo certain edge cases involving combinations of bind() and promises. I argue we should move toward the Firefox behavior, because I don't think we should make the source of postMessages unreliable when you use promises to defer your callback, if it is reliable in all other cases.

So to be concrete about the web developer impact:

If we stopped tracking incumbent for promises entirely, then code like Promise.resolve().then(() => someOtherWindow.postMessage(...args)) would give a different resulting source property than someOtherWindow.postMessage(...args).
If we changed the definition of incumbent in a way that is simpler to implement (I think this is what @syg is suggesting), then code like Promise.resolve().then(() => someOtherWindow.postMessage(...args)) would give a different resulting source property than Promise.resolve.then(someOtherWindow.postMessage.bind(someOtherWindow, ...args)).

A bit of a tangent, but in addition to Promises and WeakRefs, would the idea of threading incumbent realm through be relevant for iterator helpers (on async iterables) and Observables?

Great find. I think it would. This argues that we either need general host hooks for all callback storage and use (I think it only matters for potentially-async callbacks?), or we should try to move the entry and incumbent concepts down into the ES layer, as something that both the ES spec and hosts maintain, but hosts take advantage of.

bzbarsky commented 4 years ago

To the extent that anyone ever examines an incumbent realm, the idea is that any time a function that examines it is captured in some way the incumbent should be captured as well.

like when builtins are bound with .bind and directly passed to then

Indeed. The idea is that p.then(someWindow.postMesssage.bind(stuff)) should not allow you to spoof who the sender of the message is. In this case it should basicallly be someone who could legitimately have been the caller of the postMessage method. In practical terms, the IDL spec makes it be the caller of then (if you replace then with a function taking an IDL callback), since that caller could have just called the bound method anyway.

We could, I guess, move the incumbent handling into bind(), but it seems like that still needs ES spec changes and would slow down bind() in general; at least right now only places that need to pay the cost (i.e. IDL callbacks that can get invoked later with no script on the stack) pay it.

syg commented 4 years ago

If we changed the definition of incumbent in a way that is simpler to implement (I think this is what @syg is suggesting), then code like Promise.resolve().then(() => someOtherWindow.postMessage(...args)) would give a different resulting source property than Promise.resolve.then(someOtherWindow.postMessage.bind(someOtherWindow, ...args)).

Yep, that's what I was suggesting.

It still strikes me as odd that, if the ultimate motivation is to counter spoofing for a specific set of APIs, that the countermeasures are not built into those APIs but into all callbacks.

bzbarsky commented 4 years ago

How would you do it in the API?

syg commented 4 years ago

I see your point. I can imagine something fairly intrusive, such as instead of building the logic into F.p.bind, make such APIs have something like a IncumbentSavingFunction on their prototype chain that have the special bind.

Edit: missed operative "instead of"

littledan commented 4 years ago

This is a bit of a tangent, but it's hard for me to see what you could do to bind here. For example, you could get the same "calling with another receiver" behavior by doing someWindow.spoofPostMessage = otherWindow.postMessage and then just call someWindow.spoofPostMessage as a method.

I'm trying to understand the strength of this "no spoofing" goal. We're talking about similar-origin windows that could reach into each other's globals and do all kinds of manipulation, right? This feels like something nice for regularity, but is it intended to be considered a security property?

bzbarsky commented 4 years ago

For example, you could get the same "calling with another receiver" behavior

The issue is not the receiver. The issue is what the callee perceives as "the caller".

I'm trying to understand the strength of this "no spoofing" goal.

That is an interesting question, yes.

bzbarsky commented 4 years ago

I will not be able to get to this until Monday next week

Er, Tuesday. I forgot that Monday is a US holiday.

bzbarsky commented 4 years ago

OK, so I am still trying to page all the Promise stuff back in. @jswalden or @tschneidereit might have this more at the top of their heads, maybe.

Looking at the Gecko code, I think it works like this:

1) EnqueuePromiseReactionJob (which I hope does what it says on the label) gets an incumbent global that was initially stored in the reaction and stashes it in the job information. That is, the PromiseReactionRecord struct in Gecko stores an incumbent global, captured at the point when the PromiseReactionRecord is created, and this is used as the incumbent for the reaction job.

2) EnqueuePromiseResolveThenableJob grabs the then-current incumbent and stashes it in the job.

That is, this basically does implement WebIDL semantics, as far as I can see.

domenic commented 4 years ago

Thanks Boris! That's what I suspected. I'd really like to do that at the spec level too, and in Chromium and WebKit, so that promises aren't weird here.

@syg and @littledan, does that seem like a reasonable medium-term goal? I understand it's probably not super high priority to do the kind of ES262 spec work, or V8 work, that would be involved in threading the incumbent through here. (And I don't think this kind of stuff should block work on WeakRef callbacks.) But I'd like to get a sense as to whether it's acceptable as a goal, even if not as something on the roadmap.

littledan commented 4 years ago

[I'd like to defer this to @syg / multiple engine support; I feel like I'm still missing some understanding here, even after @bzbarsky 's patient explanations.]

syg commented 4 years ago

My reading of the V8 code is that correctly threading an incumbent global through is possible with a little work. Isolates already have a GetIncumbentContext() method and there's an RAII thing embedders can use to deal with the backup incumbent context stack, but it is not clear to me that the blink side is using this correctly in all places...

Correctness aside, I have two other concerns:

ISTM looking up the incumbent realm will always incur a stack walk to see if there's a scripted caller, and if so, gets its realm. I don't love the idea of walking the stack for allocating every PromiseReaction...
Even though we don't interop, what should we do if Chrome lands this and finds some breakage?

I'm kind of on the fence. I'm not fully convinced yet of the usefulness of the incumbent concept, and it's only observable when .bind()ing builtins, so it seems pretty edge case-y. I do take the point that it's weird that Promises aren't like literally every other web API, though.

My current position is something like: more interop is always great, so I'm not against trying to implement incumbents in V8 Promises. But as @domenic has said, this is going to be pretty low priority. However, if there are performance regressions that come from this, I envision a hard time justifying engineering effort to optimize for incumbent realm tracking.

Edit: Missed an operative 'not'

whatwg / html

Incumbent seems broken for pending promises #5213