tc39 / ecma262

Status, process, and documents for ECMA-262
https://tc39.es/ecma262/
Other
14.97k stars 1.28k forks source link

`Function` and indirect `eval` should not capture context from where they are called. #3160

Open nicolo-ribaudo opened 1 year ago

nicolo-ribaudo commented 1 year ago

Description: Consider these two programs:

For any given string str, their behavior should be identical. Only direct eval should capture context from where it's being called, while Function and indirect eval (and setTimeout, even if it's in HTML) should not.

However, Function("import(...)") is defined to use the Function/(0, eval)'s caller as import()'s referrer, introducing a dynamic scope case other than direct eval.

eshost Output:

Browsers behavior varies:

You can find a test with various cases at https://github.com/nicolo-ribaudo/function-dynamic-scoping.

Proposed behavior:

In all these cases, import()'s referrer should not capture any script or module, and instead fallback to using the current realm as the referrer.

I believe that this is what is already happening in the following case, as described in https://github.com/tc39/ecma262/issues/871:

Promise.resolve("import(...)").then(eval);
bathos commented 11 months ago

Chrome implements the spec for Function and eval, but not setTimeout

If Chrome ends up implementing this fix for a faulty assertion in the timer initialization steps, would it follow that setTimeout there would also end being affected by this?

nicolo-ribaudo commented 1 month ago

Fixing this would make these two code snippets equivalent:

// mod1.js
import { call } from "./mod2.js";

function run(f, x) {
  // ... do something
  call(f, x);
  // ... do something else
}

// mod2.js
export function call(fn, arg) {
  fn.call(arg);
}
// mod1.js

function run(f, x) {
  // ... do something
  f.call(x);
  // ... do something else
}
annevk commented 1 month ago

What does this mean for window.location and other APIs that look at the caller? Would this allow you to pretend to be any realm you have access to?

nicolo-ribaudo commented 1 month ago

What do you mean by looking at the caller? window.location seems to only look at its this object, and not at the caller in the execution context stack.

annevk commented 1 month ago

I should have been more precise, I meant the https://html.spec.whatwg.org/multipage/nav-history-apis.html#dom-location-href setter. Which uses the "incumbent document" for referrer and base URL purposes.

nicolo-ribaudo commented 1 month ago

DISCLAIMER: I only learned about the difference between active realm and incumbent realm today. This answer is based on my reading of the spec but I still have to figure out how to write a test that shows what the incumbent document used by location.href is.


This PR does not directly affect those operations (when not used together with new Function, for determining the incumbent settings object HTML does not use ECMA-262's GetActiveScriptOrModule() but implements its own version (topmost script-having execution context).

However, this change is based on the assumption that HTML's usage of the incumbent document / settings object is "bad"/"legacy" (and looking at where it's used in HTML, it seems to only be old APIs?). Specifically, if this change gets consensus then it would be considered bad that these examples code behave differently:

<!DOCTYPE html>
<iframe></iframe>
<script>
  function set(obj, key, value) {
    obj[key] = value;
  }
</script>
<script>
  globalThis.set(location, "href", "https://example.com");
</script>
<!DOCTYPE html>
<iframe></iframe>
<script>
  globalThis.set = Reflect.set;
</script>
<script>
  globalThis.set(location, "href", "https://example.com");
</script>
<!DOCTYPE html>
<iframe>
  <!-- pretend this is an <iframe src=...> loading these contents -->
  <script>
    function set(obj, key, value) {
      obj[key] = value;
    }
  </script>
</iframe>
<script>
  frames[0].set(location, "href", "https://example.com");
</script>
<!DOCTYPE html>
<iframe>
  <!-- pretend this is an <iframe src=...> loading these contents -->
  <script>
    globalThis.set = Reflect.set;
  </script>
</iframe>
<script>
  frames[0].set(location, "href", "https://example.com");
</script>

From my understanding of what the incumbent settings object is, HTML defines the 3rd example to behave differently from the other ones. "Good behavior" would be that they all behave the same, because they are all setting the same property on the same object with the same value.

When setting window.location.href inside a function created with new Function, before this change the "incumbent document" would be the document of the realm of the Function constructor, while after it would be the document of whoever called the function returned by new Function.

A concrete difference for location.href is this example:

<!DOCTYPE html>
<iframe></iframe>
<script>
  frames[0].eval(`location.href = "https://exmaple.com"`);
</script>

currently the incumbent document is the outer one, while with this change it would be null and thus fallback to the backup incumbent settings object stack.

annevk commented 1 month ago

I wonder if "incumbent" is still bad in light of shadow realms. I think when we initially stopped using it we did not anticipate adding more cross-realm behavior.

If a shadow realm is handed a location.href setter and we give shadow realm creators the capability to control the base URL and such of the shadow realm, what would the expected behavior be?

cc @domenic @Ms2ger

nicolo-ribaudo commented 1 month ago

The href setter expects to receive an object as the this, but that's not possible with shadow realms. As far as I saw, there are no methods that can be used across a shadowrealm boundary (i.e. that only accept/return primitives) and check the incumbent.

I personally think these two examples should behave the same:

const descriptorSet = Object.getOwnPropertyDescriptor(location, "href").set;
const setLocation = url => descriptorSet.call(location, url);

shadowRealm.evaluate(`set => set("https://example.com")`)(setLocation);
const descriptorSet = Object.getOwnPropertyDescriptor(location, "href").set;
const setLocation = descriptorSet.bind(location);

shadowRealm.evaluate(`set => set("https://example.com")`)(setLocation);

since they are two different "styles" of writing the same code. This would mean not using the incumbent though.

domenic commented 1 month ago

Here is why the web currently depends on incumbent:

// code running on https://example.com/foo.html
const w = window.open();

w.location.href = "bar.html";

This is a common pattern. It opens a new window whose URL/base URL is about:blank. Then it tries to navigate that window to https://example.com/bar.html. But it does so using a relative URL.

If you resolve bar.html relative to about:blank, you'll get an error! What's happening is that the web developer is depending on the fact that location.href uses incumbent, and thus will resolve URLs relative to the "caller context" (incumbent window/realm/settings object).

If we were designing the web from scratch, we would not add this affordance. New APIs, like navigation.navigate(), do not use incumbent.


This is actually not the only important use case for incumbent. It's just the simplest to explain, and the one that has the clearest parallel to the active script case. Check out this document I wrote previously for more.


What does this mean for the question at hand? Well, basically there are two schools of thought.

  1. We cannot change the behavior of the above code. So, we should thread through the incumbent as faithfully as possible, so that even if you do convoluted things involving setTimeout(string), eval, Promise.resolve().then(fn), etc., your code still gets the relative URL resolution behavior. Example: setTimeout(eval, 0, "w.location.href = 'bar.html'").

  2. We only need to preserve the behavior of the above code. We otherwise should do whatever's simplest and requires the least spec contortions, since this is legacy behavior anyway.

Previously, we had been working toward (1). Firefox is very good about (1), in particular. And the spec follows them.

Over the last few years, the pendulum has swung a little bit more toward (2). Chrome and Safari have not made movements toward improving incumbent support. And I think when they went to implement finalization registry callbacks, Firefox decided to pass on incumbent support there too? Not sure.

Side note: (2) might be a little dangerous for the sourceDocument usage of incumbent (discussed in the doc), since it changes which document's security policies apply to a navigation. But, probably it's not a serious problem, since I think you can only use it to "impersonate" same-origin frames, and possibly only cooperating same-origin frames.


How does this relate to active script? Well, as you note, they're distinct mechanisms, but basically the same question. Right now active script only influences import() URL resolution. (I think.) But it raises these sorts of questions: given code like

<!-- https://example.com/index.html -->
<script src="sub/path/foo.js"></script>
// sub/path/foo.js
import("./example1.mjs");
setTimeout(eval, 0, 'import(`./example2.mjs`)');

This will definitely import https://example.com/sub/path/example1.js. But should it import https://example.com/sub/path/example2.mjs, or https://example.com/example2.mjs? This is basically the same camp (1) vs. camp (2) question as above.

In https://github.com/whatwg/webidl/pull/902 I tried to move us more toward (1) for active script, but have mostly given up. Some discussions there show no browsers propagating the active script in the way that PR suggests they should.

annevk commented 1 month ago

As long as you can set document.domain it's same-site, but yeah.

@domenic I think that means that navigate() won't set referrer as you'd expect, but that's probably okay as you have quite a bit of control over it anyway these days.

nicolo-ribaudo commented 1 month ago

Thank you so much to both of you for the context. I will present this to the next TC39 plenary, showing the options we have and what the web platform needs. I still need to finish going through all the linked resources.

I do not consider in scope any change to the machinery that lets (0, eval)(someString) and setTimeout(eval, 0, someString) behave the same, and I will propose that we should let the web platform keep doing all the "weird things" that it needs to support location.href&similar.

However, given the current implementation divergence, I will propose that when it comes to ECMA-262 API, we update eval so that (0, eval)(someString) and call(eval, someString) will always behave the same (where call is a user-defined function that does (f, x) => f(x)).

domenic commented 1 month ago

I will propose that when it comes to ECMA-262 API, we update eval so that (0, eval)(someString) and call(eval, someString) will always behave the same

I want to make sure I understand what you're proposing.

Given the following:

function call(f, x) { f(x); }

import("./foo.mjs");
eval(`import("./foo.mjs");`);
(0, eval)(`import("./foo.mjs");`);
call(eval, `import("./foo.mjs");`);

Currently all of these behave the same: they use the script's base URL.

What case are you proposing to change?

Your original post involves 4 files and I haven't been able to understand the implications. But your latest reply makes it sound like you're changing the simple cases above.

nicolo-ribaudo commented 1 month ago

I'm proposing to change the last two lines. Direct eval (eval(import("./foo.mjs");)) is meant to capture the context from where its called, so a dynamic import inside of it should behave as a dynamic import with not eval involved.

The last two lines on the other hand use indirect eval, that is meant to not capture any context about where it is being called (except for what is needed for web-platform-specific features, such as the location.href example and the use case you explained in https://github.com/tc39/ecma262/issues/78). Specifically, the "bug" is that if you move function call(f, x) { f(x); }, which is a plain function with no special behavior, to a different file then the behavior of your code changes.

domenic commented 1 month ago

OK. Well, I don't object, but I think everyone should carefully consider the implications of such changes, and of moving away from what is currently interoperable behavior, at least between Firefox and Chrome. Having all four lines behave the same is more in line with what I would expect. But it's probably web compatible to change such edge cases, if implementers are up for it.