web-platform-tests / wpt

Test suites for Web platform specs — including WHATWG, W3C, and others
https://web-platform-tests.org/

origin-isolation tests need to be run in separate browsing context groups #23364

Open domenic opened 4 years ago

domenic commented 4 years ago

Origin isolation has a browsing-context-group wide scope, for the reasons discussed here.

This means that if one of the origin isolation tests runs, and then the test runner navigates to a second origin isolation test, the second test will be impacted by what happens in the first test, invalidating the results.

I'm not sure how the test runner infrastructure works. Is there a guarantee that my tests will be run in separate browsing context groups? (E.g., separate top-level windows, with no opener relationships between them---including no common opener.)

Hexcles commented 4 years ago

separate top-level windows, with no opener relationships between them---including no common opener

I'd assume that's the case. @jgraham can you confirm?

Hexcles commented 4 years ago

(But for sure we don't restart the browser.)

domenic commented 4 years ago

It appears to not always be the case when using vpython third_party/blink/tools/run_web_tests.py in Chromium though, so perhaps this would become a crbug.

Hexcles commented 4 years ago

It appears to not always be the case when using vpython third_party/blink/tools/run_web_tests.py in Chromium though, so perhaps this would become a crbug.

That's highly possible. content_shell --run-web-tests manages lifecycles in a completely different and messy way...

domenic commented 4 years ago

I'll close this, since it seems like WPT is probably not at fault. On the blink side, I've resorted to using --restart-shell-between-tests=always when running the tests locally. On the test bots, they seem to not need that.

stephenmcgruer commented 4 years ago

From https://crbug.com/1097980 (and the associated PR https://github.com/web-platform-tests/wpt/pull/24259), it seems like maybe we are not putting tests in a separate browsing context, at least for stability checks? This needs to be investigated (unless @jgraham happens to know).

If it's just stability checks (as opposed to between different tests), it may not be too bad.

stephenmcgruer commented 4 years ago

Right, so for stability checks we run something akin to the following, which reproduces the problem:

./wpt run --no-restart-on-unexpected --rerun=2 --channel=experimental chrome origin-isolation/removing-iframes.sub.https.html

The --no-restart-on-unexpected flag means we do the following:

  1. Open https://web-platform.test:8443/testharness_runner.html in the browser
  2. Call window.open('about:blank', window_id, 'noopener') (the runner interpolates the window name via "%s" % self.window_id)
  3. Load the test URL in the new window (well, tab)
  4. After the test, close the tab.
  5. Again call window.open('about:blank', window_id, 'noopener')
  6. Load the test URL in the new window (well, tab)

So the original https://web-platform.test:8443/testharness_runner.html tab becomes a common opener, I presume.
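The spec-level rule at play in the steps above can be illustrated with a small model (the Window class and its bcg field are hypothetical names invented here; per the HTML Standard, noopener should place the new top-level browsing context in a fresh browsing context group, which is why the shared-state behaviour seen in Chromium looked like a bug rather than expected behaviour):

```python
# Toy model of browsing context group (BCG) assignment for window.open.
# All names here are illustrative; this only sketches the spec rule that
# 'noopener' yields a brand-new browsing context group.
import itertools

_bcg_ids = itertools.count()

class Window:
    def __init__(self, bcg=None):
        # A window with no opener starts its own browsing context group.
        self.bcg = bcg if bcg is not None else next(_bcg_ids)

    def open(self, noopener=False):
        # Without noopener, the new window joins the opener's BCG;
        # with noopener, it gets a fresh BCG (per spec).
        return Window(bcg=None if noopener else self.bcg)

runner = Window()                      # the testharness_runner.html tab
test_a = runner.open(noopener=True)    # step 2: first test window
test_b = runner.open(noopener=True)    # step 5: rerun window

# With noopener, the two test windows should not share a BCG:
assert test_a.bcg != test_b.bcg and test_a.bcg != runner.bcg

# Without noopener they would all share the runner's BCG, so state set by
# the first run (e.g. origin isolation) would leak into the rerun:
leaky_a = runner.open()
leaky_b = runner.open()
assert leaky_a.bcg == leaky_b.bcg == runner.bcg
```

Under this model the common opener only causes cross-test leakage if noopener is missing or not honored, which is what the rest of the thread digs into.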

Hexcles commented 4 years ago

So the original https://web-platform.test:8443/testharness_runner.html tab becomes a common opener, I presume.

Noob question: doesn't noopener mean the new tab doesn't have an opener?

stephenmcgruer commented 3 years ago

Domenic and I chatted about this today. In general noopener 'should' work, but Domenic put forward the reasonable argument that artificially severing the opener relationship is a little bit dicey when one is trying to test things at the window.open() and noopener layer itself.

Refreshing my memory on how this works: we actually reuse the same testharness runner tab where possible between all tests, I believe, not just when we are repeating tests. We only reload or change it when we need to switch between http/https, or if something else in the setup changes between tests. So one could also reproduce the problem for Chromium by having two tests doing the same thing, without using --rerun=2.

One possibility would be to use webdriver to fully reload or replace the testharness runner tab between each test, but I suspect there is concern about the speed of that. Another possibility would be to introduce (yet another) file-name flag to force webdriver to fully reload before running such tests.

I think I lean towards a special flag, but I'm still not clear on the concrete cases where using noopener is truly problematic. For crbug.com/1099718 it's technically the case (as far as I read it) that Chrome has a bug with noopener, not that noopener shouldn't work for the feature under test (origin isolation).
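The file-name flag being floated here could look something like the sketch below, modeled on existing wpt filename conventions such as ".https." and ".sub.". The flag name ".bcg." is invented purely for illustration; nothing in wpt defines it:

```python
# Hypothetical sketch: decide per test file whether the runner should give
# the test a fresh browsing context group. The ".bcg." flag is invented
# here; wpt's real flags include things like ".https." and ".sub.".
def needs_fresh_browsing_context_group(test_path: str) -> bool:
    filename = test_path.rsplit("/", 1)[-1]
    return ".bcg." in filename

# Flagged test: the runner would fully reload/replace the runner tab first.
assert needs_fresh_browsing_context_group(
    "origin-isolation/removing-iframes.bcg.sub.https.html")

# Unflagged test: the runner keeps its current (faster) behaviour.
assert not needs_fresh_browsing_context_group(
    "origin-isolation/removing-iframes.sub.https.html")
```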

cc @jgraham @gsnedders @Hexcles, and @domenic

domenic commented 2 years ago

https://github.com/web-platform-tests/wpt/issues/33590 may be another issue similar to this.

domenic commented 11 months ago

Close watcher tests are sort of running into this, at least on Chromium CI. They are thrown off if the window in question has user activation when the test starts, which seems to be the case sometimes in Chromium CI. The result is that Chromium CI will restart the test and the second time it will pass, because it isn't contaminated by other tests messing with user activation.

I presume other tests relating to user activation will run into this as well.
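The leakage described here follows from user activation having two bits with different lifetimes. A toy model (class and method names invented for illustration; the HTML Standard's sticky activation persists for the window's lifetime, while transient activation is consumable):

```python
# Toy model of why a reused window leaks user activation between tests.
# Names are illustrative; per the HTML spec, "sticky" activation lasts for
# the window's lifetime, so only a fresh window starts clean.
class WindowActivation:
    def __init__(self):
        self.has_been_active = False   # sticky bit
        self.is_active = False         # transient bit (simplified: no expiry)

    def user_gesture(self):
        self.has_been_active = True
        self.is_active = True

    def consume_transient(self):
        self.is_active = False

w = WindowActivation()
w.user_gesture()          # test 1 performs a click
w.consume_transient()     # test 1 consumes transient activation

# Test 2 reuses the same window and still observes sticky activation:
assert w.has_been_active
# A genuinely fresh window would not:
assert not WindowActivation().has_been_active
```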

@marcoscaceres's https://github.com/web-platform-tests/wpt/pull/37176 would probably help here, but it seems stalled. (Also, technically, we'd need a new API to consume history-action activation for those, which means a whole new iteration of the Process which stalled out last time.)

marcoscaceres commented 7 months ago

Sent https://github.com/web-platform-tests/wpt/pull/40699 ... but still needs review (and I guess someone to wire up the bits for the respective browsers)

jgraham commented 6 months ago

So there seem to be two slightly different (albeit related) issues being discussed in the thread, browsing context group isolation, and user activation.

In terms of browsing context groups: for testharness tests, both the WebDriver and Marionette executors in wptrunner close the old test windows after each test completes, and the new window that's opened is expected to be a fresh browsing context group. (I was going to say this is controlled by the close_after_done option, but it looks like that's ignored and we close the windows unconditionally in those executors; it seems this changed with the testdriver implementation, but I don't really know why.)

So I think, de facto, we always get a new browsing context group for each test when using those implementations. But it isn't guaranteed. So we have two options:

WeizhongX commented 6 months ago

@domenic, it looks like we will always get a new browsing context group for testharness tests on Chrome. Can you explain why you are still seeing a problem? Are you requesting such a capability for other test types as well?

@jgraham, thanks for putting in the comments. Maybe we could measure how much more time it will take to create a new test window, then decide which way we want to go? I will give that a try.
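A skeleton for the measurement suggested above might look like this. The open_new_window and reuse_window functions here are stubs; in a real experiment they would issue WebDriver commands (e.g. the "New Window" command versus navigating an existing window) against the browser under test:

```python
# Skeleton for timing per-test window creation vs. reusing a window.
# The two operations are stubbed with sleeps; replace them with real
# WebDriver calls to get meaningful numbers.
import time

def open_new_window():
    time.sleep(0.002)   # stub for the new-window WebDriver round trip

def reuse_window():
    time.sleep(0.0005)  # stub for navigating an existing window

def mean_seconds(fn, runs=20):
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs

fresh = mean_seconds(open_new_window)
reused = mean_seconds(reuse_window)
print(f"fresh: {fresh * 1000:.2f} ms/test, reused: {reused * 1000:.2f} ms/test")
```

The interesting output is the difference multiplied by the number of tests in a full run, which is what would justify (or rule out) always creating a fresh window.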

domenic commented 6 months ago

So there seem to be two slightly different (albeit related) issues being discussed in the thread, browsing context group isolation, and user activation.

Right. To be clear, solving browsing context group isolation would also solve user activation, but the other way around does not hold.

In terms of browsing context groups, for testharness tests wptrunner both the WebDriver and marionette executors close the old test windows after each test is completed, and the new window that's opened is expected to be a fresh browsing context group

The other thing required for browsing context group isolation is that there are no opener relationships. Do you know if that is the case? I believe at some point there was a sort of "controller" window, which opened the tests in other windows using window.open() or something like it, which meant that all the tests were in a shared browsing context group.

@domenic, it looks like we will always get a new browsing context group for testharness tests on Chrome. Can you explain why you are still seeing a problem? Are you requesting such a capability for other test types as well?

I cannot explain why I am still seeing a problem. All I know is that there are at least two test suites in Chromium which I have had to move to virtual test suites with --reset-browsing-instance-between-tests to get them to pass.

It's also worth noting that we had one large set of tests which were failing on wpt.fyi, but we worked around them by making the tests uglier in https://github.com/web-platform-tests/wpt/commit/0a5ef44bf2e1c9be8f50beee9936771d5adcd35b . I'd suggest checking out that directory prior to that commit and debugging to see what is going wrong.