eclipse-platform / eclipse.platform.swt

Eclipse SWT
https://www.eclipse.org/swt/
Eclipse Public License 2.0
118 stars 139 forks source link

Multiple tests failing in Test_org_eclipse_swt_browser_Browser #1564

Open fedejeanne opened 3 weeks ago

fedejeanne commented 3 weeks ago

These 3 tests fail on Linux. I noticed in this run of https://github.com/eclipse-platform/eclipse.platform.swt/pull/1496 and I was also able to reproduce it locally using WSL2:

It seems to be some issue with JavaScript (just my guess), here are some screenshots (WSL2):

image

image

image

All 3 tests timed out.

jukzi commented 3 weeks ago

"More" is a comparison. Did you mean "Multiple" which is a numeric? Is the failure always reproduceable or random?

fedejeanne commented 3 weeks ago

"More" is a comparison. Did you mean "Multiple" which is a numeric? Is the failure always reproduceable or random?

I originally meant "more" because I wanted to mention the other tests that are failing (for which there is already an issue) but I changed my mind to avoid confusion. I changed the title now, thank you for the pointer.

The tests fail very consistently when I run them locally. I am not sure about the failures in the pipelines though, I've only seen them failing in my own PRs.

HeikoKlare commented 2 weeks ago

I can reproduce the failures locally on Ubuntu 24.04.1, Webkit 2.46.1. Tests fail with both X11 and Wayland. When debugging the code, I found that the JavaScript used to open a child window is not even executed. JavaScript is properly enabled for the browser.

When I use the WebKit browser inside Eclipse and navigate to an HTML file that contains the same code as passed to the browser in the test, a child window pops up as expected. So may the problem be related to some (new) restriction regarding what WebKitGTK#webkit_web_view_load_html() (called by Browser#setText()) is allowed to do?

HannesWell commented 2 weeks ago

In tonight's I-build these tests now fail for all java versions 17,21 and 23. In the last night's it only failed for 17: https://download.eclipse.org/eclipse/downloads/drops4/I20241107-1800/testResults.php

I cannot tell if this is related to switching to Ubunut based images for Linux (https://github.com/eclipse-platform/eclipse.platform.releng.aggregator/pull/2522) or if this is all just coincidence.

akurtakov commented 2 weeks ago

In tonight's I-build these tests now fail for all java versions 17,21 and 23. In the last night's it only failed for 17: https://download.eclipse.org/eclipse/downloads/drops4/I20241107-1800/testResults.php

I cannot tell if this is related to switching to Ubunut based images for Linux (eclipse-platform/eclipse.platform.releng.aggregator#2522) or if this is all just coincidence.

I believe it's related (e.g. different webkitgtk version) but failing constantly is better than randomly.

HeikoKlare commented 6 days ago

I can now confirm that an update of webkit introduced these test failures. I just downgraded from version 2.46.1 to 2.44.0 and the tests run fine again. More precisely, I downgraded the following libraries on an Ubuntu 24.04 system (target version: 2.44.0-2, original version: 2.46.1-0ubuntu0.24.04.1):

akurtakov commented 6 days ago

I can't notice anything in webkitgtk news file but it doesn't seem to contain changes in webkit engine itself but only changes to the "bindings" aka it's still to be figured where to find the changes in the engine itself.

HeikoKlare commented 6 days ago

Indeed. I have not yet understood where the event flow stops. It does not seem to be missing JavaScript execution (as I indicated first), but rather with subsequent event processing. I also found a warning being logged with the current WebKit version that may direct to the problem (the "child" browser not being instantiated correctly):

** (SWT:32751): WARNING **: 09:27:41.245: WebKitWebView returned by WebKitWebView::create signal was not created with the related WebKitWebView

I have extracted one of the test cases into a standalone snippet (but the essence of the snippet is the same for all failing tests, so the root cause should be the same):

public class OpenChildBrowser {
    public static void main(String[] args) {
        Display display = new Display();
        final Shell shell = new Shell(display);
        shell.setLayout(new FillLayout());
        final Browser browser = new Browser(shell, SWT.NONE);
        Shell childShell = new Shell(shell, SWT.None);
        childShell.setText("Child shell");
        childShell.setLayout(new FillLayout());
        final Browser browserChild = new Browser(childShell, SWT.NONE);
        browser.addOpenWindowListener(event -> {
            System.out.println("Open Window Listener");
            event.browser = browserChild;
        });
        browserChild.addVisibilityWindowListener(VisibilityWindowListener.showAdapter(event -> {
            System.out.println("Open shell");
            childShell.open();
            browserChild.setText("Child Browser");
        }));
        shell.open();
        browser.setText("""
                <html>
                    <script type='text/javascript'>
                        var newWin = window.open('about:blank');
                    </script>
                    <body>
                        This test uses javascript to open a new window.
                    </body>
                </html>
                """);
        while (!shell.isDisposed()) {
            if (!display.readAndDispatch())
                display.sleep();
        }
        display.dispose();
    }
}

The flow stops between processing the window opening event of the "main" browser (setting the event.browser to a child browser) and the listener on the child browser. Note: Similar implementations may be used in productive code and not only in tests, so this may present as an actual regression for consumers and not only as test failures. I do not see that there is something wrong/illegal in the test code. That's why I also changed the issue title to cover that this is not "only" a test issue. Feel free to improve the title with something more fitting/expressive.

This is the log output of above snippet for WebKit 2.44.0 (the second line is an added println to WebKit#webkit_web_view_ready()):

open window listener
webview ready
open shell

This is the log output of above snippet for WebKit 2.46.1:

open window listener

** (SWT:32751): WARNING **: 09:27:41.245: WebKitWebView returned by WebKitWebView::create signal was not created with the related WebKitWebView

Unfortunately, I can currently not put more effort into this. Whoever wants to take over may find these commands useful (if on Ubuntu 24.04) to switch between the two WebKit versions for testing purposes:

Switch to current WebKit (2.46.1):

apt-get install libwebkitgtk-6.0.4=2.46.1-0ubuntu0.24.04.1 libjavascriptcoregtk-6.0.1=2.46.1-0ubuntu0.24.04.1

Switch to previous WebKit (2.44.0):

apt-get install libwebkitgtk-6.0.4=2.44.0-2 libjavascriptcoregtk-6.0.1=2.44.0-2 libwebkit2gtk-4.1.0=2.44.0-2 libjavascriptcoregtk-4.1.0=2.44.0-2
fedejeanne commented 3 days ago

I was also looking at this and debugging side to side WebKit 2.32.4 (Ubuntu 18.04.5 LTS) vs WebKit 2.46.1 (Ubuntu 24.04 LTS). All I can tell is that for 2.46.1 some WindowEvent never reaches the queue Synchronizer::messages and therefore the VisibilityWindowListener created in this snippet will never be called. This WindowEvent is supposed to be created right after the OpenWindowListener from the snippet is executed but that is exactly where the error pops up:

** (SWT:32751): WARNING **: 09:27:41.245: WebKitWebView returned by WebKitWebView::create signal was not created with the related WebKitWebView

Looking at the documentation of create, webkit_web_view_new_with_related_view and WebKitProcessModel and judging by the fact that set_process_model has been deprecated in version 2.40 my best guess is that something changed in the default values and now one has to somehow bond the parent webview and the child webview together, as it is mentioned in the documentation of create...

When using WEBKIT_PROCESS_MODEL_MULTIPLE_SECONDARY_PROCESSES process model, the new WebKitWebView should be related to web_view to share the same web process, see webkit_web_view_new_with_related_view() for more details.

... which wasn't necessary before. But then again, this is just a guess and it relies on the fact that the default value of the WebKitProcessModel was WEBKIT_PROCESS_MODEL_SHARED_SECONDARY_PROCESS.

If someone knows how to check if older versions of webkit were in fact using WEBKIT_PROCESS_MODEL_SHARED_SECONDARY_PROCESS while the newer version uses WEBKIT_PROCESS_MODEL_MULTIPLE_SECONDARY_PROCESSES then please share this info here.

That's as far as I could get. Hopefully someone with more knowledge can help understand and solve this issue.

akurtakov commented 1 day ago

I have tried using WEBKIT_PROCESS_MODEL_SHARED_SECONDARY_PROCESS (via calling webkit_web_context_set_process_model) on webkitgtk 2.46.3 but it failed(no change in behavior) with :

** (SWT:1131600): WARNING **: 10:43:14.640: WEBKIT_PROCESS_MODEL_SHARED_SECONDARY_PROCESS is deprecated and has no effect
fedejeanne commented 1 day ago

Thank you for testing it!

Would it be possible for you to bind the method get_process_model and provide the output for the old and the new version of WebKit?

If not, any hint on how to bind the method myself would be appreciated. If I can bind it and call it, I can provide the output myself, but binding is sadly still beyond my ken 😅

akurtakov commented 1 day ago

@fedejeanne https://github.com/akurtakov/eclipse.platform.swt/commit/e5e9387408836842aed9f383681d337707015084 (edited to point to fixed commit) and LOCAL REBUILD OF NATIVES is all you need. As per https://webkitgtk.org/reference/webkit2gtk/stable/enum.ProcessModel.html - printing 0 is shared, 1 is multiple processes. On my webkitgtk 2.46.3 I get "Process Model:1".

fedejeanne commented 1 day ago

@akurtakov thank you for the help, I got it to run and I called get_process_model.

I tried with the following versions of WebKit:

All of them in Ubuntu 24.04.

This is what I found out:

The last 2 points show that using

if (browser != null && !browser.isDisposed ()) {
    webView = WebKitGTK.webkit_web_view_new_with_related_view( ((WebKit)browser.webBrowser).webView);
    return webView;
}

... instead of

if (browser != null && !browser.isDisposed ()) {
    return ((WebKit)browser.webBrowser).webView;
}

... does not work, it actually breaks the older implementations too.

But maybe you could try something in the same direction and see if we can successfully bind the child and the parent browser?

akurtakov commented 1 day ago

Oops, the part that broke for you is a local experiment (totally unsuccessful for now) that was not supposed to end up in this commit.

fedejeanne commented 14 hours ago

Oops, the part that broke for you is a local experiment (totally unsuccessful for now) that was not supposed to end up in this commit.

No worries, it was easy to spot it and I appreciate the experimentation you're doing :-)