mozilla / geckodriver

WebDriver for Firefox
https://firefox-source-docs.mozilla.org/testing/geckodriver/
Mozilla Public License 2.0

Page is not interactive while the Firefox window is overlapped by another window #2040

Closed ilya-corp closed 1 year ago

ilya-corp commented 1 year ago

System

Testcase

I have a Selenium test that opens three browser instances and manipulates elements in them. My problem is the following: if the Firefox window is overlapped by another application (for example, MS Word is open and covers the browsers), the tested app does not work properly: it does not respond to network events. BUT if I move Firefox to the front OR move the mouse to the Windows taskbar to display the application thumbnail, Firefox works as expected. If I run my Selenium tests in headless mode, I do NOT have this problem either. But when I run a test from the debugger, I need to move the mouse to the taskbar to kick the browser. I do not have this problem in Chrome.

It looks like some kind of optimization, but such optimizations should not be active during Selenium tests.

CasperHooft commented 1 year ago

I have the exact same issue.

whimboo commented 1 year ago

Could you please attach a trace-level log from geckodriver for the problematic Firefox instance? That might help to get further information. So far it's not something that I have seen during all the time I work on our WebDriver implementation. Maybe you also have a code snippet for reproduction?

Gerald94 commented 1 year ago

trace-log-geckodriver.txt
Here is the trace log; I have exactly the same problem.

whimboo commented 1 year ago

@Gerald94 can you please point to a specific line of code which fails including an element reference (id, xpath, css class name or whatever)? This would help me a lot.

Note that in your case a lot of JavaScript gets executed, including clicks on elements. So there is a potential risk that the framework you are using has a bug. There is clearly the WebDriver:ElementClick command that should be used instead.

Gerald94 commented 1 year ago

Yes sure, the test always fails on the same code line: "WebDriver:FindElements",{"using":"xpath","value":"//content-stammdaten-card"}]

If the Firefox window is in the foreground, the test is green.

whimboo commented 1 year ago

Out of interest which kind of element is that? Is it a popup / overlay or one that gets lazily loaded via AJAX? Could you maybe create a minimized testcase to reproduce it?

Gerald94 commented 1 year ago

It is a normal div element. I tried to click other elements but got the same error, so it is hard to reproduce. To me it seems that geckodriver loses the connection after some time when the browser is placed in the background.

whimboo commented 1 year ago

That's not the case, at least for the trace log that you uploaded. The element with the XPath //content-stammdaten-card is not found in the page, and then your code retries the same step for about 2s before a screenshot is taken and the session is deleted. So the question is: what happens when you extend this timeout to e.g. 10s and make the Firefox window visible after 5s?
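Editor's note: the experiment suggested here (retry the lookup for about 10s instead of 2s) can be sketched with a small polling helper. This is a minimal sketch, not the reporter's test code: `wait_until` and its parameters are hypothetical names, and `condition` stands in for whatever element lookup the test performs.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass.

    `condition` is a hypothetical zero-argument callable, for example a
    wrapper around the element lookup that currently fails.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout}s")
        # Sleep briefly so the browser is not flooded with lookups.
        time.sleep(interval)
```

In a Selenium test, the built-in `WebDriverWait(driver, 10).until(...)` plays the same role; the helper above only makes the retry loop explicit.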

Gerald94 commented 1 year ago

I have now set the timeout to 10s. The element is still not found at first; after 5s I bring the Firefox window to the front, and then it finds the element.

moesfeld commented 1 year ago

I can confirm this. I'm working on anonymizing the trace logs. But in the logs it looks like the element is simply not in the viewport. As soon as Firefox is in the foreground (really meaning the visual foreground in the window context), the element can be scrolled into view and clicked.

moesfeld commented 1 year ago

This is the trace log with the window obscured (fail): TraceWindowObscured.txt
This is the trace log with the window unobscured (success): TraceWindowNotObscured.txt

I will try to pinpoint the version in which this behaviour was introduced. It might even depend on the Firefox version. I'm already using WebDriver:ElementClick. I wonder why Chrome handles it without issues.

Edit:
OS Version: Windows 10
Platform: Windows x64
Firefox: 104.0.2
Geckodriver: 0.31.0
Selenium: 4.4.0

whimboo commented 1 year ago

Please note that the last trace log with the window obscured shows a different failure. It's not that the NotesArticleIcon element cannot be found but it's not reachable for issuing a click:

1663831777969   webdriver::server   DEBUG   <- 400 Bad Request {"value":{"error":"element not interactable","message":"Element <div class=\"NotesArticleIcon js-initialized\"> could not be scrolled into view","stacktrace":"RemoteError@chrome://remote/content/shared/RemoteError.jsm:12:1\nWebDriverError@chrome://remote/content/shared/webdriver/Errors.jsm:192:5\nElementNotInteractableError@chrome://remote/content/shared/webdriver/Errors.jsm:302:5\nwebdriverClickElement@chrome://remote/content/marionette/interaction.js:156:11\ninteraction.clickElement@chrome://remote/content/marionette/interaction.js:125:11\nclickElement@chrome://remote/content/marionette/actors/MarionetteCommandsChild.jsm:204:29\nreceiveMessage@chrome://remote/content/marionette/actors/MarionetteCommandsChild.jsm:92:31\n"}}

Is that a regression? If yes, in which version of Firefox did this problem exactly start? I tried locally but I cannot reproduce it yet. So I would appreciate a minimized code and test example.

moesfeld commented 1 year ago

Hi, it seems to be a regression. I'm working on a complete code sample, but it's hard to pinpoint a certain combination. For now it seems to depend on both the geckodriver and Firefox versions. I know hard facts are needed, but I have a feeling that it's more up to Firefox than geckodriver. I'll get back to this after the weekend, I guess.

whimboo commented 1 year ago

Thanks! It would be helpful to not only get the regression range for the release but maybe a check if the beta 1 release of that version is also affected. Can you reproduce your problem locally? If yes mozregression is a great help in combination with the -c argument. Then you can use the MOZREGRESSION_BINARY environment variable to specify the version of Firefox in your test.
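Editor's note: a test script driven by mozregression's command mode might look roughly like the sketch below. It relies only on what is stated in this thread: the Firefox path arrives in the MOZREGRESSION_BINARY environment variable, and an exit code of 0 marks the build as good while 1 marks it as bad. The `exit_code` function and the `check` callable are hypothetical names; `check` stands in for the actual Selenium test.

```python
import os


def exit_code(check, environ=None):
    """Map a test result to a mozregression-style exit code.

    `check` is a hypothetical callable that receives the Firefox binary
    path and returns True when the build behaves correctly.
    """
    environ = os.environ if environ is None else environ
    binary = environ.get("MOZREGRESSION_BINARY")
    if not binary:
        # Outside of mozregression there is no build to test.
        raise RuntimeError("MOZREGRESSION_BINARY is not set")
    # 0 marks the build as good, 1 marks it as bad.
    return 0 if check(binary) else 1
```

A real script would end with something like `sys.exit(exit_code(my_check))`, where `my_check` starts the Selenium test against the given binary.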

moesfeld commented 1 year ago

It is locally reproducible, but I'm having a bit of trouble with mozregression. It seems the env variable is only set when using the automated --command argument. Since I have to do the reproduction manually (overlapping window, test case always returning 0...), I'd rather use the manual mode or even the mozregression GUI. But I will figure it out eventually. It just takes a bit more time.

whimboo commented 1 year ago

Wouldn't it help to add some delays in the test code so that the test doesn't start immediately? Would that give you enough time to prepare Firefox for this situation?

moesfeld commented 1 year ago

Figured out how to automate it and got some first results. Nifty tool! I'll try to dig deeper:

9:02.06 INFO: Test command result: 0 (build is good)
9:02.06 INFO: Narrowed integration regression window from [f3da8920, 4dc0e294] (3 builds) to [a4739fdb, 4dc0e294] (2 builds) (~1 steps left)
9:02.06 INFO: No more integration revisions, bisection finished.
9:02.06 INFO: Last good revision: a4739fdb1d1cffb06d5971a6fb6901c1b3b41d26
9:02.06 INFO: First bad revision: 4dc0e294453b58a010a1a78504fdbd8488c57a37
9:02.06 INFO: Pushlog: https://hg.mozilla.org/releases/mozilla-release/pushloghtml?fromchange=a4739fdb1d1cffb06d5971a6fb6901c1b3b41d26&tochange=4dc0e294453b58a010a1a78504fdbd8488c57a37

whimboo commented 1 year ago

Hm, that's interesting. With a quick look it seems to be something that has been changed in Firefox 98. I'm curious to see a more fine-grained result given that there are a lot of commits left over to test for mozregression. Thanks for doing that!

moesfeld commented 1 year ago

I checked the Nightlies:

10:53.04 INFO: Test command result: 1 (build is bad)
10:53.04 INFO: Narrowed integration regression window from [adb05036, 442a7912] (3 builds) to [adb05036, a297bc48] (2 builds) (~1 steps left)
10:53.04 INFO: No more integration revisions, bisection finished.
10:53.04 INFO: Last good revision: adb050364e8646f1a9efa890a5aaa7d71e6c3c3b
10:53.04 INFO: First bad revision: a297bc48dfeb404fcb9ada705d89d67e81cfd7bb
10:53.04 INFO: Pushlog: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=adb050364e8646f1a9efa890a5aaa7d71e6c3c3b&tochange=a297bc48dfeb404fcb9ada705d89d67e81cfd7bb

The only commit was Bug 1733423, but this is not inside the first bisection. Weird! I wonder if I had false positives in the first run.

whimboo commented 1 year ago

Bug 1733423 doesn't really make sense given that this is for Android only.

moesfeld commented 1 year ago

That's what I thought. But I checked the logs, and in the regression window there are some builds that don't respond to geckodriver at all. So some changes might have been skipped because they were already considered to be "on the dark side". I will have a look at that.

moesfeld commented 1 year ago

Sorry, but I think you can disregard what I wrote. I have to rethink my testcase. There are too many factors in it that create false negatives. I will have another go when I have sorted that out.

brycetham commented 1 year ago

I encountered this issue today and was wondering if there was any movement on this.

I can confirm that it works on Firefox 97.0.2 but not in Firefox 98.

Like OP said, it's fine if I bring it to the foreground or if I hover over the window in the taskbar.

ivorobkalo commented 1 year ago

Same problem here.

But I guess you guys are digging in the wrong direction: it seems the issue is that FF started to throttle rendering of background windows/tabs. So the issue is not about geckodriver but about finding a way to tell FF that we want to disable all the throttling.

We run about 8 instances of FF simultaneously and face this issue regularly.

moesfeld commented 1 year ago

Sorry for the long wait. I was on vacation for a couple of weeks. You might be onto something here. Firefox 96 introduced throttling of occluded windows. It might be as easy as setting a config parameter. I will check that.

ivorobkalo commented 1 year ago

@moesfeld I've tried different config settings but found no way to disable throttling. If you find that "magic" config setting, please let others know.

moesfeld commented 1 year ago

@ivorobkalo I can confirm setting widget.windows.window_occlusion_tracking.enabled to false did the trick for me.
Firefox 106.1, Geckodriver 0.31.0, Win 10, Selenium 4.4.0
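Editor's note: one way to apply this workaround is to pass the preference through the `moz:firefoxOptions` capability when the session is created. The sketch below only builds the New Session request body as plain data; the function name `build_new_session_payload` and the `extra_prefs` parameter are hypothetical, and only the preference name comes from this thread.

```python
import json

# Preference name taken from the comment above; the rest is illustrative.
OCCLUSION_PREF = "widget.windows.window_occlusion_tracking.enabled"


def build_new_session_payload(extra_prefs=None):
    """Return a WebDriver New Session body that disables occlusion tracking."""
    prefs = {OCCLUSION_PREF: False}
    # Caller-supplied preferences are merged in and may override the default.
    prefs.update(extra_prefs or {})
    return {
        "capabilities": {
            "alwaysMatch": {
                "browserName": "firefox",
                "moz:firefoxOptions": {"prefs": prefs},
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_new_session_payload(), indent=2))
```

With the Selenium Python bindings, the equivalent is usually a call to `set_preference` on the Firefox `Options` object with the same name and the value `False`.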

brycetham commented 1 year ago

Oh neat this works! Thanks for looking into it.

Gerald94 commented 1 year ago

Thank you very much, it also works for me now!

ivorobkalo commented 1 year ago

Thanks a lot!

moesfeld commented 1 year ago

@ilya-corp Can you also confirm?

dhumx commented 1 year ago

I have the exact same issue but even with the suggested setting configuration of @moesfeld the issue still occurs.

Update: Sorry, I think it works for me now, I changed widget.windows.window_occlusion_tracking.enabled key not widget.windows.window_occlusion_tracking.enabled to false. I did the latter and it now works on my end.

whimboo commented 1 year ago

Thank you all for the details. I filed https://bugzilla.mozilla.org/show_bug.cgi?id=1802473 for further investigation on our side. Until we have a fix, the workaround proposed here is a good option.

whimboo commented 1 year ago

We have a patch ready for this issue and would like to get some feedback from you if it is working as expected. Therefore please download a try build of Firefox Nightly that has this preference set. To download select the "Artifacts and Debugging Tools" tab in the lower pane of that page and then download target.zip. Just extract this build and it is ready for use - no need to install.

Thanks in advance!

whimboo commented 1 year ago

We landed our patch in Firefox Nightly and it will ride the train with the 109 release, which will soon be on beta.