seleniumbase / SeleniumBase

📊 Python's all-in-one framework for web crawling, scraping, testing, and reporting. Supports pytest. UC Mode provides stealth. Includes many tools.
https://seleniumbase.io
MIT License
4.45k stars 908 forks

When there are multiple tabs, uc_open* methods change the order of elements in driver.window_handles and change driver.current_window_handle #2808

Closed tfiwm closed 1 month ago

tfiwm commented 1 month ago

Thanks for this amazing framework. I hope I am not misinterpreting anything in the following analysis. When I call any of the uc_open* methods, e.g. uc_open(), it changes the order of tabs/windows in driver.window_handles and also sets the very first tab as driver.current_window_handle. As a result, URLs are opened in unexpected tabs. If I change uc_open() to plain open(), everything works as expected.

Please take a look at the following example.

from seleniumbase import SB
from seleniumbase import BaseCase


def do_it(action, *args):
    # Run the action, then list every window handle and mark the active one.
    print(f"Action: {action.__name__}, Args: {args}")
    action(*args)
    for index, window in enumerate(sb.driver.window_handles):
        state = "ACTIVE" if window == sb.driver.current_window_handle else "INACTIVE"
        print(f"Index: {index}, Handle: {window}, {state}")


with SB(uc=True, incognito=False, test=False, chromium_arg="--disable-features=PrivacySandboxSettings4") as sbt:
    sb: BaseCase = sbt  # module-level name that do_it() reads
    do_it(sb.uc_open, "https://www.facebook.com")
    do_it(sb.open_new_window, True)
    do_it(sb.uc_open, "https://www.google.com")
    do_it(sb.uc_open, "https://www.w3schools.com")
    do_it(sb.uc_open, "https://www.google.com")

The first call, which opens facebook.com, works as expected and reports one window handle, as shown below.

Action: <lambda>, Args: ('https://www.facebook.com',)
Index: 0, Handle: B031D90B8F0E60C894BFA589628A2DB6, ACTIVE

Then I open another empty tab and that is also fine. Now I have two tabs B031D90B8F0E60C894BFA589628A2DB6 and C3C091193C1B3CB8058DA013FB26058B and the last tab at index 1 is the driver.current_window_handle.

Action: open_new_window, Args: (True,)
Index: 0, Handle: B031D90B8F0E60C894BFA589628A2DB6, INACTIVE
Index: 1, Handle: C3C091193C1B3CB8058DA013FB26058B, ACTIVE

But then, I call uc_open() to open google.com and it opens google.com in the second/last browser tab as expected. However, after the call, the last browser tab C3C091193C1B3CB8058DA013FB26058B moves to index 0 and it is not the current_window_handle. Instead, the first browser tab B031D90B8F0E60C894BFA589628A2DB6 becomes the current_window_handle and it is moved to index 1 for some reason.

Action: <lambda>, Args: ('https://www.google.com',)
Index: 0, Handle: C3C091193C1B3CB8058DA013FB26058B, INACTIVE
Index: 1, Handle: B031D90B8F0E60C894BFA589628A2DB6, ACTIVE

Due to the above change, when I try to open w3schools.com, it opens the URL in the current_window_handle, which turns out to be the first browser tab B031D90B8F0E60C894BFA589628A2DB6. So, instead of opening w3schools.com in the second/last browser tab, the URL is opened in the first tab. The list of tabs now looks like this.

Action: <lambda>, Args: ('https://www.w3schools.com',)
Index: 0, Handle: C3C091193C1B3CB8058DA013FB26058B, INACTIVE
Index: 1, Handle: B031D90B8F0E60C894BFA589628A2DB6, ACTIVE

Then I opened google.com again, and that changed the ordering of tabs in driver.window_handles back to how it was immediately after opening the empty browser tab. Now, if I were to open another URL, it would open in the second browser tab, because the second browser tab C3C091193C1B3CB8058DA013FB26058B is the current_window_handle.

Action: <lambda>, Args: ('https://www.google.com',)
Index: 0, Handle: B031D90B8F0E60C894BFA589628A2DB6, INACTIVE
Index: 1, Handle: C3C091193C1B3CB8058DA013FB26058B, ACTIVE

Platform: Windows 11 Home 23H2, Python 3.12.0, seleniumbase 4.27.0, Chrome Version 125.0.6422.113 (Official Build) (64-bit)
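
The two symptoms reported above (reordered handles and a changed active handle) can be checked mechanically rather than by reading the printed lists. The following is a minimal sketch; `TabSnapshot` and `describe_change` are hypothetical names introduced here for illustration and are not part of SeleniumBase or Selenium. The sample values are the handles from the output above, shortened to their first four characters.

```python
from typing import List, NamedTuple


class TabSnapshot(NamedTuple):
    """Order of window handles plus the active handle at one moment."""
    handles: List[str]
    active: str


def describe_change(before: TabSnapshot, after: TabSnapshot) -> List[str]:
    """Return notes on what differs between two snapshots."""
    notes = []
    same_tabs = sorted(before.handles) == sorted(after.handles)
    if same_tabs and before.handles != after.handles:
        notes.append("order changed")
    if before.active != after.active:
        notes.append("active handle changed")
    return notes


# Handles copied from the report (shortened), before and after uc_open().
after_new_window = TabSnapshot(["B031", "C3C0"], "C3C0")
after_uc_open = TabSnapshot(["C3C0", "B031"], "B031")
print(describe_change(after_new_window, after_uc_open))
```

With a live session, a snapshot could presumably be taken with `TabSnapshot(list(sb.driver.window_handles), sb.driver.current_window_handle)` before and after each call.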

mdmintz commented 1 month ago

See https://github.com/seleniumbase/SeleniumBase/issues/2328 and https://github.com/seleniumbase/SeleniumBase/issues/2796#issuecomment-2124766188.

When UC Mode disconnects chromedriver from Chrome to avoid detection, all traces of Selenium are removed from the web browser, and tab ordering is lost along with them. That's one of the reasons why using multiple tabs is not recommended with UC Mode.
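
If multiple tabs cannot be avoided, one possible mitigation is to track a tab by its handle string instead of its index, and to re-select it explicitly after each uc_open* call. This is only a sketch under assumptions, not a recommended pattern: it assumes the handle strings themselves survive the disconnect and reconnect (the output reported earlier in this thread suggests they do, but nothing guarantees it), and `open_in_tab` is a hypothetical helper, not a SeleniumBase method. It relies only on Selenium's `window_handles` and `switch_to.window()`. `FakeDriver` is a stand-in that mimics the reported reordering so the sketch runs without a browser.

```python
def open_in_tab(driver, handle, open_func, url):
    """Open `url` via `open_func`, then re-select the tab `handle`.

    Hypothetical helper. Assumes `handle` keeps the same value across
    the UC Mode disconnect/reconnect.
    """
    driver.switch_to.window(handle)  # target the intended tab first
    open_func(url)  # e.g. sb.uc_open; may reorder handles
    if handle not in driver.window_handles:
        raise RuntimeError(f"Tab {handle} is no longer available")
    driver.switch_to.window(handle)  # restore the intended active tab


class _FakeSwitchTo:
    def __init__(self, driver):
        self._driver = driver

    def window(self, handle):
        self._driver.current_window_handle = handle


class FakeDriver:
    """Stand-in that mimics the reported reorder; not a real driver."""

    def __init__(self):
        self.window_handles = ["B031", "C3C0"]
        self.current_window_handle = "C3C0"
        self.switch_to = _FakeSwitchTo(self)

    def fake_uc_open(self, url):
        # Imitate the symptom: order flips, last handle becomes active.
        self.window_handles.reverse()
        self.current_window_handle = self.window_handles[-1]


demo = FakeDriver()
open_in_tab(demo, "C3C0", demo.fake_uc_open, "https://www.google.com")
print(demo.current_window_handle)
```

With a real session, the call would presumably look like `open_in_tab(sb.driver, handle, sb.uc_open, url)`, where `handle` was saved from `sb.driver.current_window_handle` right after the tab was created.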

Instead of using multiple tabs, use multiple drivers in UC Mode. Example:

from seleniumbase import SB

with SB(uc=True, test=True) as sb:
    url = "seleniumbase.io/demo_page"
    sb.driver.uc_open_with_reconnect(url)
    links = sb.get_unique_links()
    for link in links:
        # Open each link in its own UC Mode driver instead of a new tab.
        driver2 = sb.get_new_driver(undetectable=True)
        driver2.uc_open_with_reconnect(link)
        print(driver2.title)
        # Close the extra driver before moving on to the next link.
        sb.quit_extra_driver()