seleniumbase / SeleniumBase

📊 Python's all-in-one framework for web crawling, scraping, testing, and reporting. Supports pytest. UC Mode provides stealth. Includes many tools.
https://seleniumbase.io
MIT License

When proxy is used the headless=True does not work #2969

Closed vinifr closed 1 month ago

vinifr commented 1 month ago

Hello. I am trying to run in headless mode, but it does not work when I pass a proxy in the argument list.

OS: Debian 11
Python: 3.9
selenium: 4.20.0
undetected-chromedriver: 3.5.5
seleniumbase: 4.28.5

import undetected_chromedriver as uc
from seleniumbase import Driver
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.support import expected_conditions as EC

# Placeholder values: the report did not show how these were set.
proxy_host = "127.0.0.1"
proxy_port = 8080
proxyStr = f"{proxy_host}:{proxy_port}"

try:
    driver = Driver(uc=True, headless=True, log_cdp=True, proxy=proxyStr)
except Exception as e:
    print("driver error")
    print(e)
    # The report had a bare "return" here, which is only valid inside a function.
    raise SystemExit(1)

If I use "Driver(uc=True, headless=True, log_cdp=True)" instead, headless mode works.

mdmintz commented 1 month ago

There are several issues with your code.

  1. If you install a specific version of seleniumbase, then you must use the dependency versions that are set in the requirements / setup.py. Eg: https://github.com/seleniumbase/SeleniumBase/blob/v4.28.5/setup.py#L185 (You used seleniumbase 4.28.5, which required selenium 4.22.0, but instead you overwrote that with 4.20.0.)
  2. Things are updated all the time, so make sure you only submit issues if you can reproduce them on the latest version of seleniumbase, which is currently 4.29.3.
  3. seleniumbase has its own fork of undetected-chromedriver, which means that you should not be installing undetected-chromedriver separately, as that may override the version that comes with seleniumbase.
  4. When running on Linux, use the SB() format instead of the Driver() format so that you get the special virtual display that you need for UC Mode.
  5. headless mode isn't supported in UC Mode anymore because UC Mode uses pyautogui for a lot of things, and that doesn't work with a headless browser. (You don't need headless mode on Linux anymore because of the special virtual display.)
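
The first three points come down to keeping the installed packages in line with what seleniumbase pins. A rough way to check that, sketched with the standard library only: the helper names `parse_pin` and `find_mismatches` are made up for this example, and the check ignores environment markers and extras, so treat a reported mismatch as a hint rather than proof.

```python
import re
from importlib import metadata

# Matches "name==version", optionally followed by an environment marker,
# and also the older "name (==version)" metadata style.
PIN_RE = re.compile(r"^\s*([A-Za-z0-9._-]+)\s*\(?\s*==\s*([^\s;)]+)")


def parse_pin(requirement):
    """Return (name, version) for an exact pin, or None for anything else."""
    match = PIN_RE.match(requirement)
    if match is None:
        return None
    return match.group(1).lower(), match.group(2)


def find_mismatches(dist_name="seleniumbase"):
    """Map package name -> (installed version, pinned versions) for mismatches.

    Raises metadata.PackageNotFoundError if dist_name itself is not installed.
    """
    allowed = {}
    for requirement in metadata.requires(dist_name) or []:
        pin = parse_pin(requirement)
        if pin is not None:
            # One package may be pinned differently per Python version,
            # so collect every pinned version instead of just the last one.
            allowed.setdefault(pin[0], set()).add(pin[1])

    mismatches = {}
    for name, versions in allowed.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            continue  # not installed here, e.g. excluded by a marker
        if installed not in versions:
            mismatches[name] = (installed, sorted(versions))
    return mismatches
```

Calling `find_mismatches()` in the environment from the report would be expected to flag selenium, since 4.20.0 was installed over the pinned version.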

In the future, show the full stack trace so that I can debug more easily. Saying "it does not work" isn't always as helpful by itself as it sounds.
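
For reference, printing only the exception object (as in the snippet above) drops the call stack. A minimal sketch of capturing the full stack trace with the standard `traceback` module follows; `describe_failure` is a hypothetical helper name, not part of seleniumbase.

```python
import traceback


def describe_failure(action):
    """Run action(); return None on success, else the full traceback text."""
    try:
        action()
    except Exception:
        # format_exc() returns the same text Python prints for an
        # uncaught exception, including every frame of the call stack.
        return traceback.format_exc()
    return None
```

Wrapping the driver creation in a function and passing it to `describe_failure` gives text that can be pasted into an issue as-is.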

vinifr commented 1 month ago

What I mean by "does not work" is that the browser opens with its graphical interface when I include the proxy option. If I don't include the proxy option, the browser runs in headless mode as it should.

mdmintz commented 1 month ago

See all the list items in https://github.com/seleniumbase/SeleniumBase/issues/2969#issuecomment-2254152049.

vinifr commented 1 month ago

All right. Just one question: can I run headless mode and UC Mode together with SB()?

I ask because I don't have GUI support in my VM, and I do not plan to install it.

mdmintz commented 1 month ago

The machine does not need a GUI due to the Xvfb virtual display. Use SB(uc=True, xvfb=True). Eg:

from seleniumbase import SB

with SB(uc=True, xvfb=True) as sb:
    url = "https://gitlab.com/users/sign_in"
    sb.uc_open_with_reconnect(url, 4)
    sb.uc_gui_click_captcha()
    sb.assert_text("Username", '[for="user_login"]', timeout=3)
    sb.assert_element('label[for="user_login"]')
    sb.highlight('button:contains("Sign in")')
    sb.highlight('h1:contains("GitLab.com")')
    sb.post_message("SeleniumBase wasn't detected", duration=4)
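
Tying this back to the original question, a sketch of the same SB() format with the proxy added might look like the following. The thread does not confirm how the proxy behaves in this setup, so this is an assumption to test rather than a verified fix. `build_proxy_string` and `open_with_proxy` are hypothetical helper names, and the host, port, and credentials are placeholders.

```python
def build_proxy_string(host, port, username=None, password=None):
    """Build "HOST:PORT" or "USER:PASS@HOST:PORT" for the proxy argument."""
    server = f"{host}:{port}"
    if username and password:
        return f"{username}:{password}@{server}"
    return server


def open_with_proxy(url, proxy):
    """Open url in UC Mode on a virtual display, routed through proxy."""
    # Imported here so the helper above can be used without seleniumbase.
    from seleniumbase import SB

    # xvfb=True supplies the virtual display, so headless=True is not used.
    with SB(uc=True, xvfb=True, proxy=proxy) as sb:
        sb.uc_open_with_reconnect(url, 4)
        return sb.get_title()
```

For example, `open_with_proxy("https://gitlab.com/users/sign_in", build_proxy_string("127.0.0.1", 8080))` would route the session through a local proxy on port 8080.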