seleniumbase / SeleniumBase

📊 Python's all-in-one framework for web crawling, scraping, testing, and reporting. Supports pytest. UC Mode provides stealth. Includes many tools.
https://seleniumbase.io
MIT License

Using xvfb=True without Headless mode does not work with Docker #2982

Closed Orbiszeus closed 2 months ago

Orbiszeus commented 2 months ago

Dear Michael, I have read the other issues, where you advised using xvfb=True (for proxy reasons) so that Headless mode is no longer needed with the new update. However, when I use it here:

from seleniumbase import SB

def menu_crawler(url, is_area):
    menu_items = []
    with SB(uc=True, xvfb=True, locale_code="tr") as sb:
        sb.driver.uc_open_with_reconnect(url, 20)
        try:
            print("Locale Code: " + str(sb.get_locale_code()))
            print(sb.save_screenshot_to_logs(name=None, selector=None, by="css selector"))
            print("Page title: " + str(sb.get_title()))
            sb.sleep(5)
            # sb.uc_gui_handle_cf()
            sb.sleep(5)
            # (the rest of the function, including the except clause, was not included in the report)

It does not open the page. With headless=True, at least it was opening the webpage (less secure for my crawler, I know).

The raised error looks like this:

start error
X11 display failed! Will use regular xvfb!
Warning: uc_driver not found. Getting it now:
*** chromedriver to download = 127.0.6533.88 (Latest Stable)
Downloading chromedriver-linux64.zip from:
https://storage.googleapis.com/chrome-for-testing-public/127.0.6533.88/linux64/chromedriver-linux64.zip ...
Download Complete!
Extracting ['chromedriver'] from chromedriver-linux64.zip ...
Unzip Complete!

mdmintz commented 2 months ago

Duplicate of https://github.com/seleniumbase/SeleniumBase/issues/2976#issuecomment-2258996917

Also, you mentioned in https://github.com/seleniumbase/SeleniumBase/issues/2976#issue-2438063433 that you were using python:3.10-slim as your base Docker image. The slim builds are missing required libraries, such as the X11 display and configuration. Use ubuntu:22.04, as in the included Dockerfile: SeleniumBase/Dockerfile. It will run without that error you had, but as mentioned earlier, it probably won't bypass CAPTCHAs due to the Docker fingerprint that's detectable. Regular Linux doesn't leave that fingerprint, so if bypassing CAPTCHAs is important, avoid Docker.
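
To see whether a given base image has the display pieces mentioned above, a quick check can be run inside the container. This is a minimal sketch, not SeleniumBase's own detection logic: the function name `display_prerequisites` is hypothetical, and it assumes that a set `DISPLAY` variable and an `Xvfb` binary on the PATH are reasonable indicators.

```python
import os
import shutil


def display_prerequisites(env=None, which=shutil.which):
    """Report whether a DISPLAY variable and the Xvfb binary are present.

    Hypothetical helper: this only inspects the environment; it does not
    start a display or call into SeleniumBase.
    """
    env = os.environ if env is None else env
    return {
        "DISPLAY set": bool(env.get("DISPLAY")),
        "Xvfb on PATH": which("Xvfb") is not None,
    }


if __name__ == "__main__":
    for check, ok in display_prerequisites().items():
        print(f"{check}: {'yes' if ok else 'no'}")
```

If both checks report "no" in a slim image, that would be consistent with the missing libraries described above, though it does not prove that they are the cause.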

Orbiszeus commented 2 months ago

Thank you for your fast response!!!

Orbiszeus commented 2 months ago

I tried your Dockerfile, but it does not build. I am getting this error:


#42 [40/49] COPY virtualenv_install.sh /SeleniumBase/virtualenv_install.sh
#42 ERROR: failed to calculate checksum of ref b93d1a10-4cbc-48fb-8b07-583372b45b8e::6bct4o0zghcwtb09rv5pokskt: "/virtualenv_install.sh": not found
#43 [37/49] COPY MANIFEST.in /SeleniumBase/MANIFEST.in
#43 ERROR: failed to calculate checksum of ref b93d1a10-4cbc-48fb-8b07-583372b45b8e::6bct4o0zghcwtb09rv5pokskt: "/MANIFEST.in": not found
#44 [38/49] COPY pytest.ini /SeleniumBase/pytest.ini
#44 ERROR: failed to calculate checksum of ref b93d1a10-4cbc-48fb-8b07-583372b45b8e::6bct4o0zghcwtb09rv5pokskt: "/pytest.ini": not found
#45 [ 1/49] FROM docker.io/library/ubuntu:22.04@sha256:340d9b015b194dc6e2a13938944e0d016e57b9679963fdeb9ce021daac430221
#45 resolve docker.io/library/ubuntu:22.04@sha256:340d9b015b194dc6e2a13938944e0d016e57b9679963fdeb9ce021daac430221 done
#45 CANCELED
-----
> [31/49] COPY sbase /SeleniumBase/sbase/:
-----
-----
> [32/49] COPY seleniumbase /SeleniumBase/seleniumbase/:
-----
-----
> [33/49] COPY examples /SeleniumBase/examples/:
-----
-----
> [34/49] COPY integrations /SeleniumBase/integrations/:
-----
-----
> [36/49] COPY setup.py /SeleniumBase/setup.py:
-----
-----
> [37/49] COPY MANIFEST.in /SeleniumBase/MANIFEST.in:
-----
-----
> [38/49] COPY pytest.ini /SeleniumBase/pytest.ini:
-----
-----
> [39/49] COPY setup.cfg /SeleniumBase/setup.cfg:
-----
-----
> [40/49] COPY virtualenv_install.sh /SeleniumBase/virtualenv_install.sh:
-----
Dockerfile:119
-------------------
117 |     COPY pytest.ini /SeleniumBase/pytest.ini
118 |     COPY setup.cfg /SeleniumBase/setup.cfg
119 | >>> COPY virtualenv_install.sh /SeleniumBase/virtualenv_install.sh
120 |     RUN find . -name '*.pyc' -delete
121 |     RUN pip install --upgrade pip setuptools wheel
-------------------
ERROR: failed to solve: failed to compute cache key: failed to calculate checksum of ref b93d1a10-4cbc-48fb-8b07-583372b45b8e::6bct4o0zghcwtb09rv5pokskt: "/virtualenv_install.sh": not found

Orbiszeus commented 2 months ago

Also, before fully integrating your Dockerfile, I changed the base image in mine from Python to Ubuntu. However, I was still getting the message "X11 display failed! Will use regular xvfb!".

mdmintz commented 2 months ago

The file exists if you have the complete fork: SeleniumBase/virtualenv_install.sh

Check to make sure your system isn't hiding .sh files.

It's working from the included Dockerfile:

 => [36/49] COPY setup.py /SeleniumBase/setup.py                              0.0s
 => [37/49] COPY MANIFEST.in /SeleniumBase/MANIFEST.in                        0.0s
 => [38/49] COPY pytest.ini /SeleniumBase/pytest.ini                          0.0s
 => [39/49] COPY setup.cfg /SeleniumBase/setup.cfg                            0.0s
 => [40/49] COPY virtualenv_install.sh /SeleniumBase/virtualenv_install.sh    0.0s
 => [41/49] RUN find . -name '*.pyc' -delete                                  0.5s
 => [42/49] RUN pip install --upgrade pip setuptools wheel                    3.5s
 => [43/49] RUN cd /SeleniumBase && ls && pip install -r requirements.txt --  9.5s
 => [44/49] RUN cd /SeleniumBase && pip install .                             1.7s
 => [45/49] RUN pip install pyautogui                                        11.2s
 => [46/49] RUN seleniumbase get chromedriver --path                          1.8s
 => [47/49] COPY integrations/docker/docker-entrypoint.sh /                   0.0s
 => [48/49] COPY integrations/docker/run_docker_test_in_chrome.sh /           0.0s
 => [49/49] RUN chmod +x *.sh                                                 0.2s
 => exporting to image                                                        0.4s

The Xvfb issue is not occurring when running the examples from that Docker image.
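
The "not found" errors in the earlier build log are what Docker reports when a COPY source is absent from the build context. A minimal sketch for checking a directory before running the build follows. The path list is taken from the COPY steps visible in that log and may not cover everything the Dockerfile copies; the helper name `missing_from_context` is hypothetical.

```python
from pathlib import Path

# COPY sources visible in the build log; the real Dockerfile may copy more.
REQUIRED_PATHS = [
    "sbase",
    "seleniumbase",
    "examples",
    "integrations",
    "setup.py",
    "MANIFEST.in",
    "pytest.ini",
    "setup.cfg",
    "virtualenv_install.sh",
]


def missing_from_context(context_dir, required=REQUIRED_PATHS):
    """Return the required paths that do not exist under context_dir."""
    root = Path(context_dir)
    return [name for name in required if not (root / name).exists()]


if __name__ == "__main__":
    # Assumes the script is run from the intended build context directory.
    missing = missing_from_context(".")
    if missing:
        print("Missing from build context:", ", ".join(missing))
    else:
        print("All listed paths are present.")
```

Running it from the directory passed to the build should list the same files that the failing COPY steps complained about, if an incomplete checkout is the cause.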

Orbiszeus commented 2 months ago

But wait, I have my own project and my own Dockerfile. It would be a bit painful to fork and clone all of SeleniumBase inside my application, which would make it huge. Is there any way to do this without cloning SeleniumBase into my directory?

Orbiszeus commented 2 months ago

What I mean is: when I pip install seleniumbase, am I not getting all of it?

mdmintz commented 2 months ago

The SeleniumBase Dockerfile is meant to be used from a full clone of SeleniumBase. If you choose to use your own Dockerfile from your own SeleniumBase installation, you'll need to manually figure out what you need to change in order to make things work.
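
To see what a pip installation actually placed on disk (as opposed to what a full clone contains), the standard library's `importlib.metadata` can list the installed files. This is a minimal sketch: the helper name `installed_top_level_names` is hypothetical, and the assumption that repository-root files such as the Dockerfile are not part of the pip-installed package is a likely reading of the answer above, not something confirmed here.

```python
from importlib import metadata


def installed_top_level_names(dist_name):
    """Return the sorted top-level names pip recorded for a distribution.

    Returns an empty list if the distribution is not installed or if no
    file record is available.
    """
    try:
        files = metadata.files(dist_name) or []
    except metadata.PackageNotFoundError:
        return []
    return sorted({f.parts[0] for f in files if f.parts})


if __name__ == "__main__":
    for name in installed_top_level_names("seleniumbase"):
        print(name)
```

Comparing that list with the COPY sources in the SeleniumBase Dockerfile shows which files a custom Dockerfile would have to do without or obtain separately.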

Orbiszeus commented 2 months ago

Yes sir thanks!!