seleniumbase / SeleniumBase

📊 Python's all-in-one framework for web crawling, scraping, testing, and reporting. Supports pytest. UC Mode provides stealth. Includes many tools.
https://seleniumbase.io
MIT License

Using uc option with proxy in multiprocessing repeats same ip #2223

Closed FranciscoPalomares closed 11 months ago

FranciscoPalomares commented 11 months ago

Related to https://github.com/seleniumbase/SeleniumBase/issues/2101

With multi_proxy=True and uc_subprocess=True, two different drivers in UC Mode end up with the same IP. Thanks.

mdmintz commented 11 months ago

This is working for me with valid proxies:

from parameterized import parameterized
from seleniumbase import BaseCase
BaseCase.main(__name__, __file__, "-n3")

class ProxyTests(BaseCase):
    @parameterized.expand(
        [
            ["host1:port1"],
            ["host2:port2"],
            ["host3:port3"],
        ]
    )
    def test_multiple_proxies(self, proxy_string):
        self.get_new_driver(
            undetectable=True, proxy=proxy_string, multi_proxy=True
        )
        self.driver.get("https://browserleaks.com/webrtc")
        self.sleep(30)

[Screenshot: multi_webrtc_checks]

FranciscoPalomares commented 11 months ago

Script failed for me: argparse.ArgumentError: argument --browser: conflicting option string: --browser

Have you tried proxies with authentication? They do not work for me.

They did work in the two previous versions.

Thanks

mdmintz commented 11 months ago

argparse.ArgumentError: argument --browser: conflicting option string: --browser means you installed another pytest plugin that is conflicting with seleniumbase. You'll need to uninstall the other plugin for seleniumbase to work. (Likely you have pytest-playwright installed, which has the same option.)
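One way to look for such a conflict is to list every installed distribution that registers a pytest plugin. pytest discovers third-party plugins through the `pytest11` entry-point group, so anything in that group is loaded automatically and may add its own command-line options. The following is a minimal sketch; the helper name `installed_pytest_plugins` is hypothetical and is not part of SeleniumBase or pytest.

```python
from importlib.metadata import distributions


def installed_pytest_plugins():
    """Return sorted names of installed distributions that register a pytest plugin."""
    names = set()
    for dist in distributions():
        for entry_point in dist.entry_points:
            # "pytest11" is the entry-point group pytest scans for plugins.
            if entry_point.group == "pytest11":
                name = dist.metadata["Name"]
                if name:
                    names.add(name)
    return sorted(names)


if __name__ == "__main__":
    for plugin_name in installed_pytest_plugins():
        print(plugin_name)
```

If `pytest-playwright` shows up in the output, uninstalling it with `pip uninstall pytest-playwright` should remove the duplicate `--browser` option.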

Authenticated proxies are working normally for me.

FranciscoPalomares commented 11 months ago

It works for me with undetectable=False:

driver = Driver(proxy=proxy, headless=headless, multi_proxy=True)

It does not work with undetectable=True:

driver = Driver(uc=True, proxy=proxy, headless=headless, multi_proxy=True, uc_subprocess=True)

mdmintz commented 11 months ago

What part isn't working? The full script is working for me.

FranciscoPalomares commented 11 months ago

Solved with pip install seleniumbase --upgrade --force

Sorry

FranciscoPalomares commented 11 months ago

[Image] The problem came back today, even after running pip install seleniumbase --upgrade --force again.

mdmintz commented 11 months ago

@FranciscoPalomares It looks like you didn't add in the proxy=PROXY_STRING option, such as in this example: https://github.com/seleniumbase/SeleniumBase/issues/2223#issuecomment-1784140559 It's working for me, so perhaps there's an issue with your code. I can't debug unless you show me some code and your pytest run command.

FranciscoPalomares commented 11 months ago

I get an error with pyinstaller: a directory named proxy_extdir* is generated. What is the purpose of this folder, and how can I delete it?

mdmintz commented 11 months ago

That's how authenticated proxy in Chrome works: https://stackoverflow.com/a/35293284/7058266 (If you don't want the temporary folder, don't use authenticated proxy.) More info about that and pyinstaller issues: https://github.com/seleniumbase/SeleniumBase/issues/2028#issuecomment-1692451570

FranciscoPalomares commented 11 months ago

There should be an option so that when .quit() is called or the driver is deleted, this temporary folder is deleted automatically.

mdmintz commented 11 months ago

That temporary proxy folder is deleted automatically at the end of a pytest run with SeleniumBase. Multiple tests and drivers could use it.

FranciscoPalomares commented 11 months ago

With this code, run without pytest, the proxy folder is not deleted:

import concurrent.futures
from multiprocessing import freeze_support

from seleniumbase import Driver

proxies = ["proxy:....", "proxy:....", "proxy:...."]


def logic(proxy):
    # Each worker process opens its own UC Mode driver with its own proxy.
    driver = Driver(
        multi_proxy=True, undetected=True, proxy=proxy,
        uc_subprocess=True, ad_block_on=False,
    )
    driver.get("https://browserleaks.com/webrtc")
    driver.sleep(30)
    driver.quit()


if __name__ == "__main__":
    freeze_support()
    with concurrent.futures.ProcessPoolExecutor(max_workers=2) as executor:
        future_proc = {executor.submit(logic, proxy): proxy for proxy in proxies}
        for future in concurrent.futures.as_completed(future_proc):
            pass

mdmintz commented 11 months ago

That's why you have to use multithreading with pytest. There could be multiple threads using the same proxy config folder. The threads can't talk to each other, so neither one knows when it's safe to delete the folder. However, with pytest managing the threads, pytest knows when all threads have completed, and can safely delete the temporary proxy config folder at the end of the tests. That's one of many reasons why multithreading is only supported via pytest (using pytest-xdist).
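The coordination problem described here can be illustrated without SeleniumBase. The following is a minimal sketch, not how SeleniumBase implements its cleanup: workers share one temporary folder and never delete it, and only the parent, which knows when every worker has finished, removes it. The names `worker`, `run_with_shared_dir`, and the `shared_proxy_cfg_` prefix are hypothetical.

```python
import concurrent.futures
import os
import shutil
import tempfile


def worker(shared_dir, label):
    """Use the shared folder; never delete it, since other workers may need it."""
    path = os.path.join(shared_dir, f"{label}.txt")
    with open(path, "w") as handle:
        handle.write(label)
    return os.path.exists(path)


def run_with_shared_dir(labels, max_workers=2):
    """Run workers against one shared folder, then clean up in the parent."""
    shared_dir = tempfile.mkdtemp(prefix="shared_proxy_cfg_")
    try:
        with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
            futures = [pool.submit(worker, shared_dir, label) for label in labels]
            results = [future.result() for future in futures]
    finally:
        # Leaving the "with" block means every worker has finished,
        # so this is the only point where deletion is known to be safe.
        shutil.rmtree(shared_dir, ignore_errors=True)
    return shared_dir, results


if __name__ == "__main__":
    folder, outcomes = run_with_shared_dir(["a", "b", "c"])
    print(outcomes, os.path.exists(folder))
```

`ProcessPoolExecutor` exposes the same `submit` interface, so the same parent-side cleanup could be applied to a script like the one above once all futures have completed.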