ultrafunkamsterdam / undetected-chromedriver

Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / CloudFlare IUAM)
https://github.com/UltrafunkAmsterdam/undetected-chromedriver
GNU General Public License v3.0
9.56k stars 1.14k forks

Undetected Chrome Driver Idle for 2 mins and start running again every time #632

Open harika09 opened 2 years ago

harika09 commented 2 years ago

Hi, I just want to ask about a problem I encounter every time I run the driver for multiple instances. I have 100 tasks; for each one the driver loads the page, logs in with the user's information, and then quits via driver.quit().

It will run smoothly five to eight times, but after that it will idle for 2 to 3 minutes before running again. It will repeat the process, and every time it hits 5, or 8, it will idle again and again. My question is how to stop the idle time? I just want to run the task without the idle of 2 to 3 minutes.

Here is my sample code

import pickle
from multiprocessing import freeze_support
import undetected_chromedriver.v2 as uc
import time
import csv
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import datetime
import colorama
from colorama import Fore, Back, Style
colorama.init(autoreset=True)
freeze_support()

time_out = 5000

def undetectedChrome():

        def titanLOgin(Email, Password):
            # options = webdriver.ChromeOptions()
            driver = uc.Chrome(
                            executable_path=r"../HRB 1.0/system/chromedriver.exe")
            try:
                def titan():

                    now = datetime.datetime.now()
                    try:
                        driver.get('https://titan22.com/account/login?return_url=%2Faccount')
                        print(now.strftime("%Y-%m-%d %H:%M:%S") + ' ' +
                                                '[[ Titan Account Gen ]] ' + Fore.GREEN +
                                                ' Adding user information....')

                        titanEmail = WebDriverWait(driver, time_out).until(EC.presence_of_element_located((By.ID, 'CustomerEmail'))).send_keys(Email)

                        titanPassword = WebDriverWait(driver, time_out).until(EC.presence_of_element_located((By.ID, 'CustomerPassword'))).send_keys(Password)

                        titanSignIn = WebDriverWait(driver, time_out).until(EC.element_to_be_clickable(
                                                    (By.XPATH, "//button[contains(., 'Sign In')]"))).click()

                        if(driver.find_elements(by=By.XPATH, value='//*[@id="customer_login"]/div[1]/ul/li')):
                            print(now.strftime("%Y-%m-%d %H:%M:%S") + ' ' +
                                                    '[[ Titan Account Gen ]] ' + Fore.GREEN +
                                                    ' Invalid Username  / Password....')
                            titanLOgin()

                        print(now.strftime("%Y-%m-%d %H:%M:%S") + ' ' +
                                                    '[[ Titan Account Gen ]] ' + Fore.GREEN +
                                                    ' Loggin Success....')
                    finally:
                        time.sleep(2)
                        driver.quit()

                titan()

            except:
                driver.quit()

        file = open("../Undetected Chrome Driver Test/titan.csv")
        reader = csv.reader(file, delimiter=',')
        next(reader)
        for row in reader:
            titanLOgin(row[0], row[1])
        print('All entries successfully entered!')

if __name__ == '__main__':
    undetectedChrome()


sebdelsol commented 2 years ago
harika09 commented 2 years ago
  • When your login fails your code is recursive, that's really not a good idea.
  • You should use a shorter timeout: yours is 5000 seconds, which is more than an hour, so if the page load fails for whatever reason you're stuck with an idle driver that takes a lot of memory. Please use selenium.common.exceptions.TimeoutException instead.
  • You don't show how you launch your threads or processes, so it's very hard to guess what you're doing wrong.

I'm just running a single thread. I will send the whole folder here. Please check; it will help me a lot. Thank you. You can see that it runs smoothly for the 1st to 4th tasks, but after that it stops for 2-5 mins and then runs the tasks again.

LINK
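
The first two points above (no recursion on failure, a short timeout handled through an exception) can be sketched as a bounded retry loop. This is a minimal sketch under assumptions, not code from the thread: `run_with_retries`, `max_attempts`, and `retry_on` are hypothetical names. With Selenium one would presumably pass `retry_on=(TimeoutException,)` from `selenium.common.exceptions` and use a `WebDriverWait` timeout of a few seconds inside `task`.

```python
def run_with_retries(task, max_attempts=3, retry_on=(Exception,)):
    """Call task() up to max_attempts times instead of recursing on failure.

    Hypothetical helper: `task` is any zero-argument callable, for example
    a function that loads the login page and waits for an element.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except retry_on as exc:
            # Remember the failure and try again in the same loop.
            last_error = exc
            print(f"attempt {attempt}/{max_attempts} failed: {exc!r}")
    # Every attempt failed: surface the last error to the caller.
    raise last_error
```

Because the loop is bounded, a page that never loads costs at most `max_attempts` short timeouts rather than an hour-long wait or an unbounded call stack.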

henzycuong1 commented 2 years ago

I got the same problem

PePinodemrs commented 2 years ago

I think the problem occurs when the driver has to quit, which takes time, and also when it launches. It only shows up when there are a lot of tasks. I suspect it happens when the driver is downloaded from the Google APIs for Chrome: not the download time itself, but rather some blocking at the Google API level.

harika09 commented 2 years ago

It happens after every 5 or 6 tasks. I don't know whether it's the driver or the saving of the Google login session.

Lanshuns commented 2 years ago

Same here. I'm running a while loop and it stops every 5 or 6 tasks.

Lanshuns commented 2 years ago

@harika09, @henzycuong1 please if any of you guys fixed this issue let me know

harika09 commented 2 years ago

@harika09, @henzycuong1 please if any of you guys fixed this issue let me know

Nope still having this issue.

sebdelsol commented 2 years ago

Nobody has provided minimal code to reproduce this so-called issue, and those comments are moot. At the moment the only actual occurrence of this "bug" is this weird recursive mess you'll find above: it's surprising it even works.

These kinds of "bugs" could be bad configurations, huge Selenium timeouts, drivers not properly disposed of, race conditions in a multi-threaded environment, slow proxies, rate limiting, or bot-behavior detection by a site throwing your script into black holes to prevent naive scraping attempts... you name it, there are so many things that can go wrong with a complex package like Selenium.

Please try to stick to best practices for bug reporting, and maybe we'll find out there's something to be fixed that'll benefit the community.
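
One way to narrow down the causes listed above is to time each step of a task separately, so a report can say whether the stall sits in launching the driver, loading the page, or quitting. The following is a minimal sketch under assumptions: `time_steps` and `report` are hypothetical helpers, and the step labels are whatever the caller chooses.

```python
import time


def time_steps(steps):
    """Run (label, callable) pairs in order and return (label, seconds) pairs."""
    timings = []
    for label, step in steps:
        start = time.perf_counter()
        step()
        timings.append((label, time.perf_counter() - start))
    return timings


def report(iteration, timings):
    """Print one line per step so a slow step stands out in the log."""
    for label, seconds in timings:
        print(f"task {iteration:>3} {label:<8} {seconds:8.2f}s")
```

Each task would pass its own callables, for example one that creates the driver, one that calls `driver.get(...)`, and one that calls `driver.quit()`. A log showing that only one of those steps jumps from seconds to minutes on the sixth task would make the report much easier to act on.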

PePinodemrs commented 2 years ago

It's just simple code like uc.Chrome(). When I start 2-3 Chrome instances at the same time it's okay, but with more they sometimes get stuck.

sebdelsol commented 2 years ago

sure, but how do you "start" several Chromes at the same time ? Please show some code to reproduce the issue, I can't reproduce this on my side hence the suspicion you're doing something weird.

Lanshuns commented 2 years ago

Nobody has provided minimal code to reproduce this so-called issue, and those comments are moot. At the moment the only actual occurrence of this "bug" is this weird recursive mess you'll find above: it's surprising it even works.

These kinds of "bugs" could be bad configurations, huge Selenium timeouts, drivers not properly disposed of, race conditions in a multi-threaded environment, slow proxies, rate limiting, or bot-behavior detection by a site throwing your script into black holes to prevent naive scraping attempts... you name it, there are so many things that can go wrong with a complex package like Selenium.

Please try to stick to best practices for bug reporting, and maybe we'll find out there's something to be fixed that'll benefit the community.

What about @harika09's code? I've downloaded his code and set the timeout to 5 instead, and the issue is still the same. He uses no proxies or anything, just simple code as you can see.

PePinodemrs commented 2 years ago

sure, but how do you "start" several Chromes at the same time ? Please show some code to reproduce the issue, I can't reproduce this on my side hence the suspicion you're doing something weird.

With a simple threading or multiprocessing

PePinodemrs commented 2 years ago

But the famous bug is present for versions 97 to 103 and not from 92 to 96

sebdelsol commented 2 years ago

sure, but how do you "start" several Chromes at the same time ? Please show some code to reproduce the issue, I can't reproduce this on my side hence the suspicion you're doing something weird.

With a simple threading or multiprocessing

Anyway, without code I can't do much. I do understand that there is a potential problem, but how can it be solved with so few clues?

harika09 commented 2 years ago

@sebdelsol Here is my sample code. I'm just loading the Google login page here. I have 13 tasks on my GOOGLEACCOUNT.csv and after 6 tasks, it will stop and re-run again after a few minutes.

from multiprocessing import Process, freeze_support
import undetected_chromedriver as uc
from selenium.webdriver.chrome.service import Service
import time
import csv
freeze_support()

def googleLogin(email, password):
    options = uc.ChromeOptions()
    driver = uc.Chrome(use_subprocess=True,service=Service("/chromedriver.exe"),options=options)
    driver.get('https://accounts.google.com/ServiceLogin/identifier?service=mail&passive=1209600&osid=1&continue=https%3A%2F%2Fmail.google.com%2Fmail%2Fu%2F0%2F&followup=https%3A%2F%2Fmail.google.com%2Fmail%2Fu%2F0%2F&emr=1&flowName=GlifWebSignIn&flowEntry=ServiceLogin')
    print('Loaded')
    time.sleep(1)
    driver.quit()

file = open("GOOGLEACCOUNT.csv")
reader = csv.reader(file, delimiter=',')
next(reader)

for data in reader:
    googleLogin(data[0], data[1])

if __name__ == "__main__":
    googleLogin()

PePinodemrs commented 2 years ago

from threading import Thread

import undetected_chromedriver as uc
from undetected_chromedriver import ChromeOptions
from selenium.webdriver.support import expected_conditions
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By

def name():
    while True:
        options = ChromeOptions()
        options.add_argument('--load-extension=C:/adblock')
        # my_path is not defined in the posted snippet
        driver = uc.Chrome(version_main=101, browser_executable_path = my_path, options = options)
        driver.get('https://www.ecoledirecte.com/login')
        wait = WebDriverWait(driver, 5)
        el = wait.until(expected_conditions.element_to_be_clickable((By.XPATH, '//*[@id="username"]')))
        el.click()

        # ..........other things but not interesting

        driver.close()
        driver.quit()

if __name__ == '__main__':
    for i in range(10):
        Thread(target=name, args=()).start()

My problem is that when I launch the 10 drivers in an infinite loop, after only three rounds of the loop I start to have problems on about half of the drivers. It is not the same drivers each round, but roughly half are affected each time. The driver launches, loads the page, and then stops at the login field it must click. It waits for a long time (from 0 to a few minutes) before skipping it without raising any error and moving on to the rest of the code. It works very well on versions 92 to 96, but the problem is present from 97 to 103.
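
Since 2-3 simultaneous drivers were reported earlier as fine and larger numbers as unreliable, one way to check whether the concurrency level is the deciding factor is to cap the number of jobs running at once. This is a minimal sketch under assumptions, not a confirmed fix: `run_bounded` is a hypothetical helper, and `job` stands for a function that performs one complete driver task.

```python
from concurrent.futures import ThreadPoolExecutor


def run_bounded(job, items, max_workers=3):
    """Run job(item) for every item with at most max_workers running at once."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order and re-raises the first job exception
        # when its result is consumed.
        return list(pool.map(job, items))
```

Running the same workload with `max_workers` set to 2, 5, and 10 would show whether the stalls appear only above a certain number of simultaneous drivers.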

Lanshuns commented 2 years ago

@sebdelsol or please tell us what chrome version you use or your undetected-chromedriver configuration

sebdelsol commented 2 years ago

Always the last one, why bother ?

harika09 commented 2 years ago

Always the last one, why bother ?

Did you already check this sir? https://github.com/ultrafunkamsterdam/undetected-chromedriver/issues/632#issuecomment-1152825337

PePinodemrs commented 2 years ago

Always the last one, why bother ?

Did you already check this sir? #632 (comment)

I have a small problem like that. I think it happens while undetected-chromedriver downloads the driver from the source, and also while quitting the driver when there are a lot of tasks.

PePinodemrs commented 2 years ago

Always the last one, why bother ?

It's to have a lot of versions available so I can change the user agent and avoid detection, because if I change it manually it's more easily detected.

sebdelsol commented 2 years ago

I've tried @PePinodemrs code...

Some notes:

import threading

from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

import undetected_chromedriver as uc

def job(running):
    while running.is_set():
        try:
            options = uc.ChromeOptions()
            driver = uc.Chrome(options=options)
            driver.get("https://www.ecoledirecte.com/login")
            wait = WebDriverWait(driver, 5)
            elt_xpath = '//*[@id="username"]'
            elt = wait.until(EC.element_to_be_clickable((By.XPATH, elt_xpath)))
            elt.click()

            # ..........other things but not interesting
            # AND probably where it's broken hence the "delays"

        except TimeoutException:
            print("something's broken")
        finally:
            driver.quit()

if __name__ == "__main__":
    running = threading.Event()
    threads = [threading.Thread(target=job, args=(running,)) for _ in range(10)]

    running.set()
    for thread in threads:
        thread.start()

    input("hit enter to quit this weird script")

    running.clear()
    for thread in threads:
        thread.join()

PePinodemrs commented 2 years ago

i have to quit and reload driver because i have to clear the cache

sebdelsol commented 2 years ago

Are you sure deleting cookies is not enough ? driver.delete_all_cookies()

PePinodemrs commented 2 years ago

i will retry but i think it's not

harika09 commented 2 years ago

Are you sure deleting cookies is not enough ? driver.delete_all_cookies()

tried this but still same problem

henzycuong1 commented 2 years ago

everyone here used multithread the uc on the hdd disk, didn't it?

sebdelsol commented 2 years ago

Now I've tried @harika09 code...

Some notes:

sebdelsol commented 2 years ago

everyone here used multithread the uc on the hdd disk, didn't it?

I don't understand what that means.

PePinodemrs commented 2 years ago

Are you sure deleting cookies is not enough ? driver.delete_all_cookies()

tried this but still same problem

Yes, but on which website?

harika09 commented 2 years ago

Now I've tried @harika09 code...

  • It works like a charm too, there's no "delays"
  • Your code is broken too... I had to fix it.
  • Please learn how to format code on Github, check here.

Some notes:

  • Don't use use_subprocess=True but protect your main entry point instead, please check here.
  • freeze_support() is supposed to be used behind your protected main entry point. Please check here. Anyway I don't see why you need it.
  • Why do you create a Selenium Service ? This won't be used since extra **kwargs are disposed when the chromedriver is initialized !
  • I don't understand why you need to quit and recreate a driver each iteration, a driver can easily be reused...

undetected chrome driver won't open if I don't add use_subprocess=True when compiled using Pyinstaller

I need to close and re-open the driver to save google account session using pickle

can you share your code with us so we can compare?

sebdelsol commented 2 years ago

undetected chrome driver won't open if I don't add use_subprocess=True when compiled using Pyinstaller

No you don't : please check here... I already gave you the link to the relevant documentation and you didn't bother to read it, did you ?

I need to close and re-open the driver to save google account session using pickle

It doesn't make any sense either.

can you share your code with us so we can compare?


import time
from multiprocessing import freeze_support

import undetected_chromedriver as uc

def google_login():
    options = uc.ChromeOptions()
    driver = uc.Chrome(options=options)
    driver.get(
        "https://accounts.google.com/ServiceLogin/identifier?service=mail&passive=1209600&osid=1&continue=https%3A%2F%2Fmail.google.com%2Fmail%2Fu%2F0%2F&followup=https%3A%2F%2Fmail.google.com%2Fmail%2Fu%2F0%2F&emr=1&flowName=GlifWebSignIn&flowEntry=ServiceLogin"
    )
    print("Loaded")
    time.sleep(1)
    driver.quit()

if __name__ == "__main__":
    freeze_support()
    for _ in range(20):
        google_login()

Please read the relevant Python documentation, read the Selenium documentation, read this package's documentation and the main classes' docstrings, then you'll see why nothing you claim makes any sense and why you should close this non issue.

harika09 commented 2 years ago

undetected chrome driver won't open if I don't add use_subprocess=True when compiled using Pyinstaller

No you don't : please check here... I already gave you the link to the relevant documentation and you didn't bother to read it, did you ?

I need to close and re-open the driver to save google account session using pickle

It doesn't make any sense either.

can you share your code with us so we can compare?

import time
from multiprocessing import freeze_support

import undetected_chromedriver as uc

def google_login():
    options = uc.ChromeOptions()
    driver = uc.Chrome(options=options)
    driver.get(
        "https://accounts.google.com/ServiceLogin/identifier?service=mail&passive=1209600&osid=1&continue=https%3A%2F%2Fmail.google.com%2Fmail%2Fu%2F0%2F&followup=https%3A%2F%2Fmail.google.com%2Fmail%2Fu%2F0%2F&emr=1&flowName=GlifWebSignIn&flowEntry=ServiceLogin"
    )
    print("Loaded")
    time.sleep(1)
    driver.quit()

if __name__ == "__main__":
    freeze_support()
    for _ in range(20):
        google_login()

Please learn Python, read Selenium documentation, read this package documentation and the main classes' docstrings, then you'll see why nothing you claim makes any sense and why you should close this non issue.

I'm currently reading it and will test it tomorrow. Thank you, sir.

I tried your code, but it still stops when it hits task 6 and waits for 2-3 mins. I got this message: selenium.common.exceptions.WebDriverException: Message: chrome not reachable (Session info: chrome=102.0.5005.115)

Lanshuns commented 2 years ago

Now I've tried @harika09 code...

  • It works like a charm too, there's no "delays"
  • Your code is broken too... I had to fix it.
  • Please learn how to format code on Github, check here.

Some notes:

  • Don't use use_subprocess=True but protect your main entry point instead, please check here.
  • freeze_support() is supposed to be used behind your protected main entry point. Please check here. Anyway I don't see why you need it.
  • Why do you create a Selenium Service ? This won't be used since extra **kwargs are disposed when the chromedriver is initialized !
  • I don't understand why you need to quit and recreate a driver each iteration, a driver can easily be reused...

Hey, sorry for bothering you again, but what did you mean by "a driver can easily be reused"? I didn't really get it. Can you explain this point? Thanks.

sebdelsol commented 2 years ago

@Stainpy : you don't quit() it but get() another url, since quitting and recreating a driver is quite costly.
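
Reusing a driver as described above can be sketched as follows. This is a minimal sketch under assumptions: `run_tasks_with_one_driver`, `make_driver`, and `handle_page` are hypothetical names, and `make_driver` would presumably be something like `lambda: uc.Chrome()`. Only `get()`, `delete_all_cookies()`, and `quit()` are called on the driver. Whether clearing cookies is enough to isolate one task from the next is disputed earlier in this thread, so treat that line as optional.

```python
def run_tasks_with_one_driver(make_driver, urls, handle_page):
    """Create one driver, visit every URL with it, and quit once at the end."""
    driver = make_driver()
    results = []
    try:
        for url in urls:
            driver.get(url)
            results.append(handle_page(driver))
            # Drop the cookies visible to the current page so the next task
            # does not inherit this session.
            driver.delete_all_cookies()
    finally:
        # Quit exactly once, even if a task raised.
        driver.quit()
    return results
```

The browser is launched and torn down once for the whole batch, which avoids paying the startup and shutdown cost on every task.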

Lanshuns commented 2 years ago

@harika09 hey, try use driver.close() instead of driver.quit()


sebdelsol commented 2 years ago

@Stainpy : that's really not a good idea, if you only close your driver you'll end up with its process still running, you should quit it if you want everything to be cleaned up.
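
In Selenium, `close()` closes only the current window, while `quit()` ends the whole session. One way to make sure `quit()` always runs, even when a task raises, is a small context manager. This is a minimal sketch under assumptions: `managed_driver` and `make_driver` are hypothetical names, and `make_driver` is any zero-argument callable that returns a driver.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(make_driver):
    """Yield a driver and always call quit() on it when the block exits."""
    driver = make_driver()
    try:
        yield driver
    finally:
        # Runs on normal exit and on exceptions alike.
        driver.quit()
```

A task body would then be written as `with managed_driver(factory) as driver: ...`, with no separate cleanup call to forget.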

Lanshuns commented 2 years ago

@Stainpy : that's really not a good idea, if you only close your driver you'll end up with its process still running, you should quit it if you want everything to be cleaned up.

Indeed. I now use quit only if an unexpected error happens or after some tasks to clean up. I will use this as a temporary solution until I find out how to reuse the driver.