SeleniumHQ / selenium

A browser automation framework and ecosystem.
https://selenium.dev
Apache License 2.0

[🐛 Bug]: (chromedriver.close Exception) I would like to say that I was slightly mistaken. The error occurred at exit, not while reloading the xpath... it was an intentional close of chromedriver through the root cd (chrome driver) object, since my project had finished its work. And! When I came back to the computer the browser was open, but the site object had not been obtained; I don't know why that happened #13813

Closed Den4kCreator closed 5 months ago

Den4kCreator commented 5 months ago

What happened?

Greetings! I got an error while Selenium was running. My root_chromedriver did not change the .close method; it used the one from Selenium's Chrome object. My project restarts the driver several times because, for some reason, chromedriver cannot find XPath elements when the browser is hidden (minimized), so the window has to be restored; I automated this by simply restarting Selenium. The last thing in the logs was an action to find an attribute on an element of an OLX page, for a listing that is a service (the target) to purchase. It has to be checked whether the service can be bought, and that is what the attribute lookup is for. The driver was restarted in an attempt to reload all the XPaths, and as I understand it the exit simply went wrong. (Of course, I could have done it more simply and used the nested wrapper function of root_chromedriver to restart it, but I did not think of that here; that is not the main point right now.) (Note: I have hidden most of the data for confidentiality.)

(I apologize in advance for the messy code and incorrect writing, or rather for the lack of professionalism.)

ad_manager.py (function)

def _wait_service_is_not_buyed(self, service_id: str) -> bool:
        ''' Refresh the site while service xpath 
        to have disabled property ("false" , also is_buy flag). (return status of not_buyed flag) (wait for service status to buy)
        OR return True if service id in the unique list of services
        :timeout: in sec
        :delay: in sec'''

        if service_id in [AdPromoteServicesNames.PUSHUP]:
            return True

        def get_service_status(service_id) -> bool:
            xpath_group = AdPromoteServicesXPATH.get_group(service_id)
            statuses = [e.get_attribute('disabled') for e in \
                              self.rootdriver._wait(timeout=25).until(EC.visibility_of_all_elements_located((By.XPATH, xpath_group))) \
                              if service_id in e.get_property('outerHTML')]
            if len(statuses) != 1:  # statuses it is status of "target service", 1 target service, using service_id
                raise ValueError('Find more service statuses')
            return bool(statuses[0])

        timeout = AdManagerWaitServiceIsNotBuyed.TIMEOUT
        delay = AdManagerWaitServiceIsNotBuyed.DELAY

        # check services which cannot to be inactive
        if service_id in [AdPromoteServicesNames.PUSHUP]:
            LOGGER.info('find service id in the list for one services (pushup e.t.c...)')
            return True

        # load services page
        promote_page = f"https://..." # hide
        if self.rootdriver.current_url != promote_page:
            LOGGER.info(f'curr page - {self.rootdriver.current_url} is not equal promote page - {promote_page}, get...')
            self.rootdriver.get(promote_page)

        # get service xpath using constants

        # Calculate max attemps for site refresh
        max_attempts = timeout // delay if timeout >= delay else 1
        error_attempts = 0

        LOGGER.info(f'define max attempts - {max_attempts}')

        for _ in range(max_attempts):
            try:
                LOGGER.info('try to get disabled attribute in the service xpath')
                is_buy = get_service_status(service_id)
                LOGGER.info(f'get is buy status, value - {is_buy}')
            except TimeoutException:
                LOGGER.error(f'timeout exception for attr include method')
                if error_attempts % 2 == 0:
                    self.rootdriver.restart_driver()
                if error_attempts > 6:
                    LOGGER.error(f'limit of error attempts for find service xpath')
                    raise TimeoutError ('Limit of error attempts')
                error_attempts += 1   # increment errors attempts
                self.rootdriver.refresh()
                continue
            else:
                error_attempts = 0
            if not is_buy:
                return True
            LOGGER.info(f'is buy has flag True, service is byed for now, sleep')
            sleep(delay)
            self.rootdriver.refresh()
        return False

Error (from the logs)

Traceback (most recent call last):
  File "D:\programming projects\python\olx scripts\excel\main.py", line 119, in <module>
    main()
  File "D:\programming projects\python\olx scripts\excel\main.py", line 79, in main
    with opb.olx_bot.OLXBotManager(rootdriver=rootdriver) as olx_bot_manager:
  File "D:\programming projects\python\olx scripts\excel\..\olx_promotions_bot\olx_bot.py", line 156, in __exit__
    self.exit()
  File "D:\programming projects\python\olx scripts\excel\..\olx_promotions_bot\olx_bot.py", line 148, in exit
    self.rootdriver.close()
  File "C:\Users\Tkach\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 459, in close
    self.execute(Command.CLOSE)
  File "C:\Users\Tkach\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 346, in execute
    response = self.command_executor.execute(driver_command, params)
  File "C:\Users\Tkach\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\remote\remote_connection.py", line 293, in execute
    path = string.Template(path_string).substitute(params)
  File "C:\Users\Tkach\AppData\Local\Programs\Python\Python310\lib\string.py", line 121, in substitute
    return self.pattern.sub(convert, self.template)
  File "C:\Users\Tkach\AppData\Local\Programs\Python\Python310\lib\string.py", line 114, in convert
    return str(mapping[named])
TypeError: 'NoneType' object is not subscriptable
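
The last frames of the traceback can be reproduced with the standard library alone. The sketch below assumes, without confirmation from the thread, that the command parameters reached the path-substitution step as `None` (for example, because no session ID was set on the driver at that moment). The template string and the helper names `build_path` and `try_build` are illustrative and are not taken from Selenium.

```python
import string

# Illustrative path template (an assumption); it only needs one $placeholder.
PATH_TEMPLATE = "/session/$sessionId/window"


def build_path(params):
    """Fill the template the same way the last traceback frames do."""
    return string.Template(PATH_TEMPLATE).substitute(params)


def try_build(params):
    """Return the substituted path, or the error text if substitution fails."""
    try:
        return build_path(params)
    except TypeError as exc:
        return f"TypeError: {exc}"


# A mapping with the placeholder key substitutes normally.
ok = try_build({"sessionId": "abc123"})

# Passing None as the mapping fails inside Template.substitute,
# because the placeholder lookup indexes into None.
broken = try_build(None)

print(ok)
print(broken)
```

If that assumption holds, the failure happens while the request URL is being built, before anything is sent to chromedriver.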

Code of my root chrome driver

import os
from datetime import datetime
# import threading
from time import sleep
import logging

from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.common.exceptions import (NoSuchElementException, TimeoutException, StaleElementReferenceException)
from selenium.webdriver import Chrome as ChromeDriver
from selenium.webdriver.chrome.service import Service as ChromeService
from selenium.webdriver.chrome.options import Options as ChromeOptions
# from selenium.webdriver.common.action_chains import ActionChains

ROOT_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
# create dir for saved pages dir
SAVED_PAGES_DIR = os.path.join(ROOT_DIR, 'saved_pages')
os.makedirs(SAVED_PAGES_DIR, exist_ok=True)

LOGGER = logging.getLogger(__name__)

class RootChromeDriver(ChromeDriver):
    ''' object for settings a chrome driver '''
    def __init__(self, profile_name=None):
        self.profile_name = profile_name
        self.profile_path = None
        if self.profile_name:
            self.profile_path = os.getenv('LOCALAPPDATA') + f'\\Google\\Chrome\\User Data\\{self.profile_name}' 
        self.opts = ChromeOptions()
        self._set_my_config()

    # def _path_init(self):
        # self.screenshots_path = os.path.join(os.getcwd(), 'chromedriver_screenshots')
        # self.downloads_path = os.path.join(os.getcwd(), 'chromedriver_downloads')

    def click_to_action_btn(self, btn_xpath: str) -> bool:
        ''' use restart_wrapper for wait button and make click '''

        # wait and click
        LOGGER.info(f'Attempting to click action button | XPath: {btn_xpath} | Current URL: {self.current_url}')

        self._restart_wrapper('until', EC.element_to_be_clickable((By.XPATH, btn_xpath)), obj=self._wait(timeout=15)).click()

        # self._wait(timeout=15).until(
        #     EC.element_to_be_clickable(
        #         (By.XPATH, self.target_ad.xpath + btn_xpath))).click()
        LOGGER.info('Successfully clicked action button')
        return True

    def move_to_elem(self, elem, retry_limit: int=7):
        ''' scroll, move to elem and return him '''

        for _ in range(retry_limit):
            try:
                for _ in range(3):  # scroll to elem
                    self.execute_script("arguments[0].scrollIntoView();", elem)
                    sleep(0.5)
                LOGGER.info('scrolled to elem, check visibility...')
                # try to check visible elem
                self._wait(timeout=7).until(
                    EC.visibility_of(elem)
                )
            except (StaleElementReferenceException, NoSuchElementException):
                LOGGER.exception('Element has been changed on page, break...')
                return

            else:
                break
        else:
            raise TimeoutException('Error moving to elem')   # if we call this function from restart wrapper, we except this error

        LOGGER.info(f'success move to elem - {self.current_url}, elem_text - ' + elem.text.replace("\n", " | "))
        return elem

    def _wait(self, timeout: int=15):
        ''' WebDriverWait 
        :timeout: in sec'''
        return WebDriverWait(self, timeout=timeout)

    def _wait_presence_xpath_elem(self, xpath: str, timeout: int=20):
        ''' WebDriverWait, with EC.presence_of_element_located

        :timeout: in sec'''
        return self._wait(timeout=timeout).until(
                    EC.presence_of_element_located((By.XPATH, xpath)))

    def _restart_wrapper(self, func_name, *args, obj=None, call_retry: int=3, **kwargs):
        ''' 
        restart driver if function raised TimeoutException 
        | call TimeoutError if call_retry limit

        :call_retry: int, keyword arg, retry_limit to retry calling function
        '''

        # find function in the class object
        if obj is None:
            obj = self
        func = getattr(obj, func_name)

        if not func:
            raise AttributeError(f'Function - {func_name} is not found in RootChromeDriver object')

        # run function
        for attempt in range(1, call_retry+1):
            try:
                LOGGER.info(f'Attempt {attempt}/{call_retry}: Try to run function {func_name} with args={args}, kwargs={kwargs}')
                result = func(*args, **kwargs)
            except TimeoutException:
                LOGGER.info(f'TimeoutException occurred, Restarting driver...')
                self.restart_driver()
            else:
                LOGGER.info(f'Function {func_name} run successfully')
                return result
        LOGGER.error(f'TimeoutException occurred after {call_retry} attempts for function - {func_name} - args: {args}, kwargs: {kwargs}')
        raise TimeoutError(f'TimeoutException Limit for function - {func_name} - args: {args}, kwargs: {kwargs}')

    def _set_my_config(self) -> None:
        ''' adding the params to self.opts object '''
        # self.opts.add_argument("--headless=new")
        self.opts.add_argument("--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36")
        self.opts.add_argument("--disable-notifications")
        self.opts.add_argument("--no-sandbox")
        self.opts.add_argument('--ignore-certificate-errors')
        self.opts.add_argument("--disable-gpu")
        self.opts.add_argument("--disable-blink-features=AutomationControlled")
        self.opts.add_argument("--disable-extensions")
        self.opts.add_argument("--disable-popup-blocking")
        self.opts.add_argument("--disable-plugins-discovery")
        # self.opts.add_argument('--disable-application-cache')
        # self.opts.add_argument('--enable-logging')
        # self.opts.add_experimental_option('prefs', {'download.default_directory': self.downloads_path})
        self.opts.add_experimental_option("excludeSwitches", ["enable-logging"])
        self.opts.add_experimental_option('useAutomationExtension', False)

    def save_current_page(self, fname):
        # if os.path.exists(fname):
        # fdir = os.path.dirname(fname)
        # fname = os.path.basename(fname)
        # rename file, add current datetime for unique name
        new_fname = datetime.now().strftime('%Y-%m-%d %H-%M-%S-%f') + fname
        # rewrite variable to new path
        # fname = os.path.join(fdir, new_fname)
        html = self.page_source
        with open(os.path.join(SAVED_PAGES_DIR, new_fname), 'w', encoding='utf-8') as f:
            f.write(html)
        return new_fname

    def get_rootdriver(self):
        # set extensions

        # additional arguments
        if self.profile_path:
            if not os.path.exists(self.profile_path):
                print('[!]Warning profile path doesnt exists')
            self.opts.add_argument(f"--user-data-dir={self.profile_path}")

        self.opts.add_argument("--force-device-scale-factor=0.75")
        super().__init__(options=self.opts, service=ChromeService(ChromeDriverManager().install()))

        self.implicitly_wait(20)
        self.maximize_window()
        return self

    def close(self) -> None:
        ''' close handler for exceptions '''
        try:
            return super().close()
        except TypeError:
            pass

    def restart_driver(self):
        ''' save opened url and reload Driver Object '''
        LOGGER.info('Restarting driver...')
        last_opened_url = self.current_url
        self.exit()
        self = self.get_rootdriver()
        self.get(last_opened_url)

    def exit(self):
        self.close()
        self.quit()

    def __exit__(self, *exc):
        self.quit()

if __name__ == '__main__':
    with RootChromeDriver().get_rootdriver() as rootdriver:
        input('Exit...')
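
A possible alternative to re-running `__init__` on the same object, shown only as a sketch: keep the driver behind a small holder that builds a fresh instance on restart. `DriverHolder`, `FakeDriver`, and the factory argument are hypothetical names; `FakeDriver` stands in for a real WebDriver so the example runs without a browser. The `Rebinder` class illustrates that assigning to `self` inside a method rebinds only a local name.

```python
from typing import Callable


class FakeDriver:
    """Stand-in for a real WebDriver (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.current_url = "about:blank"
        self.quit_called = False

    def get(self, url: str) -> None:
        self.current_url = url

    def quit(self) -> None:
        self.quit_called = True


class DriverHolder:
    """Owns the current driver and swaps in a new one on restart."""

    def __init__(self, factory: Callable[[], FakeDriver]) -> None:
        self._factory = factory
        self.driver = factory()

    def restart(self) -> None:
        last_url = self.driver.current_url
        self.driver.quit()             # end the old session once
        self.driver = self._factory()  # rebind the attribute, not `self`
        self.driver.get(last_url)      # return to the page that was open


class Rebinder:
    """Shows that `self = ...` changes only the method's local name."""

    def replace_self(self) -> None:
        self = Rebinder()  # noqa: F841 - the caller's reference is unaffected


holder = DriverHolder(FakeDriver)
holder.driver.get("https://example.com/")
old_driver = holder.driver
holder.restart()

rebinder = Rebinder()
same_object = rebinder
rebinder.replace_self()
```

Callers would then always go through `holder.driver`, so no code keeps a reference to a driver object whose session has already ended.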

OLX bot manager (I apologize in advance for the messy code and incorrect writing, or rather for the lack of professionalism.)

class OLXBotManager:
    def __init__(self, rootdriver, login='', password=''):
        self.rootdriver: RootChromeDriver = rootdriver
        # open olx
        self._login(login, password) 
        self._close_dialogue_window()  # close dialogue window on cabinet link
        self.site_objects_is_not_init = True

    def _close_dialogue_window(self):
        try:
            btn = self.rootdriver._wait_presence_xpath_elem(
                xpath=OlxCabinetXPATH.BTN_CLOSE_DIALOGUE)
        except TimeoutException:
            pass
        else:
            btn.click()

    def _filter_balance(self, text):
        try:
            balance = float(
                    ''.join(list(filter(
                        lambda s: s in ['.', ','] or s.isdigit(), text))
                    ).replace(',', '.')
                )
        except ValueError:
            balance = None
        return balance

    def _get_balance(self):
        ''' go to cabinet url and wait wallet balance elem '''

        # open cabinet url
        self.rootdriver.get(OlxURL.CABINET)

        for load_attempt in range(2):  # attempts to extract balance
            # get balance from front elem
            for balance_xpath in OlxCabinetXPATH.FRONT_WALLET_BALANCE_WRAPPERS:
                try:
                    elem_text = self.rootdriver._restart_wrapper('_wait_presence_xpath_elem', balance_xpath, timeout=10).text
                except TimeoutException:
                    balance = None
                else:
                    elem_text = elem_text.split('\n')[0]
                    balance = self._filter_balance(elem_text)
                    if balance:
                        break  # exit from loop

        return balance

    def _login(self, login='', password=''):
        ''' login in the user cabinet '''

        def wait_in_url(subpath: str):
            # func for check subpath of url
            flag = self.rootdriver._wait(timeout=60).until(
                EC.url_contains(subpath))
            return flag

        # Open URL
        self.rootdriver.get(OlxURL.CABINET)

        # If we in the account
        if wait_in_url(OlxURL.SUB_PATH_CABINET):
            return True

        # Login process

        # if the user has gived the acc data, insert them
        if login and password:
            try:
                # find login form
                login_input =  self.rootdriver._wait_presence_xpath_elem(
                    OlxLoginFormXPATH.LOGIN_INPUT)
                password_input = self.rootdriver._wait_presence_xpath_elem(
                    OlxLoginFormXPATH.PASSWORD_INPUT)
                btn_submit = self.rootdriver._wait_presence_xpath_elem(
                    OlxLoginFormXPATH.BTN_SUBMIT)
                # send keys
                login_input.send_keys(login)
                sleep(1)
                password_input.send_keys(password)
                sleep(1)
                btn_submit.click()
                sleep(1)
            except (TimeoutException, NoSuchElementException, StaleElementReferenceException) as ex:
                pass

        if wait_in_url('myaccount'):
            return True

        raise ValueError ('error login to olx')

    def _init_site_objects(self):
        ''' create main objects for control ads by user 

        :self.ad_managers: - ad managers with inventories
        :self.ad_managers['type'].inventory: - inventory of ad manager
        '''
        if self.site_objects_is_not_init:
            self.ad_manager = AdManager(self, self.rootdriver)
            self.site_objects_is_not_init = False

    def exit(self):
        self.rootdriver.close()
        self.rootdriver.quit()

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.exit()
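
`quit()` ends the session and closes every window, so a preceding `close()` is not required for shutdown. The following sketch, using the hypothetical names `SafeExit` and `FakeDriver`, sends `quit()` at most once and skips it when the driver reports no `session_id`. It is an assumption about how the exit path could be hardened, not the project's code.

```python
class FakeDriver:
    """Stand-in for a real WebDriver (hypothetical, for illustration only)."""

    def __init__(self, session_id):
        self.session_id = session_id
        self.calls = []

    def quit(self):
        self.calls.append("quit")


class SafeExit:
    """Context manager that shuts the driver down at most once."""

    def __init__(self, driver):
        self.driver = driver
        self._closed = False

    def exit(self) -> bool:
        """Return True only if a quit command was actually sent."""
        if self._closed:
            return False
        self._closed = True
        if getattr(self.driver, "session_id", None) is None:
            # No live session: sending commands would fail, so skip them.
            return False
        self.driver.quit()
        return True

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.exit()


live = FakeDriver("abc123")
with SafeExit(live) as manager:
    pass
second_call = manager.exit()  # already closed, so nothing is sent

dead = FakeDriver(None)
dead_result = SafeExit(dead).exit()
```

The private `_closed` flag is used instead of relying on the driver to clear its own state, since the sketch makes no assumption about what a real `quit()` does to `session_id`.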

How can we reproduce the issue?

Hmm. I think you could try triggering a driver restart. I will attach the code of the root_chromedriver file; I have also provided the part of the code where the error occurred, and I think you could make it restart every time. If needed, you can write to me and I will reply and provide the necessary data. I will also attach the OLX bot manager. (I apologize in advance for the messy code and incorrect writing, or rather for the lack of professionalism.)

Relevant log output

2024-04-12 08:20:59,757 | olx_models.ad_manager | ad_manager.py:274 | select_ad       | INFO     | curr target ad id - xxxxxxxx (hide) is not equal with finded ad id - xxxxxxxx (hide), set
2024-04-12 08:20:59,764 | olx_models.ad_manager | ad_manager.py:156 | _wait_service_is_not_buyed | INFO     | curr page - https://... is not equal promote page - https://... (hide) get...
2024-04-12 08:21:04,898 | olx_models.ad_manager | ad_manager.py:167 | _wait_service_is_not_buyed | INFO     | define max attempts - 24
2024-04-12 08:21:04,899 | olx_models.ad_manager | ad_manager.py:171 | _wait_service_is_not_buyed | INFO     | try to get disabled attribute in the service xpath
2024-04-12 08:21:45,943 | olx_models.ad_manager | ad_manager.py:175 | _wait_service_is_not_buyed | ERROR    | timeout exception for attr include method
2024-04-12 08:21:45,943 | root_chromedriver.root_chromedriver | root_chromedriver.py:170 | restart_driver  | INFO     | Restarting driver...
2024-04-12 08:21:48,020 | WDM      | logger.py:11 | log             | INFO     | ====== WebDriver manager ======
2024-04-12 08:21:48,795 | WDM      | logger.py:11 | log             | INFO     | Get LATEST chromedriver version for google-chrome
2024-04-12 08:21:49,427 | WDM      | logger.py:11 | log             | INFO     | Get LATEST chromedriver version for google-chrome
2024-04-12 08:21:50,019 | WDM      | logger.py:11 | log             | INFO     | Driver [C:\Users\Tkach\.wdm\drivers\chromedriver\win64\123.0.6312.122\chromedriver-win32/chromedriver.exe] found in cache
2024-04-12 12:21:00,246 | olx_models.ad_manager | ad_manager.py:274 | select_ad       | INFO     | curr target ad id - 341543703 is not equal with finded ad id - xxxxxxxx (hide), set
2024-04-12 14:26:00,161 | olx_models.ad_manager | ad_manager.py:274 | select_ad       | INFO     | curr target ad id - 313592143 is not equal with finded ad id - xxxxxxxx (hide), set
2024-04-12 14:26:00,162 | __main__ | main.py:121 | <module>        | ERROR    | main ex
Traceback (most recent call last):
  File "D:\programming projects\python\olx scripts\excel\main.py", line 119, in <module>
    main()
  File "D:\programming projects\python\olx scripts\excel\main.py", line 79, in main
    with opb.olx_bot.OLXBotManager(rootdriver=rootdriver) as olx_bot_manager:
  File "D:\programming projects\python\olx scripts\excel\..\olx_promotions_bot\olx_bot.py", line 156, in __exit__
    self.exit()
  File "D:\programming projects\python\olx scripts\excel\..\olx_promotions_bot\olx_bot.py", line 148, in exit
    self.rootdriver.close()
  File "C:\Users\Tkach\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 459, in close
    self.execute(Command.CLOSE)
  File "C:\Users\Tkach\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 346, in execute
    response = self.command_executor.execute(driver_command, params)
  File "C:\Users\Tkach\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\remote\remote_connection.py", line 293, in execute
    path = string.Template(path_string).substitute(params)
  File "C:\Users\Tkach\AppData\Local\Programs\Python\Python310\lib\string.py", line 121, in substitute
    return self.pattern.sub(convert, self.template)
  File "C:\Users\Tkach\AppData\Local\Programs\Python\Python310\lib\string.py", line 114, in convert
    return str(mapping[named])
TypeError: 'NoneType' object is not subscriptable


### Operating System

Windows 11 Pro 23H2 (build 22631.3447)

### Selenium version

Python Selenium 4.15.2

### What are the browser(s) and version(s) where you see this issue?

123.0.6312.122 (Official Build) (64-bit) (cohort: M123 Rollout)

### What are the browser driver(s) and version(s) where you see this issue?

Driver [...\.wdm\drivers\chromedriver\win64\123.0.6312.122\chromedriver-win32/chromedriver.exe]

### Are you using Selenium Grid?

No

github-actions[bot] commented 5 months ago

@Den4kCreator, thank you for creating this issue. We will troubleshoot it as soon as we can.


Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

diemol commented 5 months ago

Unfortunately, we cannot troubleshoot your code or identify the issue. If you are facing a problem, please provide a reduced test case that can be used to triage this issue.

github-actions[bot] commented 5 months ago

Hi, @Den4kCreator. Please follow the issue template, we need more information to reproduce the issue.

Either a complete code snippet and URL/HTML (if more than one file is needed, provide a GitHub repo and instructions to run the code), the specific versions used, or a more detailed description to help us understand the issue.

Note: If you cannot share your code and URL/HTML, any complete code snippet and URL/HTML that reproduces the issue is good enough.

Reply to this issue when all information is provided, thank you.

diemol commented 5 months ago

I will close this because we didn't get any more information.

github-actions[bot] commented 4 months ago

This issue has been automatically locked since there has not been any recent activity since it was closed. Please open a new issue for related bugs.