SeleniumHQ / selenium

A browser automation framework and ecosystem.
https://selenium.dev
Apache License 2.0

[🐛 Bug]: Selenium can get full html page source, but cannot get any element #14058

Closed softputer closed 3 weeks ago

softputer commented 1 month ago

What happened?

I use Selenium to open a web page. The script sleeps for 10 seconds to wait for the page to load, and the page has fully loaded by then. I can print the page source with all elements, but when I try to find any element with Selenium, I get selenium.common.exceptions.NoSuchElementException. I cannot even find the html or body element. When I switch to a different website, the same code works and I can get the html and body elements, so I am not sure whether this website is doing something to cause it. Chrome: 125.0.6422.113, ChromeDriver: 125.0.6422.78

How can we reproduce the issue?

Probably not, because it is a private website inside our company.

Relevant log output

selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//*"}
  (Session info: chrome=125.0.6422.113); For documentation on this error, please visit: https://www.selenium.dev/documentation/webdriver/troubleshooting/errors#no-such-element-exception
Stacktrace:
0   chromedriver                        0x0000000102722510 chromedriver + 4302096
1   chromedriver                        0x000000010271ae58 chromedriver + 4271704
2   chromedriver                        0x000000010234c19c chromedriver + 278940
3   chromedriver                        0x000000010238e2c4 chromedriver + 549572
4   chromedriver                        0x00000001023c6c5c chromedriver + 781404
5   chromedriver                        0x0000000102383004 chromedriver + 503812
6   chromedriver                        0x00000001023839ec chromedriver + 506348
7   chromedriver                        0x00000001026ea558 chromedriver + 4072792
8   chromedriver                        0x00000001026ef004 chromedriver + 4091908
9   chromedriver                        0x00000001026d179c chromedriver + 3970972
10  chromedriver                        0x00000001026ef8ec chromedriver + 4094188
11  chromedriver                        0x00000001026c471c chromedriver + 3917596
12  chromedriver                        0x000000010270cb50 chromedriver + 4213584
13  chromedriver                        0x000000010270cccc chromedriver + 4213964
14  chromedriver                        0x000000010271aa50 chromedriver + 4270672
15  libsystem_pthread.dylib             0x0000000180c5a034 _pthread_start + 136
16  libsystem_pthread.dylib             0x0000000180c54e3c thread_start + 8

Operating System

macos

Selenium version

4.20.0

What are the browser(s) and version(s) where you see this issue?

Chrome: 125.0.6422.113

What are the browser driver(s) and version(s) where you see this issue?

chrome driver: 125.0.6422.78

Are you using Selenium Grid?

no

github-actions[bot] commented 1 month ago

@softputer, thank you for creating this issue. We will troubleshoot it as soon as we can.


Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then the I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

github-actions[bot] commented 1 month ago

Hi, @softputer. Please follow the issue template, we need more information to reproduce the issue.

Either a complete code snippet and URL/HTML (if more than one file is needed, provide a GitHub repo and instructions to run the code), the specific versions used, or a more detailed description to help us understand the issue.

Note: If you cannot share your code and URL/HTML, any complete code snippet and URL/HTML that reproduces the issue is good enough.

Reply to this issue when all information is provided, thank you.

softputer commented 1 month ago

For this code:

    print(driver.page_source)
    print(driver.current_url)
    print(driver.get_cookies())
    login_form = driver.find_element(By.XPATH, "/html")
    print(login_form)

I can get the page source with all elements, but I cannot get the html element. The log is:

DEBUG:selenium.webdriver.remote.remote_connection:Remote response: status=200 | data={"value":"https://103.115.79.123/default#/tenant/login"} | headers=HTTPHeaderDict({'Content-Length': '56', 'Content-Type': 'application/json; charset=utf-8', 'cache-control': 'no-cache'})
DEBUG:selenium.webdriver.remote.remote_connection:Finished Request
DEBUG:selenium.webdriver.remote.remote_connection:POST http://localhost:58993/session/7a645f924f018ac4699b222d058445e4/element {'using': 'xpath', 'value': '/html'}
DEBUG:urllib3.connectionpool:http://localhost:58993 "POST /session/7a645f924f018ac4699b222d058445e4/element HTTP/1.1" 404 0
DEBUG:selenium.webdriver.remote.remote_connection:Remote response: status=404 | data={"value":{"error":"no such element","message":"no such element: Unable to locate element:
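
DEBUG lines like the ones above come from Python's standard logging module. A minimal sketch of turning them on, assuming the logger names `selenium` and `urllib3` (read off the prefixes visible in the log) and a hypothetical helper name `enable_debug_logging`:

```python
import logging


def enable_debug_logging(names=("selenium", "urllib3")):
    """Send DEBUG records from the given loggers to stderr.

    The default names are taken from the log prefixes shown in this thread
    (selenium.webdriver.remote.remote_connection, urllib3.connectionpool);
    the helper itself is hypothetical.
    """
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
    loggers = []
    for name in names:
        logger = logging.getLogger(name)
        logger.setLevel(logging.DEBUG)  # child loggers inherit this level
        logger.addHandler(handler)
        loggers.append(logger)
    return loggers


# Call once, before creating the driver, so every request is captured.
loggers = enable_debug_logging()
```

With this in place, each WebDriver command and its HTTP response should show up in the output, which makes it possible to confirm what was actually sent to the driver.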

titusfortner commented 1 month ago

There's no bug in Selenium that would cause this issue, and I haven't seen any other questions or issues like this, so I can't offer a suggestion based on them.

The only thing I can think of is a timing problem. If you sleep or wait for that element to be present, does it work?

If it's not that, then there's something weird about the page you're trying to access, and if you can't reproduce this on a page that isn't proprietary, we can't even offer suggestions about how to work with it.
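
The timing check suggested above can be sketched as a small polling loop. This is a hand-rolled illustration, not Selenium's own implementation; in real code Selenium's WebDriverWait provides the same kind of polling. The helper name `wait_for_any_element` and its defaults are assumptions.

```python
import time


def wait_for_any_element(driver, timeout=10.0, poll=0.5):
    """Poll until find_elements returns at least one element or time runs out.

    Hypothetical helper for illustration. "css selector" is the string
    value of By.CSS_SELECTOR, so no Selenium import is needed here.
    """
    deadline = time.monotonic() + timeout
    while True:
        elements = driver.find_elements("css selector", "*")
        if elements:
            return elements
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no elements found within {timeout} seconds")
        time.sleep(poll)
```

If this still times out while the page is visibly rendered, timing is unlikely to be the cause.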

softputer commented 3 weeks ago

It is a simple website. Maybe we can meet over Zoom so I can show you the code? The website is not publicly accessible, but I can arrange a Zoom meeting.

diemol commented 3 weeks ago

Note: If you cannot share your code and URL/HTML, any complete code snippet and URL/HTML that reproduces the issue is good enough.

softputer commented 3 weeks ago

Sure, let me give some more info.

  1. The web page is a login page that looks like this:

     [Screenshot 2024-06-13 17 16 33]

     The two inputs are Account and Password.

  2. The XPath of the account input element I want to locate is:

     /html/body/div[1]/div/div/div/div[1]/div[1]/div[2]/label/div/div/div/input

  3. When I save the page locally and open file:///***.html with Selenium, I can get the input and send keys.

  4. But when Selenium opens the website address in Chrome and sleeps for 300 seconds, I can find the account input element with Chrome's element inspector, but Selenium cannot locate it.

  5. The id of the account and password input elements changes each time I access the website.

  6. Below are the options I added for Selenium to open Chrome:

options.add_argument('--remote-debugging-port=9222')
options.add_experimental_option('excludeSwitches', ['enable-automation'])
options.add_experimental_option('useAutomationExtension', False)
options.add_argument("--disable-blink-features=AutomationControlled")
options.add_argument('--enable-logging')
options.add_argument('--disable-infobars')  # disable the infobar
options.add_argument('--disable-extensions')  # disable extensions
options.add_argument('--disable-popup-blocking')
options.add_argument('disable-javascript')
driver = webdriver.Chrome(options=options)
driver.execute_script("Object.defineProperty(navigator, 'webdriver', {get: () => undefined})")

So is the website running some JavaScript that prevents Selenium from getting the element?

diemol commented 3 weeks ago

OK, so please share the code snippet we can use to reproduce the issue. Narratives are ambiguous.

softputer commented 3 weeks ago

Below is the full code, but I don't think you can access the website: there is a whitelist and we cannot open it to others. I can show you over Zoom or make a video for you.

[test.txt](https://github.com/user-attachments/files/15817982/test.txt)

import os
import time
import json

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException
from selenium.common.exceptions import TimeoutException

options = Options()
options.add_argument('--remote-debugging-port=9222')
options.add_experimental_option('excludeSwitches', ['enable-automation'])
options.add_experimental_option('useAutomationExtension', False)
options.add_argument("--disable-blink-features=AutomationControlled")
options.add_argument('--enable-logging')
options.add_argument('--disable-infobars')  # disable the infobar
options.add_argument('--disable-extensions')  # disable extensions
options.add_argument('--disable-popup-blocking')
options.add_argument('disable-javascript')
driver = webdriver.Chrome(options=options)
driver.execute_script("Object.defineProperty(navigator, 'webdriver', {get: () => undefined})")

url = 'https://103.115.79.123/default#/tenant/login'
file_path = "/tmp/result.json"

def save_result(result):
    with open(file_path, "w") as file:
        json.dump(result, file)

def main():
    driver.set_page_load_timeout(10)
    try:
        driver.get(url)
    except Exception as e:
        message = f"Failed to load page: {type(e).__name__} - {str(e)}"
        result = {
            "status": "failure",
            "message": message
        }
        save_result(result)

    details_btn = driver.find_element(By.ID, 'details-button')
    details_btn.click()
    link_element = driver.find_element(By.PARTIAL_LINK_TEXT, "103.115.79.123")
    link_element.click()

    try:
        # wait for the login page
        wait = WebDriverWait(driver, 300)
        wait.until(EC.presence_of_element_located((By.XPATH, "/html/body/div[1]/div/div/div/div[1]/div[1]/div[2]/label/div/div/div/input")))
    except Exception as e:
        message = f"Failed to load page: {type(e).__name__} - {str(e)}"
        result = {
            "status": "failure",
            "message": message
        }
        save_result(result)

    user_input = driver.find_element(By.XPATH, "/html/body/div[1]/div/div/div/div[1]/div[1]/div[2]/label/div/div/div/input")
    user_input.send_keys("aaaa")

    time.sleep(20)

if __name__ == "__main__":
    main()

titusfortner commented 3 weeks ago

So is the website running some JavaScript that prevents Selenium from getting the element?

This is possible

XPATH of account input element i want to locate is:

But the issue is that you can't access any element on the page, regardless of locator? What happens when you call find_elements with a CSS selector of "*"? Does it find anything?

Regardless, an absolute XPath is going to be a challenge to maintain. Try https://selectorshub.com/ or something similar to find a less brittle option.
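
As one illustration of a less brittle locator: the page source posted later in this thread shows the account input carrying a `data-cy="account"` attribute, while its `id` changes on every visit. A minimal sketch, assuming that attribute is stable across page loads (not confirmed in the thread). The trimmed markup and the helper name `find_by_data_cy` are hypothetical, and the standard library's ElementTree stands in for a browser so the lookup can be tried offline.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed stand-in for the login form (hypothetical markup).
SAMPLE = """
<html><body>
  <div id="q-app">
    <label for="f_8f344d93">
      <div><div><div>
        <input id="f_8f344d93" data-cy="account" type="text" />
      </div></div></div>
    </label>
  </div>
</body></html>
""".strip()

# Locator pairs in the (strategy, value) shape that find_element accepts.
ACCOUNT_BY_CSS = ("css selector", 'input[data-cy="account"]')
ACCOUNT_BY_XPATH = ("xpath", '//input[@data-cy="account"]')


def find_by_data_cy(root, name):
    """Return the first <input> whose data-cy attribute equals name, or None."""
    return root.find(f".//input[@data-cy='{name}']")


root = ET.fromstring(SAMPLE)
account = find_by_data_cy(root, "account")
print(account.get("type"))
```

With a live driver the same idea would be `driver.find_element(*ACCOUNT_BY_CSS)`, which does not depend on the nesting depth of the surrounding div elements or on the generated id.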

What if you just use default options instead of adding a bunch of arguments?

softputer commented 3 weeks ago

I changed the code a little: I removed all the options and called find_elements:

    css_elements = driver.find_elements(By.CSS_SELECTOR, '*')
    print(css_elements)

The result is: []

The full code is:

import os
import time
import json

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException
from selenium.common.exceptions import TimeoutException

options = Options()
driver = webdriver.Chrome(options=options)

url = 'https://103.115.79.123/default#/tenant/login'
file_path = "/tmp/result.json"

def save_result(result):
    with open(file_path, "w") as file:
        json.dump(result, file)

def main():
    driver.set_page_load_timeout(10)
    try:
        driver.get(url)
    except Exception as e:
        message = f"Failed to load page: {type(e).__name__} - {str(e)}"
        result = {
            "status": "failure",
            "message": message
        }
        save_result(result)

    details_btn = driver.find_element(By.ID, 'details-button')
    details_btn.click()
    link_element = driver.find_element(By.PARTIAL_LINK_TEXT, "103.115.79.123")
    link_element.click()
    time.sleep(30)
    css_elements = driver.find_elements(By.CSS_SELECTOR, '*')
    print(css_elements)

    try:
        # wait for the login page
        wait = WebDriverWait(driver, 30)
        wait.until(EC.presence_of_element_located((By.XPATH, "/html/body/div[1]/div/div/div/div[1]/div[1]/div[2]/label/div/div/div/input")))
    except Exception as e:
        message = f"Failed to load page: {type(e).__name__} - {str(e)}"
        result = {
            "status": "failure",
            "message": message
        }
        save_result(result)

    user_input = driver.find_element(By.XPATH, "/html/body/div[1]/div/div/div/div[1]/div[1]/div[2]/label/div/div/div/input")
    user_input.send_keys("aaaa")

    time.sleep(20)

if __name__ == "__main__":
    main()

titusfortner commented 3 weeks ago

Yeah, it has to be something odd about that page. You can get the page source? Can you verify that the HTML is well formed (there should be online tools available)? You can turn on logging so we can verify Selenium is sending the right commands, but this sounds like ChromeDriver doesn't like something about that page.
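
One offline way to run the well-formedness check, as a rough sketch using only Python's standard library. It only flags mismatched or unclosed tags and is far less thorough than a real validator; the names `TagBalanceChecker` and `check_html` are hypothetical.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TagBalanceChecker(HTMLParser):
    """Track open tags on a stack and record end tags that do not match."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            line = self.getpos()[0]
            self.problems.append(f"unexpected </{tag}> at line {line}")


def check_html(source):
    """Return a list of problem descriptions; an empty list means balanced.

    Note: HTML allows some end tags to be omitted (for example </p>),
    and this sketch reports those as unclosed too.
    """
    checker = TagBalanceChecker()
    checker.feed(source)
    checker.close()
    problems = list(checker.problems)
    problems.extend(f"unclosed <{tag}>" for tag in checker.stack)
    return problems
```

A call such as `check_html(driver.page_source)` would then list anything suspicious in the markup the browser handed back.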

Oh, does it work in Firefox? If it works in Firefox then we can label this a chromedriver bug.
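
The Firefox comparison can be sketched as running the same element count in each browser. This is a minimal illustration under assumptions: the helper names `count_elements` and `compare_browsers` are hypothetical, and the sketch skips the certificate-warning clicks that the script in this thread performs before reaching the login page.

```python
def count_elements(driver, url):
    """Load url and return how many elements the driver can locate.

    "css selector" is the string value of By.CSS_SELECTOR.
    """
    driver.get(url)
    return len(driver.find_elements("css selector", "*"))


def compare_browsers(factories, url):
    """Run the same count in each browser.

    factories maps a label to a zero-argument callable that returns a
    new driver, for example {"chrome": webdriver.Chrome,
    "firefox": webdriver.Firefox} when Selenium is installed.
    """
    counts = {}
    for label, make_driver in factories.items():
        driver = make_driver()
        try:
            counts[label] = count_elements(driver, url)
        finally:
            driver.quit()  # always release the browser session
    return counts
```

A count of zero in one browser and a non-zero count in the other, for the same URL, would point at that browser's driver rather than at Selenium.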

softputer commented 3 weeks ago

I tried Firefox and it works, so maybe I will use Firefox for now. Below is the page source:

<html>
 <head></head>
 <body>
  <div id="q-app" data-v-app="">
   <div id="LoginPortal" class="window-height">
    <div class="genie-login full-height row items-center justify-center bg-grey-2">
     <div>
      <div class="login-content q-pa-lg bg-white">
       <h4 class="text-color-main q-mb-md q-mt-none text-center"><b id="logo-genie">sddd</b><span id="logo-analytics">Group</span><small id="imageVersion" class="block q-mt-xs">ANALYTICS-1.3.0-R</small></h4>
       <div class="q-item q-item-type row no-wrap items-start" role="listitem">
        <div class="q-item__section column q-item__section--main justify-center col-3 text-right field-label-color text-no-wrap">
         Account
        </div>
        <div class="q-item__section cumn q-item__section--main justify-center col">
         <label class="q-field row no-wrap items-start q-field--standard q-input q-field--dense q-pb-none on-right" for="f_8f344d93-ed03-407a-8008-957a13764f0a">
          <!---->
          <div class="q-field__inner relative-position col self-stretch">
           <div class="q-field__control relative-position row no-wrap text-color-main" tabindex="-1">
            <div class="q-field__control-container col relative-position row no-wrap q-anchor--skip">
             <input class="q-field__native q-placeholder" tabindex="0" id="f_8f344d93-ed03-407a-8008-957a13764f0a" data-cy="account" type="text" />
             <!---->
            </div>
           </div>
           <!---->
          </div>
          <!----></label>
        </div>
       </div>
       <div class="q-item q-item-type row no-wrap items-start" role="listitem">
        <div class="q-item__section column q-item__section--main justify-center col-3 text-right field-label-color text-no-wrap">
         Password
        </div>
        <diclass="q-item__section column="" q-item__section--main="" justify-center="" col"="">
         <label class="q-field row no-wrap items-start q-field--standard q-input q-field--dense q-pb-none on-right" for="f_a4728ee7-d52a-4219-9cd3-087dd0aa8b9e">
          <!---->
          <div class="q-field__inner relative-position col self-stretch">
           <div class="q-field__control relative-position row no-wrap text-color-main" tabindex="-1">
            <div class="q-field__control-container col relative-position row no-wrap q-anchor--skip">
             <input class="q-field__native q-placeholder" tabindex="0" id="f_a4728ee7-d52a-4219-9cd3-087dd0aa8b9e" data-cy="password" type="password" />
             <!---->
            </div>
            <div class="q-field__append q-field__marginal row no-wrap items-center">
             <i class="q-icon notranslate material-icons cursor-pointer" aria-hidden="true" role="presentation">visibility_off</i>
            </div>
           </div>
           <!---->
          </div>
          <!----></label>
        </diclass="q-item__section>
       </div>
      </div>
      <!---->
      <button class="q-btn q-btn-item non-selectable no-outline q-btn--unelevated q-btn--rectangle q-btn--square bg-color-main text-white q-btn--actionable q-focusable q-hoverable q-btn--no-uppercase q-btn--square block q-mx-auto no-hover q-mt-md" tabindex="0" type="button" id="login-submit" data-cy="login"><span class="q-focus-helper"></span><span class="q-btn__content text-center col items-center q-anchor--skip justify-center row"><span class="block">Log in</span></span>
       <!----></button>
     </div>
     <div class="q-py-sm q-px-md q-mt-lg text-center login-hint">
      <i class="q-icon text-negative notranslate material-ico q-mr-sm" aria-hidden="true" role="presentation">warning</i>
      <small> 
       <!----></small>
     </div>
    </div>
   </div>
  </div>
  <div id="q-notify" data-v-app="">
   <div class="q-notifications">
    <div class="q-notifications__list q-notifications__list--top fixed column no-wrap items-start"></div>
    <div class="q-notifications__list q-notifications__list--top fixed column no-wrap items-end"></div>
    <div class="q-notifications__list q-notifications__list--bottom fixed column no-wrap items-start"></div>
    <div class="q-notifications__list q-notifications__list--bottom fixed column no-wrap items-end"></div>
    <div class="q-notifications__list q-notifications__list--top fixed column no-wrap items-center"></div>
    <div class="q-notifications__list q-notifications__list--bottom fixed column no-wrap items-center"></div>
    <div class="q-notifications__list q-notifications__list--center fixed column no-wrap items-start justify-center"></div>
    <div class="q-notifications__list q-notifications__list--center fixed column no-wrap items-end justify-center"></div>
    <div class="q-notifications__list q-notifications__list--center fixed column no-wrap flex-center"></div>
   </div>
  </div> 
 </body>
</html>

github-actions[bot] commented 3 weeks ago

Hi, @softputer. This issue has been determined to require fixes in ChromeDriver.

You can see if the feature is passing in the Web Platform Tests.

If it is something new, please create an issue with the ChromeDriver team. Feel free to comment in this issue with links to any issues you raise. Thank you.