SeleniumHQ / selenium

A browser automation framework and ecosystem.
https://selenium.dev
Apache License 2.0

🐛 Bug Report: find_elements_by_xpath failing to get all elements for a given xpath or get_attribute failing to get attributes for every element #7845

Closed jm-willy closed 4 years ago

jm-willy commented 4 years ago

🐛 Bug Report

For a given XPath, find_elements_by_xpath returns only a fraction of the matching elements and misses others with the same XPath; alternatively, get_attribute fails to return the attribute for some elements but not for others. The XPaths are correct: all elements, including the missing ones and the ones whose attribute could not be retrieved, are highlighted in DevTools. The missing elements seem to always be the same ones, but I can't confirm that, and they are in normal Latin script.

Update: it also fails for the attribute "title", not just "href".

Update: find_elements_by_css_selector fails too, although it returns more elements than find_elements_by_xpath. find_elements_by_class_name probably fails as well, and the other find_elements_by_* methods don't return every element either.

To Reproduce

Detailed steps to reproduce the behavior:

  1. Go to Instagram
  2. Log in
  3. Click on your following
  4. Run the script to scroll and collect your following. It should collect each account not once but several times, because the script is redundant; nevertheless, some accounts are missing
  5. The list will be incomplete (probably the same happens for followers, since the XPath is the same)
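
The script from step 4 is not included in this thread. The following is only a minimal sketch of the scroll-and-collect loop those steps describe, written under one unconfirmed assumption: that the list renders its entries lazily, so a single find_elements call sees only part of it. The helper name collect_attribute_while_scrolling and its parameters are hypothetical; the two callables stand in for whatever locator and scroll action the real script uses.

```python
from typing import Callable, Iterable, List, Set


def collect_attribute_while_scrolling(
    find_elements: Callable[[], Iterable],
    scroll_once: Callable[[], None],
    attribute: str = "href",
    max_idle_passes: int = 3,
    max_passes: int = 200,
) -> List[str]:
    """Collect distinct attribute values over repeated find/scroll passes.

    Hypothetical helper, not part of Selenium. `find_elements` returns
    objects exposing get_attribute(name); `scroll_once` advances the list.
    """
    seen: Set[str] = set()
    ordered: List[str] = []
    idle_passes = 0
    for _ in range(max_passes):
        found_new = False
        for element in find_elements():
            try:
                value = element.get_attribute(attribute)
            except Exception:
                # The element may have been re-rendered since it was
                # located; skip it and pick it up again on a later pass.
                continue
            if value and value not in seen:
                seen.add(value)
                ordered.append(value)
                found_new = True
        # Stop once several consecutive passes bring nothing new.
        idle_passes = 0 if found_new else idle_passes + 1
        if idle_passes >= max_idle_passes:
            break
        scroll_once()
    return ordered


# Possible wiring with Selenium (FOLLOWING_XPATH and dialog are
# placeholders that this thread does not specify):
#
# hrefs = collect_attribute_while_scrolling(
#     find_elements=lambda: driver.find_elements_by_xpath(FOLLOWING_XPATH),
#     scroll_once=lambda: driver.execute_script(
#         "arguments[0].scrollTop = arguments[0].scrollHeight", dialog),
# )
```

Because values are read on every pass and deduplicated, an element that is missing from one pass can still be picked up by another. If accounts are still missing with a loop like this, that would point away from lazy rendering as the explanation.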

Environment

OS: Windows 7
Browser: Chrome
Browser version: 78.0.3904.108
Browser Driver version: ChromeDriver 78.0.3904.105
Language Bindings version: Python 3.8.0

diemol commented 4 years ago

@jm-willy Can you please provide an HTML page so we can use it to reproduce the issue? Please understand that we cannot use sites like Instagram for this; we won't create users and so on. Also, a complete script would be useful for troubleshooting this.

jm-willy commented 4 years ago

@diemol Thanks for replying! I just want to improve Selenium, because I have already found three alternative ways to get my followings. I don't think this goes against Instagram's ToS: it is not a scraper and it is not spam, it is just a way to know who doesn't follow you back. If you think the script could be used by scrapers, spammers, or any malicious user, just delete it, or let me fully delete it after the issue is closed. You don't have to create new Instagram users; just use your own Instagram account. Sorry, but I cannot provide the HTML anonymously; if you know how, I will upload it. I updated the script; it should work once the path and user_name strings are changed.

diemol commented 4 years ago

I see, but without an HTML page or a publicly accessible site to reproduce the issue, things get harder for us; please understand that. On a side note, Instagram has an API to do exactly what you want to do, so perhaps Selenium is not the right tool for your task.

jm-willy commented 4 years ago

How can I download the full HTML correctly to upload it here?

diemol commented 4 years ago

It does not need to be the HTML from the actual site you are using; it just needs to be something we can use to reproduce the issue.

jm-willy commented 4 years ago

Yes, but how could I get such HTML? Maybe DevTools > HTML > copy element, or right click > save as? I've never downloaded HTML before.

diemol commented 4 years ago

Yes, that is a good start. A simple Google search for "how to download an HTML page" should also help you. After downloading it, we would appreciate having the automated script that reproduces the issue using the downloaded HTML page.
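
One way to get such a page is to let the driver itself write out what it currently sees, then point the reproduction script at the saved file. The sketch below assumes a WebDriver instance is already on the page of interest; save_page_source is a hypothetical helper name, while page_source is the WebDriver property it relies on. The saved file will still contain whatever the page showed, so it may need to be trimmed or anonymized by hand before it is shared.

```python
from pathlib import Path


def save_page_source(driver, path):
    """Write the driver's current page source to `path`.

    Hypothetical helper. Returns a file:// URI that can be passed to
    driver.get() to load the snapshot without the original site.
    """
    target = Path(path)
    target.write_text(driver.page_source, encoding="utf-8")
    return target.resolve().as_uri()


# Possible usage (the XPath is a placeholder, not taken from this thread):
#
# uri = save_page_source(driver, "following_snapshot.html")
# driver.get(uri)
# elements = driver.find_elements_by_xpath("//a[@href]")
```

A snapshot like this captures markup only, so scripts and lazy loading on the live site will not behave the same way when the file is opened locally. If the missing elements show up in the snapshot, that would suggest the problem depends on the live page's behavior.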

jm-willy commented 4 years ago

@diemol I tried several ways to download the HTML, even with specialized software, and none worked.

diemol commented 4 years ago

I see, if there is no way to provide information to reproduce the issue, maybe you can join us at the Slack/IRC channel mentioned at https://selenium.dev/support/. Over there, other community members can help you build the data needed to reproduce the issue, or maybe find out that it is not a Selenium issue and it can be fixed in a different way.

I'll close this issue for now; feel free to open a new one when enough information is provided.

lock[bot] commented 4 years ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.