MatthewChatham / glassdoor-review-scraper

Scrape reviews from Glassdoor
BSD 2-Clause "Simplified" License
179 stars, 252 forks

No Such Element Exception #8

Open NKoenig06 opened 5 years ago

NKoenig06 commented 5 years ago

It's looking like there may have been element changes either in Selenium or on Glassdoor.

I'm not very familiar with Selenium, so I was wondering whether anyone else has seen this error:

selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"tag name","selector":"p"}

julsto93 commented 5 years ago

The issue seems to be that Glassdoor throws another login prompt at you. As a quick workaround, I set the sleep time before this step to 40 seconds and logged in manually; data extraction worked perfectly afterwards. It always happens after the line: browser.get(args.url)
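A minimal sketch of that workaround, assuming a Selenium-style `browser` object. Instead of a fixed 40-second sleep, it polls until some element that only exists on the logged-in reviews page shows up. The function name `wait_for_manual_login` and the `ready_selector` parameter are hypothetical, not part of the repo; the string "css selector" is the value behind Selenium's `By.CSS_SELECTOR`.

```python
import time


def wait_for_manual_login(browser, url, ready_selector, timeout=40.0, poll=1.0):
    """Open url, then wait until ready_selector matches or timeout elapses.

    Returns True once the element is present, False if the timeout is hit.
    ready_selector is an assumption: pick any CSS selector that only
    matches after you have logged in by hand.
    """
    browser.get(url)
    deadline = time.monotonic() + timeout
    while True:
        # find_elements returns an empty list (no exception) when nothing matches.
        if browser.find_elements("css selector", ready_selector):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

In the script this would stand in for the plain `browser.get(args.url)` call, for example `wait_for_manual_login(browser, args.url, ready_selector)`, with `ready_selector` chosen by inspecting the current page.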

NKoenig06 commented 5 years ago

Interesting, it's not breaking there for me. I'm always getting to the following point:

Then I get the error below. It seems to be saying that it doesn't recognize the method or selector. I adjusted the sleep time as you suggested, but nothing changed: I keep getting stuck at the same spot, even after trying multiple reviews pages.

Traceback (most recent call last):
  File "main.py", line 461, in <module>
    main()
  File "main.py", line 441, in main
    reviews_df = extract_from_page()
  File "main.py", line 295, in extract_from_page
    data = extract_review(review)
  File "main.py", line 281, in extract_review
    res[field] = scrape(field, review, author)
  File "main.py", line 264, in scrape
    return fdict[field](review)
  File "main.py", line 156, in scrape_years
    'reviewBodyCell').find_element_by_tag_name('p')
  File "/home/nick/miniconda3/envs/scraping/lib/python3.6/site-packages/selenium/webdriver/remote/webelement.py", line 305, in find_element_by_tag_name
    return self.find_element(by=By.TAG_NAME, value=name)
  File "/home/nick/miniconda3/envs/scraping/lib/python3.6/site-packages/selenium/webdriver/remote/webelement.py", line 659, in find_element
    {"using": by, "value": value})['value']
  File "/home/nick/miniconda3/envs/scraping/lib/python3.6/site-packages/selenium/webdriver/remote/webelement.py", line 633, in _execute
    return self._parent.execute(command, params)
  File "/home/nick/miniconda3/envs/scraping/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "/home/nick/miniconda3/envs/scraping/lib/python3.6/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"tag name","selector":"p"}
  (Session info: headless chrome=71.0.3578.98)
  (Driver info: chromedriver=2.45.615279 (12b89733300bd268cff3b78fc76cb8f3a7cc44e5),platform=Linux 4.15.0-46-generic x86_64)
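The traceback ends in scrape_years, where find_element_by_tag_name('p') raises because the review cell has no `p` child. One possible way to keep a single odd review from aborting the whole run is sketched below. It assumes that returning None for that field is acceptable. The helper name `text_or_none` is hypothetical, and "tag name" is the string behind Selenium's `By.TAG_NAME`.

```python
def text_or_none(parent, by, value):
    """Return the text of the first child matching (by, value), or None.

    Uses find_elements (plural), which returns an empty list rather than
    raising NoSuchElementException when nothing matches.
    """
    matches = parent.find_elements(by, value)
    return matches[0].text if matches else None
```

Inside scrape_years this could be called as `text_or_none(cell, "tag name", "p")`, where `cell` stands for whatever element the existing 'reviewBodyCell' lookup returns. It does not fix the underlying login or layout problem; it only turns a crash into a missing value.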

julsto93 commented 5 years ago

It could be that Glassdoor only requests your login once you switch to the next page. But in my experience, the important thing is that you log in a second time before you start scraping, the first time being the scripted login.

MatthewChatham commented 5 years ago

Hey all, sorry I don't have time to look into this right now. But it sounds like you're finding your way pretty well!

Without having done any investigation, my guess is that Glassdoor changed either the HTML structure of the site or their login flow, which is causing these errors. If we can diagnose the precise cause and get a PR for it, I'll approve it!

pyGideon commented 5 years ago

Hi NKoenig06, I was also hitting this "selenium.common.exceptions.NoSuchElementException" error at times! In my experience running this script, you don't have to change the sleep time (googling this error does turn up that kind of solution) or any other part of the code. Instead, close everything out and try again after some time. For me it then ran smoothly!

pyGideon commented 5 years ago

Let me know if you want to scrape reviews for a specific company and I will share them with you.

MatthewChatham commented 5 years ago

Sounds like nobody has been able to reproduce this? If so, I'll close the issue.

tomjneal commented 5 years ago

I've experienced this same issue, but it's not always a problem. Doesn't seem to happen every time.

batordavid commented 5 years ago

Replacing a few lines of code helped me.

Original (3 places in the code):
paging_control = browser.find_element_by_class_name('pagingControls')
Updated:
paging_control = browser.find_element_by_css_selector('.eiReviews__EIReviewsPageContainerStyles__pagination.noTabover.mt')

Original (2 places in the code):
next_ = paging_control.find_element_by_class_name('next')
Updated:
next_ = paging_control.find_element_by_class_name('pagination__PaginationStyle__next')
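Since Glassdoor's class names have changed at least once, a lookup that tries several candidates in order may be less brittle than hard-coding a single one. This is only a sketch: `find_first` is a hypothetical helper, and "class name" is the string behind Selenium's `By.CLASS_NAME`.

```python
def find_first(root, class_names):
    """Return the first element under root matching any class name, else None.

    class_names is tried in order, so list the newest name first and keep
    older names as fallbacks.
    """
    for name in class_names:
        # find_elements returns an empty list (no exception) when nothing matches.
        matches = root.find_elements("class name", name)
        if matches:
            return matches[0]
    return None
```

For the "next" button this might look like `next_ = find_first(paging_control, [new_next_class, 'next'])`, where `new_next_class` holds the updated class name from the comment above. The caller still has to handle a None result, for instance by treating it as the last page.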

Vineet-CSwiggy commented 5 years ago

Same issue here. I tried everything above, and it's still not working.

bartels50642 commented 3 years ago

Hi @pyGideon, would you be able to scrape reviews for a specific company for me?

NKoenig06 commented 3 years ago

If you look at the HTML layout of Glassdoor, a lot has changed. I think this repo would need to be updated to accommodate the HTML changes before it works again.

bartels50642 commented 3 years ago

Hi Nick,

I have gotten pretty far with the code after some minor tweaks, but when I run the "main" function I get a "No Such Element Exception" that looks like this:

NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":".pagination__PaginationStyle__page.pagination__PaginationStyle__current"}
  (Session info: chrome=86.0.4240.111)

I don't suppose you'd know a workaround for this?

JenelleMorgan42 commented 2 years ago

Hi @NKoenig06

I appreciate the tutorial in your article (https://nkoenig06.github.io/scrape-gd.html) on how to scrape online reviews. However, I am also running into problems with the repository you linked. I think the specific issue is that Glassdoor keeps hitting me with sign-up prompts, which the script is unable to handle. This sounds similar to the issue you experienced previously, so I'm curious whether you were able to resolve it.

If so, how?

Appreciate any help that you or anyone else can provide!