MatthewChatham / glassdoor-review-scraper

Scrape reviews from Glassdoor
BSD 2-Clause "Simplified" License

No such element: Unable to locate element: {"method":"css selector","selector":".next"} #33

Closed: beliakon closed this issue 4 years ago

beliakon commented 4 years ago

I am not able to understand this error. Could someone help me?

DevTools listening on ws://127.0.0.1:49170/devtools/browser/91092bb7-fd9b-493e-814e-fece11203277
2019-11-22 12:57:28,971 INFO 423 :main.py(34780) - Scraping up to 15 reviews.
2019-11-22 12:57:28,993 INFO 361 :main.py(34780) - Signing in to blah@blah.com
2019-11-22 12:57:38,056 INFO 342 :main.py(34780) - Navigating to company reviews
2019-11-22 12:57:49,113 INFO 286 :main.py(34780) - Extracting reviews from page 1
2019-11-22 12:57:49,160 INFO 291 :main.py(34780) - Found 10 reviews on page 1
2019-11-22 12:57:49,518 INFO 297 :main.py(34780) - Scraped data for "Growth through challenge"(Mon Mar 11 2019 08:15:39 GMT+0200 (Eastern European Standard Time))
2019-11-22 12:57:50,343 INFO 297 :main.py(34780) - Scraped data for "Coming to work here was the best decision ever."(Fri Aug 24 2018 20:34:04 GMT+0300 (Eastern European Summer Time))
2019-11-22 12:57:50,955 INFO 297 :main.py(34780) - Scraped data for "I am a buisness phone banker"(Thu Nov 21 2019 07:36:02 GMT+0200 (Eastern European Standard Time))
2019-11-22 12:57:51,531 INFO 297 :main.py(34780) - Scraped data for "Wells Fargo a fine place to work."(Wed Nov 20 2019 05:41:15 GMT+0200 (Eastern European Standard Time))
2019-11-22 12:57:52,044 INFO 297 :main.py(34780) - Scraped data for "Premier banker"(Tue Nov 19 2019 22:03:04 GMT+0200 (Eastern European Standard Time))
2019-11-22 12:57:52,559 INFO 297 :main.py(34780) - Scraped data for "Good Place"(Tue Nov 19 2019 10:37:48 GMT+0200 (Eastern European Standard Time))
2019-11-22 12:57:53,082 INFO 297 :main.py(34780) - Scraped data for "Great Environment"(Tue Nov 19 2019 13:48:19 GMT+0200 (Eastern European Standard Time))
2019-11-22 12:57:53,490 INFO 297 :main.py(34780) - Scraped data for "Amazing"(Tue Nov 19 2019 15:36:25 GMT+0200 (Eastern European Standard Time))
2019-11-22 12:57:53,877 INFO 297 :main.py(34780) - Scraped data for "Wonderful Environment to Grow and Learn"(Mon Nov 18 2019 19:29:21 GMT+0200 (Eastern European Standard Time))
2019-11-22 12:57:54,258 INFO 297 :main.py(34780) - Scraped data for "Very corporate but good job overall"(Mon Nov 18 2019 17:45:43 GMT+0200 (Eastern European Standard Time))
2019-11-22 12:57:54,294 INFO 326 :main.py(34780) - Going to page 2
Traceback (most recent call last):
  File "main.py", line 465, in <module>
    main()
  File "main.py", line 453, in main
    go_to_next_page()
  File "main.py", line 330, in go_to_next_page
    'next').find_element_by_tag_name('a')
  File "C:\Users\E20008699\AppData\Local\Continuum\anaconda3\lib\site-packages\selenium\webdriver\remote\webelement.py", line 398, in find_element_by_class_name
    return self.find_element(by=By.CLASS_NAME, value=name)
  File "C:\Users\E20008699\AppData\Local\Continuum\anaconda3\lib\site-packages\selenium\webdriver\remote\webelement.py", line 659, in find_element
    {"using": by, "value": value})['value']
  File "C:\Users\E20008699\AppData\Local\Continuum\anaconda3\lib\site-packages\selenium\webdriver\remote\webelement.py", line 633, in _execute
    return self._parent.execute(command, params)
  File "C:\Users\E20008699\AppData\Local\Continuum\anaconda3\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "C:\Users\E20008699\AppData\Local\Continuum\anaconda3\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":".next"}
  (Session info: headless chrome=78.0.3904.97)

beliakon commented 4 years ago

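Hi all. I was able to fix it. The old `next` class lookup no longer matches anything, so please replace more_pages and go_to_next_page in main.py with the versions below, which use the new pagination class names for paging_control.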

def more_pages():
    # True if the pagination control contains a "next" link to follow.
    # All lookups sit inside the try so that a missing pagination control
    # is reported as "no more pages" instead of raising.
    try:
        paging_control = browser.find_element_by_css_selector(
            '.eiReviews__EIReviewsPageContainerStyles__pagination.noTabover.mt')
        next_ = paging_control.find_element_by_class_name(
            'pagination__PaginationStyle__next')
        next_.find_element_by_tag_name('a')
        return True
    except selenium.common.exceptions.NoSuchElementException:
        return False


def go_to_next_page():
    logger.info(f'Going to page {page[0] + 1}')
    paging_control = browser.find_element_by_class_name(
        'pagination__PaginationStyle__pagination')
    next_ = paging_control.find_element_by_class_name(
        'pagination__PaginationStyle__next').find_element_by_tag_name('a')
    # Load the next page directly from the link's href, then give it time to render.
    browser.get(next_.get_attribute('href'))
    time.sleep(1)
    page[0] = page[0] + 1
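
For anyone hitting this on a newer Selenium: the `find_element_by_*` helpers used above were removed in Selenium 4.3, so the fix needs a small port. Below is a minimal sketch of one way to do that, not the repository's actual code. It assumes the class name `pagination__PaginationStyle__next` from the fix above still matches Glassdoor's markup (it may well have changed again), and it passes `browser` and `page` in as arguments instead of using the globals in main.py. The helper name `find_next_link` and the `delay` parameter are made up for illustration. It uses `find_elements` (plural), which returns an empty list when nothing matches, so no exception handling is needed.

```python
import time

# Same string value as selenium.webdriver.common.by.By.CSS_SELECTOR.
CSS = "css selector"

# Assumption: class name taken from the fix above; Glassdoor may have
# renamed it since, in which case only this constant needs updating.
NEXT_LINK = ".pagination__PaginationStyle__next a"


def find_next_link(browser):
    """Return the next-page <a> element, or None if there is none."""
    # find_elements returns [] rather than raising NoSuchElementException.
    links = browser.find_elements(CSS, NEXT_LINK)
    return links[0] if links else None


def more_pages(browser):
    """True if the current page has a next-page link."""
    return find_next_link(browser) is not None


def go_to_next_page(browser, page, delay=1.0):
    """Follow the next-page link; return False when there is none."""
    link = find_next_link(browser)
    if link is None:
        return False
    browser.get(link.get_attribute("href"))
    time.sleep(delay)  # same pause as the original, to let the page load
    page[0] += 1
    return True
```

With this shape, the scraping loop can stop cleanly on the last page by checking the return value of `go_to_next_page(browser, page)` instead of crashing with the traceback shown at the top of this issue.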