WayneRose-95 / Metacritic_Webscraper-

A side project on creating my own first web scraper.
GNU General Public License v3.0

Dealing with Timeout Exception #9

Closed WayneRose-95 closed 2 years ago

WayneRose-95 commented 2 years ago

Whilst running, the scraper occasionally times out and shows this error message:

[Screenshot: Timeout Exception]

One potential solution is to import TimeoutException from selenium.common.exceptions and catch it where the timeout occurs.
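As a rough illustration of that first suggestion, here is a minimal sketch of catching the exception. The helper name `find_or_none` is hypothetical and not part of the scraper, and the fallback class exists only so the snippet still runs where Selenium is not installed.

```python
try:
    from selenium.common.exceptions import TimeoutException
except ImportError:
    # Assumption: stand-in class so this sketch runs without Selenium installed.
    class TimeoutException(Exception):
        pass


def find_or_none(locate):
    """Call locate() and return its result, or None if it times out.

    `locate` is any zero-argument callable, for example a lambda that
    wraps a WebDriverWait(...).until(...) call.
    """
    try:
        return locate()
    except TimeoutException:
        # Swallow the timeout so the scrape can carry on with the next item.
        return None
```

In the scraper this might wrap the problematic line, for example `find_or_none(lambda: WebDriverWait(driver, 10).until(condition))`, where `driver` and `condition` are whatever the scraper already uses. Returning None rather than raising means the caller has to decide what to do with a missing element.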

Pagination was also a suggestion on how to solve this.

Both approaches need to be researched and tested before this error can be considered debugged.

WayneRose-95 commented 2 years ago

The timeout issue no longer occurs thanks to a simple try/except statement around the problematic line of code.

[Screenshot: Git issue #9 new lines]

However, this has opened up some new issues.

Most notably, some records are being missed:

[Screenshot: Git Issue #9 Missing Records]

And the last page is no longer being scraped:

[Screenshot: Git issue Page Switching Bug]
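The actual cause is not shown in this thread. One common source of both symptoms is a page loop that stops one short and silently drops pages that time out. A minimal sketch of a loop that avoids both, assuming pages are numbered from 1 to `last_page`; the names `scrape_all_pages`, `scrape_page` and `timeout_exc` are hypothetical:

```python
def scrape_all_pages(last_page, scrape_page, timeout_exc=TimeoutError):
    """Scrape pages 1..last_page inclusive and report which pages failed.

    scrape_page(page) is assumed to return a list of records for that page.
    timeout_exc is the exception type treated as a timeout; with Selenium
    this would be selenium.common.exceptions.TimeoutException.
    """
    records = []
    failed_pages = []
    # range() excludes its upper bound, so add 1 to include the last page.
    for page in range(1, last_page + 1):
        try:
            records.extend(scrape_page(page))
        except timeout_exc:
            # Remember the page instead of dropping it silently,
            # so it can be retried or inspected later.
            failed_pages.append(page)
    return records, failed_pages
```

Keeping a list of failed pages makes missing records visible: an empty list means every page was scraped, and a non-empty one says exactly which pages to retry.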

For reference, here is the method, sample_scraper, which is responsible for combining the other methods together to scrape pages.

[Screenshot: Git Issue #9 Code]

WayneRose-95 commented 2 years ago

The timeout exception and the page-switching method have both been dealt with. However, this raises another issue: dealing with the missing values present within the dataset.
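A first step could be measuring how many values are missing per field. A minimal sketch, assuming each scraped record is a dict; the helper `count_missing` and the field names in the usage note are hypothetical, not taken from the scraper:

```python
def count_missing(records, fields):
    """Return {field: number of records where the field is absent, None or empty}."""
    return {
        field: sum(1 for record in records if record.get(field) in (None, ""))
        for field in fields
    }
```

For example, `count_missing(records, ["title", "metascore"])` would report how many records lack each of those two fields, which helps to tell a scraping bug apart from data that is simply absent on the site.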