MatthewChatham / glassdoor-review-scraper

Scrape reviews from Glassdoor
BSD 2-Clause "Simplified" License
177 stars, 252 forks

Use multiprocessing #1

Open MatthewChatham opened 6 years ago

MatthewChatham commented 6 years ago

The script can operate in parallel on each page of 10 reviews.

To do this effectively, I need to pick a sensible number of workers (probably 2-6) and assign each worker a subset of the review pages. That means first computing the total number of review pages. I have confirmed that there is a clear mapping between page number and URL, so each worker can easily be sent to its assigned pages.
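
A rough sketch of how the page assignment could look, using the standard-library `multiprocessing.Pool`. This is only an illustration under stated assumptions: the helper names (`total_pages`, `split_pages`, `page_url`, `scrape_pages`, `scrape_all`) are hypothetical, the `_P{page}.htm` URL pattern is a placeholder rather than the mapping confirmed above, and the actual scraping step is stubbed out.

```python
import math
from multiprocessing import Pool

REVIEWS_PER_PAGE = 10  # from the issue: each page holds 10 reviews


def total_pages(total_reviews, per_page=REVIEWS_PER_PAGE):
    """Number of pages needed to show total_reviews reviews."""
    return math.ceil(total_reviews / per_page)


def split_pages(n_pages, n_workers):
    """Deal page numbers 1..n_pages round-robin into n_workers lists."""
    pages = range(1, n_pages + 1)
    return [list(pages[i::n_workers]) for i in range(n_workers)]


def page_url(base_url, page):
    # ASSUMPTION: placeholder URL pattern for illustration only;
    # substitute the real page-number-to-URL mapping here.
    return f"{base_url}_P{page}.htm"


def scrape_pages(args):
    """Worker: handle one subset of pages (scraping itself is stubbed)."""
    base_url, pages = args
    # A real worker would fetch and parse each URL; this stub only
    # returns the URLs it would have visited.
    return [page_url(base_url, p) for p in pages]


def scrape_all(base_url, total_reviews, n_workers=4):
    """Split all pages across n_workers processes and merge the results."""
    chunks = split_pages(total_pages(total_reviews), n_workers)
    with Pool(n_workers) as pool:
        results = pool.map(scrape_pages, [(base_url, c) for c in chunks])
    return [item for chunk in results for item in chunk]


if __name__ == "__main__":
    # Hypothetical base URL and review count.
    urls = scrape_all("https://example.com/Reviews/Company", total_reviews=95)
    print(len(urls))
```

Round-robin assignment keeps the workers roughly balanced without knowing how long each page takes. Contiguous blocks of pages would work as well if page order matters when merging results.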