When I try to scrape the reviews, the following error occurs:
(Goodreads Scraper) C:\Users\USER\goodreads-scraper>python get_reviews.py --book_ids_path book_id.txt --output_directory_path classic_book_reviews_newest --sort_order newest --browser firefox --format csv
Traceback (most recent call last):
File "C:\Users\USER\anaconda3\envs\Goodreads Scraper\lib\site-packages\geckodriver_autoinstaller\utils.py", line 175, in download_geckodriver
response = urllib.request.urlopen(url)
File "C:\Users\USER\anaconda3\envs\Goodreads Scraper\lib\urllib\request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "C:\Users\USER\anaconda3\envs\Goodreads Scraper\lib\urllib\request.py", line 531, in open
response = meth(req, response)
File "C:\Users\USER\anaconda3\envs\Goodreads Scraper\lib\urllib\request.py", line 640, in http_response
response = self.parent.error(
File "C:\Users\USER\anaconda3\envs\Goodreads Scraper\lib\urllib\request.py", line 569, in error
return self._call_chain(*args)
File "C:\Users\USER\anaconda3\envs\Goodreads Scraper\lib\urllib\request.py", line 502, in _call_chain
result = func(*args)
File "C:\Users\USER\anaconda3\envs\Goodreads Scraper\lib\urllib\request.py", line 649, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "get_reviews.py", line 313, in <module>
main()
File "get_reviews.py", line 276, in main
geckodriver_autoinstaller.install()
File "C:\Users\USER\anaconda3\envs\Goodreads Scraper\lib\site-packages\geckodriver_autoinstaller\__init__.py", line 15, in install
geckodriver_filepath = utils.download_geckodriver(cwd)
File "C:\Users\USER\anaconda3\envs\Goodreads Scraper\lib\site-packages\geckodriver_autoinstaller\utils.py", line 179, in download_geckodriver
raise RuntimeError(f'Failed to download geckodriver archive: {url}')
RuntimeError: Failed to download geckodriver archive: https://github.com/mozilla/geckodriver/releases/download/v0.29.0/geckodriver-v0.29.0-win32.tar.gz
Hi! It looks like perhaps you haven't set up all the necessary dependencies. You can find these listed in the README. You can read more about driver requirements for Selenium here.
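To check the driver dependency, a small sketch like the one below can tell you whether a geckodriver executable is already on your PATH. The 404 in the traceback suggests the autoinstaller requested an archive name that does not exist for that release, so a manually installed driver may be the simpler route. The helper names `locate_driver` and `driver_status` are made up for illustration; they are not part of the scraper or of geckodriver_autoinstaller. The only library call used is `shutil.which` from the standard library.

```python
import shutil
from typing import Callable, Optional


def locate_driver(
    name: str = "geckodriver",
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the full path of the driver executable if it is on PATH, else None.

    `which` is injectable only so the lookup can be exercised without a real
    driver installed; by default it is the standard shutil.which.
    """
    return which(name)


def driver_status(
    name: str = "geckodriver",
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Build a short human-readable message about the driver lookup."""
    path = locate_driver(name, which)
    if path is None:
        return f"{name} not found on PATH; install it manually and add its folder to PATH"
    return f"{name} found at {path}"


if __name__ == "__main__":
    # Hypothetical usage: run this inside the same conda environment as the scraper.
    print(driver_status())
```

If this reports that the driver is not found, downloading the Windows build from the geckodriver releases page and adding its folder to PATH is one way to proceed. Whether the scraper then skips the automatic download is an assumption you would need to confirm against the README.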