NikolaiT / GoogleScraper

A Python module to scrape several search engines (like Google, Yandex, Bing, Duckduckgo, ...). Including asynchronous networking support.
https://scrapeulous.com/
Apache License 2.0

TypeError in after_parsing() #115

Open neuegram opened 9 years ago

neuegram commented 9 years ago
  File "C:\Python34\lib\site-packages\GoogleScraper\core.py", line 358, in main
    scrape_jobs = parse_all_cached_files(scrape_jobs, session, scraper_search)
  File "C:\Python34\lib\site-packages\GoogleScraper\caching.py", line 413, in parse_all_cached_files
    serp = parse_again(fname, job['search_engine'], job['scrape_method'], job['query'])
  File "C:\Python34\lib\site-packages\GoogleScraper\caching.py", line 443, in parse_again
    query=query
  File "C:\Python34\lib\site-packages\GoogleScraper\parsing.py", line 1004, in parse_serp
    parser.parse(html)
  File "C:\Python34\lib\site-packages\GoogleScraper\parsing.py", line 126, in parse
    self.after_parsing()
  File "C:\Python34\lib\site-packages\GoogleScraper\parsing.py", line 439, in after_parsing
    if 'No results found for' in self.html or 'did not match any documents' in self.html:
TypeError: 'str' does not support the buffer interface
neuegram commented 9 years ago

The issue can be resolved by changing line 439 of parsing.py as follows:

# Original line:
# if 'No results found for' in self.html or 'did not match any documents' in self.html:
# Working replacement:
if 'No results found for' in str(self.html) or 'did not match any documents' in str(self.html):
    self.no_results = True
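
The TypeError suggests that self.html holds bytes rather than str at this point, since Python 3 does not allow a str substring test against a bytes object. Wrapping the value in str() avoids the exception, but on bytes it yields the repr (b'...') rather than decoded text, so the check probably only matches because both markers are plain ASCII. A minimal sketch of a decoding alternative, assuming the page is UTF-8 encoded; has_no_results is a hypothetical helper and not part of GoogleScraper:

```python
def has_no_results(html):
    """Return True if the page contains one of the 'no results' markers.

    Hypothetical helper for illustration; not part of GoogleScraper.
    """
    # Assumption: html may arrive as raw bytes or as already-decoded str.
    if isinstance(html, bytes):
        # Assumption: UTF-8 encoding. errors='replace' substitutes
        # undecodable bytes instead of raising UnicodeDecodeError.
        html = html.decode('utf-8', errors='replace')
    markers = ('No results found for', 'did not match any documents')
    return any(marker in html for marker in markers)
```

Inside after_parsing() this could then be used as `if has_no_results(self.html): self.no_results = True`, though decoding the response once where it is first read would likely be cleaner.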