hellock / icrawler

A multi-thread crawler framework with many builtin image crawlers provided.
http://icrawler.readthedocs.io/en/latest/
MIT License

TypeError: 'NoneType' object is not iterable #116

Henry1887 closed this issue 1 year ago

Henry1887 commented 1 year ago

This is the log output:

2023-07-12 20:33:57,386 - INFO - icrawler.crawler - start crawling...
2023-07-12 20:33:57,386 - INFO - icrawler.crawler - starting 1 feeder threads...
2023-07-12 20:33:57,388 - INFO - feeder - thread feeder-001 exit
2023-07-12 20:33:57,389 - INFO - icrawler.crawler - starting 1 parser threads...
2023-07-12 20:33:57,390 - INFO - icrawler.crawler - starting 1 downloader threads...
2023-07-12 20:33:57,752 - INFO - parser - parsing result page https://www.google.com/search?q=Overlord+Anime&ijn=0&start=0&tbs=&tbm=isch
Exception in thread parser-001:
Traceback (most recent call last):
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\threading.py", line 1038, in _bootstrap_inner
    self.run()
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\threading.py", line 975, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\site-packages\icrawler\parser.py", line 94, in worker_exec
    for task in self.parse(response, **kwargs):
TypeError: 'NoneType' object is not iterable
2023-07-12 20:34:02,404 - INFO - downloader - no more download task for thread downloader-001
2023-07-12 20:34:02,404 - INFO - downloader - thread downloader-001 exit
2023-07-12 20:34:03,394 - INFO - icrawler.crawler - Crawling task done!

And this is my code:

from icrawler.builtin import GoogleImageCrawler

google_crawler = GoogleImageCrawler(storage={'root_dir': 'output'})
google_crawler.crawl(keyword='Overlord Anime', max_num=10)

I'm using the latest version (0.6.7).
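
The traceback ends in the `for task in self.parse(...)` loop, which suggests that `parse` returned `None` for that result page instead of an iterable of tasks. The following is a minimal sketch of that failure mode outside icrawler, not the library's actual code: `parse`, `iter_tasks`, and `reproduce` are hypothetical names used only for illustration, and the response dictionary is a made-up stand-in for a parsed page.

```python
def parse(response):
    """Hypothetical stand-in for a parser (not icrawler's implementation)."""
    if "images" in response:
        return [{"file_url": url} for url in response["images"]]
    # No explicit return on this path, so the function returns None.


def iter_tasks(result):
    """Treat a None result as 'no tasks' instead of letting the loop raise."""
    return result if result is not None else []


def reproduce():
    """Loop over the None result directly and return the error message."""
    try:
        for _task in parse({}):
            pass
    except TypeError as exc:
        return str(exc)
    return None


# Unguarded loop: raises the same TypeError as in the log above.
error_message = reproduce()

# Guarded loops: an empty page yields no tasks, a non-empty one yields its tasks.
guarded_tasks = list(iter_tasks(parse({})))
found_tasks = list(iter_tasks(parse({"images": ["http://example.com/a.jpg"]})))

print(error_message)
print(guarded_tasks)
print(found_tasks)
```

A guard like `iter_tasks` only keeps the thread from crashing. It would not make the crawler find images if the page could not be parsed in the first place.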

ZhiyuanChen commented 1 year ago

Closing as a duplicate of #107.