Closed: john-hu closed this issue 2 years ago
We unlock the URL whenever we get an error. We should handle each error type one by one instead. For example:
2021-12-27 23:47:03 [scrapy.extensions.logstats] INFO: Crawled 12 pages (at 2 pages/min), scraped 0 items (at 0 items/min)
2021-12-27 23:47:29 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.101cookbooks.com/archives/saffron-pasta-salad-recipe.html/055305273X> (referer: None)
2021-12-27 23:47:29 [peeler.scrapy_utils.spiders.base] ERROR: <twisted.python.failure.Failure scrapy.spidermiddlewares.httperror.HttpError: Ignoring non-200 response>
Please don't forget to increase `fetched_count` while handling errors.
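A minimal sketch of what handling the failures one by one could look like in the spider's errback. `unlock_url` and `fetched_count` are hypothetical names standing in for whatever the real base spider in peeler.scrapy_utils.spiders.base uses, and the per-type unlock policy (e.g. not unlocking on HTTP errors like the 404 above) is only illustrative:

```python
import scrapy
from scrapy.spidermiddlewares.httperror import HttpError
from twisted.internet.error import DNSLookupError, TCPTimedOutError, TimeoutError


class BaseSpider(scrapy.Spider):
    name = 'base'

    def start_requests(self):
        for url in self.start_urls:
            # Route every failure through the errback so each error type
            # gets its own handling instead of a wholesale unlock.
            yield scrapy.Request(url, callback=self.parse, errback=self.on_error)

    def on_error(self, failure):
        # self.fetched_count and self.unlock_url are assumed names for this
        # project's real counter and URL-lock helper.
        # Count the attempt no matter what failed, so crawl statistics
        # stay accurate.
        self.fetched_count += 1

        if failure.check(HttpError):
            # Non-200 response, like the 404 in the log above. The URL was
            # actually fetched, so unlocking it for a retry may be pointless.
            response = failure.value.response
            self.logger.warning('HTTP %s on %s', response.status, response.url)
        elif failure.check(DNSLookupError):
            # Hostname did not resolve; unlock so a later run can retry.
            self.logger.warning('DNS lookup failed for %s', failure.request.url)
            self.unlock_url(failure.request.url)
        elif failure.check(TimeoutError, TCPTimedOutError):
            # Transient network timeout; unlock so the URL can be retried.
            self.logger.warning('Timeout on %s', failure.request.url)
            self.unlock_url(failure.request.url)
        else:
            # Anything unexpected: log it and unlock for a later retry.
            self.logger.error('Unhandled failure on %s: %r',
                              failure.request.url, failure)
            self.unlock_url(failure.request.url)
```

The `failure.check(...)` / `failure.value.response` pattern follows the standard Scrapy errback idiom, so each branch can decide separately whether the URL should stay locked.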