scrapy / scrapy

Scrapy, a fast high-level web crawling & scraping framework for Python.
https://scrapy.org
BSD 3-Clause "New" or "Revised" License

Provide an addon for Broad Crawls #6331

Open kmike opened 2 weeks ago

kmike commented 2 weeks ago

There are common practices for broad crawls, explained here: https://docs.scrapy.org/en/latest/topics/broad-crawls.html. They involve modifying many settings. It seems we could provide a Scrapy addon that applies all of them at once.

It can start as a third-party package, but I think Scrapy itself may also benefit from such an addon.
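
For illustration only, here is a minimal sketch of what such an addon might look like. It is not an agreed design: the class name `BroadCrawlAddon` and the `BROAD_CRAWL_SETTINGS` dict are hypothetical, and the values are a subset of the recommendations on the broad-crawls page as understood here, so they should be checked against that page. The only Scrapy interface it relies on is the addon hook `update_settings(settings)` and `settings.set(name, value, priority=...)`.

```python
# Subset of the values recommended on the broad-crawls documentation page
# (assumed here; verify against the page before relying on them).
BROAD_CRAWL_SETTINGS = {
    "SCHEDULER_PRIORITY_QUEUE": "scrapy.pqueues.DownloaderAwarePriorityQueue",
    "CONCURRENT_REQUESTS": 100,
    "REACTOR_THREADPOOL_MAXSIZE": 20,
    "LOG_LEVEL": "INFO",
    "COOKIES_ENABLED": False,
    "RETRY_ENABLED": False,
    "DOWNLOAD_TIMEOUT": 15,
    "REDIRECT_ENABLED": False,
}


class BroadCrawlAddon:
    """Hypothetical addon that applies the broad-crawl settings in one place."""

    def update_settings(self, settings):
        # Scrapy calls this hook on each enabled addon, passing the
        # crawler's settings object.
        for name, value in BROAD_CRAWL_SETTINGS.items():
            # The "addon" priority is lower than "project", so a value
            # the user puts in settings.py is not overwritten here.
            settings.set(name, value, priority="addon")
```

A project would enable it through the `ADDONS` setting, for example `ADDONS = {"myproject.addons.BroadCrawlAddon": 500}`, where the import path is made up for this example.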

wRAR commented 1 week ago

Are all of those settings safe to enable for all broad crawls?

kmike commented 1 week ago

> Are all of those settings safe to enable for all broad crawls?

I'm not 100% sure; it seems we should review them. We should also ensure that users can override the values the addon sets.
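
One way the override requirement could be met is through settings priorities: if the addon writes its values at the `addon` priority, anything set at the `project` priority wins. The sketch below models only that rule. `ToySettings` is a toy stand-in written for this example, not Scrapy's real settings class, and `BroadCrawlAddon` is a hypothetical name; the numeric levels mirror Scrapy's priorities as understood here.

```python
# Toy stand-in for Scrapy's settings object (NOT the real class). It keeps
# only one rule: a write is ignored when the stored value has a higher priority.
PRIORITIES = {"default": 0, "addon": 15, "project": 20}


class ToySettings:
    def __init__(self):
        self._values = {}  # name -> (value, numeric priority)

    def set(self, name, value, priority="project"):
        level = PRIORITIES[priority]
        current = self._values.get(name)
        # Store the value only if nothing is set yet or the new priority
        # is at least as high as the stored one.
        if current is None or level >= current[1]:
            self._values[name] = (value, level)

    def get(self, name):
        return self._values[name][0]


class BroadCrawlAddon:  # hypothetical addon name
    def update_settings(self, settings):
        settings.set("RETRY_ENABLED", False, priority="addon")
        settings.set("DOWNLOAD_TIMEOUT", 15, priority="addon")


settings = ToySettings()
settings.set("DOWNLOAD_TIMEOUT", 60, priority="project")  # user's settings.py
BroadCrawlAddon().update_settings(settings)

print(settings.get("DOWNLOAD_TIMEOUT"))  # 60: the user's value is kept
print(settings.get("RETRY_ENABLED"))     # False: the addon's value applies
```

With this approach the addon needs no extra configuration surface of its own: users keep changing values the usual way, in `settings.py` or per spider.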