Open whalebot-helmsman opened 3 years ago
I was setting up autoextract in scrapy cloud on a project with the crawlera addon, and the autoextract queries were routed through crawlera. The idea is to blacklist the autoextract domain by default. It may make sense for other services too, e.g. splash.
It is possible to implement this without adding new options, e.g. by adding something to https://github.com/scrapy-plugins/scrapy-crawlera/blob/019987f68345079db176405c9f9fbb155ee26f20/scrapy_crawlera/middleware.py#L32
I would also log a warning for the first time it happens during a crawl.
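A rough sketch of how the default blacklist and the one-time warning could fit together. This is not the scrapy-crawlera code: `ProxyBypassFilter`, `DEFAULT_BYPASS_DOMAINS` and `should_bypass` are hypothetical names, and the autoextract hostname is an assumed placeholder that would need to be checked against the real service.

```python
import logging
from urllib.parse import urlparse

logger = logging.getLogger(__name__)

# Assumption: illustrative default only; the real autoextract host may differ.
DEFAULT_BYPASS_DOMAINS = frozenset({"autoextract.scrapinghub.com"})


class ProxyBypassFilter:
    """Hypothetical helper: decides whether a URL should skip the proxy."""

    def __init__(self, bypass_domains=DEFAULT_BYPASS_DOMAINS):
        self.bypass_domains = frozenset(d.lower() for d in bypass_domains)
        self._warned = False  # so the warning is logged only once per crawl

    def should_bypass(self, url):
        host = (urlparse(url).hostname or "").lower()
        # Match the domain itself or any of its subdomains.
        matched = any(
            host == d or host.endswith("." + d) for d in self.bypass_domains
        )
        if matched and not self._warned:
            logger.warning(
                "Request to %s is not routed through crawlera "
                "(host is in the default bypass list)",
                host,
            )
            self._warned = True
        return matched
```

The middleware could call something like `should_bypass(request.url)` before attaching the proxy and leave the request untouched when it returns `True`. Keeping the list as a module-level default means no new option is needed.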