Closed by raphapassini 5 years ago
There's a pull request implementing a similar improvement on scrapy-splash: https://github.com/scrapy-plugins/scrapy-splash/pull/214
I'd also add a check to ensure `http://` is used instead of `https://`. I had some trouble setting the Crawlera URL with `https://` :sweat_smile:
Created https://github.com/scrapy-plugins/scrapy-crawlera/pull/81 to fix it. cc @Gallaecio, @raphapassini, @hcoura, @denisgermano
If you inadvertently set your `CRAWLERA_URL` setting without the URL scheme, like `CRAWLERA_URL = "proxy.crawlera.com:8010"`, you'll receive a non-descriptive Twisted exception when trying to crawl `http://` URLs.
I think a good approach would be to detect the missing scheme in `CRAWLERA_URL` and throw a descriptive exception. This can be done on the `spider_opened` signal we listen to in `CrawleraMiddleware`.
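
To make the proposal concrete, here is a rough sketch of what such a check could look like. The function name `validate_crawlera_url` and the error message are hypothetical, not part of scrapy-crawlera, and this is not necessarily how the linked pull request solves it. It only checks that the value contains a `scheme://` prefix.

```python
def validate_crawlera_url(url):
    """Return `url` unchanged if it has an explicit scheme, else raise.

    Hypothetical helper for illustration; not part of scrapy-crawlera.
    """
    # "proxy.crawlera.com:8010" contains no "://", so `sep` comes back empty.
    scheme, sep, rest = url.partition("://")
    if not sep or not scheme or not rest:
        raise ValueError(
            "CRAWLERA_URL must include a URL scheme, for example "
            "'http://proxy.crawlera.com:8010'; got %r" % (url,)
        )
    return url
```

Assuming the middleware already has a handler connected to `spider_opened`, that handler could call this helper on the configured `CRAWLERA_URL` value, so a misconfigured URL fails early with a readable message instead of a Twisted traceback.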