add timeout to urlopen. For some reason urlopen may block for a while.

istresearch / scrapy-cluster

This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster.

http://scrapy-cluster.readthedocs.io/

MIT License


add timeout to urlopen. For some reason urlopen may block for a while. #216

Closed mooosu closed 3 years ago

mooosu commented 5 years ago

Requests from China to 'http://ip.42.pl/raw' are often blocked for quite a while. When update_ipaddress blocks on such a request, the spider hangs.
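For illustration, a minimal sketch of the kind of change proposed here, assuming Python 3 and the standard `urllib.request.urlopen`. The helper name `fetch_public_ip`, the 5-second default, the `"unknown"` fallback, and the `opener` parameter are hypothetical and not taken from the scrapy-cluster code; only the URL and the `IP_ADDR_REQUEST_TIMEOUT` name appear in this thread.

```python
import urllib.request

# Assumed default in seconds; the real value would come from the project settings.
IP_ADDR_REQUEST_TIMEOUT = 5


def fetch_public_ip(url="http://ip.42.pl/raw",
                    timeout=IP_ADDR_REQUEST_TIMEOUT,
                    fallback="unknown",
                    opener=urllib.request.urlopen):
    """Return the public IP reported by `url`, or `fallback` on failure.

    Passing `timeout` makes urlopen raise instead of blocking indefinitely
    when the remote service is slow or unreachable.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.read().decode("utf-8").strip()
    except OSError:
        # urllib.error.URLError and socket.timeout are both OSError subclasses,
        # so this covers connection errors and timeouts alike.
        return fallback
```

With a timeout in place, a slow lookup costs at most `timeout` seconds per attempt and the caller can keep its previous IP value rather than hang.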

coveralls commented 5 years ago

Coverage decreased (-0.02%) to 65.728% when pulling 909c6ec98f5d0f1617b358ce788712dfcbf635f8 on mooosu:master into 13aaed2349af5d792d6bcbfcadc5563158aeb599 on istresearch:master.

madisonb commented 5 years ago

Please scrap these changes and remake them against the dev branch. Also, please add documentation for the IP_ADDR_REQUEST_TIMEOUT setting (and what it does) here

andrewkcarter commented 4 years ago

bump

madisonb commented 3 years ago

Closing due to inactivity.