TeamHG-Memex / scrapy-rotating-proxies

use multiple proxies with Scrapy
MIT License

retrying request after ban does not respect DOWNLOAD_DELAY and autothrottle #32

Open kakarukeys opened 4 years ago

kakarukeys commented 4 years ago

It seems the retry request is immediately fired with a different proxy. In the log below you can see 6 requests fired within 1 second.

2019-11-30 00:00:53 [rotating_proxies.expire] DEBUG: Proxy <http://51.158.68.133:8811> is DEAD
2019-11-30 00:00:53 [rotating_proxies.middlewares] DEBUG: Retrying <GET https://xx/> with another proxy (failed 1 times, max retries: 5)
2019-11-30 00:00:53 [rotating_proxies.expire] DEBUG: GOOD proxy became DEAD: <http://167.172.140.184:3128>
2019-11-30 00:00:53 [rotating_proxies.middlewares] DEBUG: Retrying <GET https://xx/> with another proxy (failed 2 times, max retries: 5)
2019-11-30 00:00:53 [rotating_proxies.expire] DEBUG: GOOD proxy became DEAD: <http://178.128.85.255:3128>
2019-11-30 00:00:53 [rotating_proxies.middlewares] DEBUG: Retrying <GET https://xx/> with another proxy (failed 3 times, max retries: 5)
2019-11-30 00:00:53 [rotating_proxies.expire] DEBUG: GOOD proxy became DEAD: <http://186.219.183.45:8080>
2019-11-30 00:00:53 [rotating_proxies.middlewares] DEBUG: Retrying <GET https://xx/> with another proxy (failed 4 times, max retries: 5)
2019-11-30 00:00:53 [rotating_proxies.expire] DEBUG: GOOD proxy became DEAD: <http://103.123.246.66:8080>
2019-11-30 00:00:53 [rotating_proxies.middlewares] DEBUG: Retrying <GET https://xx/> with another proxy (failed 5 times, max retries: 5)
2019-11-30 00:00:53 [rotating_proxies.expire] DEBUG: Proxy <http://201.48.61.1:3128> is DEAD
2019-11-30 00:00:53 [rotating_proxies.middlewares] DEBUG: Gave up retrying <GET https://xx/> (failed 6 times with different proxies)
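To make the timing concrete, here is a rough sketch that counts the middleware log lines per one-second timestamp. The LOG string is just the middleware lines copied from the log above, and attempts_per_second is a hypothetical helper name, not part of Scrapy or scrapy-rotating-proxies.

```python
from collections import Counter
from datetime import datetime

# Middleware lines copied from the log above (the rotating_proxies.expire lines are left out).
LOG = """\
2019-11-30 00:00:53 [rotating_proxies.middlewares] DEBUG: Retrying <GET https://xx/> with another proxy (failed 1 times, max retries: 5)
2019-11-30 00:00:53 [rotating_proxies.middlewares] DEBUG: Retrying <GET https://xx/> with another proxy (failed 2 times, max retries: 5)
2019-11-30 00:00:53 [rotating_proxies.middlewares] DEBUG: Retrying <GET https://xx/> with another proxy (failed 3 times, max retries: 5)
2019-11-30 00:00:53 [rotating_proxies.middlewares] DEBUG: Retrying <GET https://xx/> with another proxy (failed 4 times, max retries: 5)
2019-11-30 00:00:53 [rotating_proxies.middlewares] DEBUG: Retrying <GET https://xx/> with another proxy (failed 5 times, max retries: 5)
2019-11-30 00:00:53 [rotating_proxies.middlewares] DEBUG: Gave up retrying <GET https://xx/> (failed 6 times with different proxies)
"""


def attempts_per_second(log_text):
    """Count middleware log lines per timestamp (one-second resolution)."""
    counts = Counter()
    for line in log_text.splitlines():
        if "[rotating_proxies.middlewares]" not in line:
            continue
        # The first 19 characters hold the "YYYY-MM-DD HH:MM:SS" timestamp.
        stamp = datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
        counts[stamp] += 1
    return counts


counts = attempts_per_second(LOG)
for stamp, n in sorted(counts.items()):
    print(stamp, n)
```

With a nonzero DOWNLOAD_DELAY I would expect these failures to be spread over several timestamps rather than collapsing into a single second.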

I could not see exactly how this happens from reading your code. The answer linked below suggests that the new copy of the request should be yielded rather than returned with return. Could that be the cause?

https://stackoverflow.com/questions/28640102/retrying-a-scrapy-request-even-when-receiving-a-200-status-code/28640118#28640118
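For anyone trying to reproduce this, a settings.py along the following lines should be enough. This is only a sketch: the proxy addresses and the delay values are placeholders I made up for illustration, not my actual configuration. The setting names and middleware order values are the ones documented by Scrapy and in this project's README.

```python
# settings.py (sketch; proxy addresses and delay values are placeholders)

# Proxies to rotate through; replace with real ones.
ROTATING_PROXY_LIST = [
    "proxy1.example.com:8000",
    "proxy2.example.com:8031",
]

# Number of retries with a different proxy before giving up
# (5 matches the "max retries: 5" shown in the log).
ROTATING_PROXY_PAGE_RETRY_TIMES = 5

# Middleware order values as given in the scrapy-rotating-proxies README.
DOWNLOADER_MIDDLEWARES = {
    "rotating_proxies.middlewares.RotatingProxyMiddleware": 610,
    "rotating_proxies.middlewares.BanDetectionMiddleware": 620,
}

# Throttling settings that the retried requests appear to ignore.
DOWNLOAD_DELAY = 5  # seconds between requests (assumed value)
AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_START_DELAY = 5  # initial AutoThrottle delay in seconds (assumed value)
```

With throttling configured like this, I would expect at least a few seconds between the retries, but they all show up with the same timestamp.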