djytwy opened this issue 6 years ago
You need to link squid to phantomjs and chromium; otherwise only plain HTTP requests will go through the proxy.
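For example, with plain `docker run` the link might look like this (a sketch only; the container names and the squid image are placeholders, not from this thread):

```
# Start squid first, then link the phantomjs container to it so the
# phantomjs fetcher can reach the proxy by the hostname "squid".
docker run -d --name squid my-squid-image
docker run -d --name phantomjs --link squid:squid binux/pyspider:latest phantomjs
```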
Thanks for your answer! But I still have some questions about squid:
1. Will squid use every proxy listed in the settings? If so, it must be very slow: when a usable proxy lives for only three minutes among many dead ones, trying them all wastes a lot of time. But if peers.conf contains only one proxy, requests are fast, and then squid itself seems pointless?
3. How do I get squid to provide a proxy for pyspider under Docker? This is squid in my docker network:

And this is my test code:
```python
#!/usr/bin/env python
# -*- encoding: utf-8 -*-
# Created on 2018-08-21 09:39:07
# Project: test_proxy

from pyspider.libs.base_handler import *


class Handler(BaseHandler):
    # Route every request through the squid container's proxy port
    crawl_config = {
        'proxy': '172.17.0.11:6666'
    }

    @every(minutes=24 * 60)
    def on_start(self):
        self.crawl('https://www.917.com/', callback=self.detail_page)

    @config(priority=2)
    def detail_page(self, response):
        return {
            "url": response.url,
            "title": response.doc('title').text(),
        }
```
These are my steps: first I get a live proxy, then I write it into peers.conf (so at that moment peers.conf holds only that one proxy) and reload squid inside the container. Everything looks ready, but the crawl still fails with a timeout. Thanks again!!! :blush:
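For reference, the reload and a quick end-to-end check could look like this (a sketch; the container name `squid` is an assumption, not taken from the thread):

```
# Ask the running squid to re-read squid.conf / peers.conf without restarting
docker exec squid squid -k reconfigure

# Verify the proxy end to end from the docker host
curl -x http://172.17.0.11:6666 -I https://www.917.com/
```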
I run pyspider with Docker, and I want pyspider to make its requests through a proxy. How can I run squid under Docker so that it provides that proxy to pyspider?
This is my docker-compose.yml:
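(The original file was not captured; below is a minimal sketch of what such a setup might look like. The image names, service names, and mounted paths are assumptions, not the author's actual file.)

```yaml
version: '2'
services:
  squid:
    image: sameersbn/squid          # assumed image; any squid image will do
    volumes:
      - ./squid.conf:/etc/squid/squid.conf
      - ./peers.conf:/etc/squid/peers.conf
  phantomjs:
    image: binux/pyspider:latest
    command: phantomjs
    links:
      - squid                       # linked so JS requests can use the proxy too
```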
As in pyspider-demo, the scheduler runs in its own container, separate from the other components.
172.17.0.1 is the Docker network gateway.
This is my squid.conf:
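(Again, the original file was not captured; a minimal forwarding-only squid.conf might look like the sketch below. The port and ACL range are assumptions.)

```
# Listen on the port pyspider points at
http_port 6666

# Never fetch directly; always forward to a parent proxy from peers.conf
never_direct allow all
include /etc/squid/peers.conf

# Allow clients from the default docker bridge network only
acl docker_net src 172.17.0.0/16
http_access allow docker_net
http_access deny all

# This box is a forwarder, not a cache
cache deny all
```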
And my peers.conf:
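(The actual entries are not shown; a peers.conf with one `cache_peer` line per upstream proxy might look like this. The addresses and ports are placeholders. With `round-robin`, squid picks one live parent per request instead of walking the whole list, which bears on question 1 above.)

```
# One line per upstream proxy; squid selects a parent per request.
# 'no-query' disables ICP probing, 'round-robin' rotates among live parents.
cache_peer 10.0.0.2 parent 8080 0 no-query round-robin
cache_peer 10.0.0.3 parent 8080 0 no-query round-robin
```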