Open GoogleCodeExporter opened 8 years ago
I'm sorry, but this is not currently possible. A good bit of work will need to
be done to rebuild how rssdler fetches URLs to build in proxy support. You MAY
be able to pull off a preScanFunction that replaces mechRetrievePage with an
appropriate class. The function and class would look something like this, but
this is completely untested and may not work at all (hopefully the formatting
is not killed):
import mechanize

def set_proxy_browser():
    # preScanFunction: swap the page fetcher for a proxy-aware one
    global mechRetrievePage
    mechRetrievePage = ProxyBrowser(proxy_url='blah.com')

class ProxyBrowser(object):
    # _USER_AGENT is assumed to be the constant rssdler already defines
    def __init__(self, UA=(('User-agent', _USER_AGENT),), proxy_user=None,
                 proxy_pass=None, proxy_url=None, proxy_type='http',
                 proxy_port=None):
        br = mechanize.Browser()
        br.set_handle_robots(False)  # do not fetch or obey robots.txt
        if UA:
            br.addheaders = UA
        if proxy_url:
            if proxy_port:
                proxy_url = '%s:%s' % (proxy_url, proxy_port)
            if proxy_user and proxy_pass:
                proxy_url = '%s:%s@%s' % (proxy_user, proxy_pass, proxy_url)
            br.set_proxies({proxy_type: proxy_url})
        self.br = br

    def __call__(self, url, th=None):
        # th: optional headers that replace the browser's current ones
        if th:
            self.br.addheaders = th
        return self.br.open(url)
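The part of the class most likely to go wrong is how the proxy string is put together, so here is that step on its own as a rough sketch that can be checked without rssdler or mechanize installed. The helper names build_proxy_url and build_proxy_map are made up for illustration and are not part of rssdler; the 'user:pass@host:port' layout is the form that mechanize's set_proxies is expected to accept.

```python
def build_proxy_url(host, port=None, user=None, password=None):
    """Return 'user:password@host:port', leaving out any part not given.

    Hypothetical helper, not part of rssdler or mechanize.
    """
    url = host
    if port:
        url = '%s:%s' % (url, port)
    # Credentials are only added when both halves are present.
    if user and password:
        url = '%s:%s@%s' % (user, password, url)
    return url


def build_proxy_map(host, scheme='http', **kwargs):
    """Return a {scheme: proxy_url} dict of the shape passed to set_proxies."""
    return {scheme: build_proxy_url(host, **kwargs)}


if __name__ == '__main__':
    print(build_proxy_map('blah.com', port=8080, user='me', password='secret'))
```

The resulting dict could then be handed to br.set_proxies(...) in place of the inline string formatting. Special characters in the user name or password would probably need percent-encoding first, which this sketch does not attempt.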
Original comment by lostnihi...@gmail.com on 1 Oct 2009 at 1:03
err, set_robots_text should be set_handle_robots
Original comment by lostnihi...@gmail.com on 1 Oct 2009 at 1:07
Original issue reported on code.google.com by cutler.scott on 16 Sep 2009 at 6:32