k1995 / BaiduyunSpider

Baidu Cloud (Baiduyun) network-disk search engine, including the crawler & website
1.15k stars 479 forks

Crawler proxy IP, still getting errors #7

Open 90house opened 7 years ago

90house commented 7 years ago

I modified the code to use a proxy IP, but it still fails with:

uk:2518160999 error to fetch files,try again later

getShareLists errno:-55

The code is as follows:

    def getHtml(url, ref=None, reget=5):
        try:
            proxies = {'http': '222.194.14.130:808'}
            proxy_support = urllib2.ProxyHandler(proxies)
            opener = urllib2.build_opener(proxy_support, urllib2.HTTPHandler)
            # define the opener
            # urllib2.install_opener(opener)
            request = urllib2.Request(url)
            request.add_header('User-Agent', 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36')
            if ref:
                request.add_header('Referer', ref)
            page = urllib2.urlopen(request, timeout=10)
            html = page.read()
        except:
            if reget >= 1:
                # if getHtml fails, retry (up to 5 times)
                print 'getHtml error,reget...%d' % (6 - reget)
                time.sleep(2)
                return getHtml(url, ref, reget - 1)
            else:
                print 'request url:' + url
                print 'failed to fetch html'
                exit()
        else:
            return html
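Note that in the snippet above `urllib2.install_opener(opener)` is commented out, so the explicitly configured proxy is never attached to `urlopen`: the opener is built but never installed or used. A minimal sketch of wiring the proxy in, written against Python 3's `urllib.request` (the successor to `urllib2`); the proxy address is the one from the report and is only a placeholder, not a working proxy:

```python
import urllib.request


def build_proxy_opener(proxy_addr):
    """Build an opener that routes plain-HTTP requests through proxy_addr,
    and install it so that urllib.request.urlopen uses it globally."""
    proxy_support = urllib.request.ProxyHandler({'http': proxy_addr})
    opener = urllib.request.build_opener(proxy_support, urllib.request.HTTPHandler)
    # The key step missing in the original snippet: without install_opener
    # (or calling opener.open() directly), urlopen ignores the proxy.
    urllib.request.install_opener(opener)
    return opener


# Placeholder proxy address taken from the issue report.
opener = build_proxy_opener('222.194.14.130:808')
```

After this, `urllib.request.urlopen(request, timeout=10)` goes through the proxy; alternatively, skip `install_opener` and call `opener.open(request, timeout=10)` explicitly. Whether this fixes errno -55 is a separate question: that error comes from Baidu's side and usually indicates the requests are being rejected regardless of transport.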