Open protonhs opened 5 years ago
In /Users/mac/PycharmProjects/scrapy/venv/lib/python3.7/site-packages/proxyscrape/scrapers.py, change the functions below:
def _get_proxy_daily_proxies_parse_inner(element, type, source):
    content = element.text
    rows = content.replace('"', '').replace("'", '').split('\n')
    proxies = set()
    for row in rows:
        row = row.strip()
        if len(row) == 0:
            continue
        params = row.split(':')
        params.extend([None, None, None, type, source])
        proxies.add(Proxy(*params))
    return proxies
def get_proxy_daily_http_proxies():
    url = 'http://www.proxy-daily.com'
    response = requests.get(url)
    if not response.ok:
        raise RequestNotOKError()
    try:
        soup = BeautifulSoup(response.content, 'html.parser')
        content = soup.find('div', {'id': 'free-proxy-list'})
        centers = content.find_all('div', {'class': 'centeredProxyList freeProxyStyle'})
        return _get_proxy_daily_proxies_parse_inner(centers[0], 'http', 'proxy-daily-http')
    except (AttributeError, KeyError):
        raise InvalidHTMLError()
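For anyone who wants to see what the parsing step actually produces, the row-splitting logic from `_get_proxy_daily_proxies_parse_inner` can be exercised in isolation. This is a sketch, not the library's code: the `Proxy` namedtuple below is a local stand-in, and its field names (host, port, code, country, anonymous, type, source) are an assumption based on how the fix fills in the tuple, not something confirmed in this thread.

```python
from collections import namedtuple

# Local stand-in for proxyscrape's Proxy namedtuple; real field names may differ.
Proxy = namedtuple('Proxy', ['host', 'port', 'code', 'country',
                             'anonymous', 'type', 'source'])

def parse_proxy_list(content, type, source):
    """Replicates the row-splitting done in _get_proxy_daily_proxies_parse_inner."""
    rows = content.replace('"', '').replace("'", '').split('\n')
    proxies = set()
    for row in rows:
        row = row.strip()
        if not row:                # skip blank lines between entries
            continue
        params = row.split(':')    # e.g. ['1.2.3.4', '8080']
        params.extend([None, None, None, type, source])
        proxies.add(Proxy(*params))
    return proxies

proxies = parse_proxy_list('1.2.3.4:8080\n5.6.7.8:3128\n', 'http', 'proxy-daily-http')
```

So each "host:port" line becomes one Proxy tuple, with the unknown fields left as None.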
@sriramkumar1996 that didn't work for me. However, in the same file, if you go to RESOURCE_TYPE_MAP and comment out 'proxy-daily-http', the whole program works again.
I had the same problem. After replacing the functions I got another error. I'm not sure it will help you, but in scrapers.py the requests library is not imported; adding import requests worked for me.
With the change quoted above (replacing _get_proxy_daily_proxies_parse_inner and get_proxy_daily_http_proxies in /Users/mac/PycharmProjects/scrapy/venv/lib/python3.7/site-packages/proxyscrape/scrapers.py):
this won't work for me. The only thing that helps is commenting out 'PROXY_POOL_ENABLED = True', but then there is no point in using a proxy pool. Has anyone solved this issue?
I installed this repo with pip install git+ and checked which dependencies it pulled in: it installed proxyscrape23. In another project where this repo works, the installed dependency was proxyscrape instead.
I changed proxyscrape23 to proxyscrape and now it works.
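If you hit the same mix-up, a quick way to check whether the proxyscrape module is importable in your environment (before and after the swap) is the standard importlib lookup below; the distribution names proxyscrape23 and proxyscrape come from the comment above, not from the project's own docs.

```python
import importlib.util

# Both distributions expose the same module name, 'proxyscrape';
# only the pip package providing it differs.
spec = importlib.util.find_spec('proxyscrape')
if spec is None:
    print('proxyscrape module not importable -- try: pip install proxyscrape')
else:
    print('proxyscrape found at', spec.origin)
```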
Hello, I'm encountering this error lately:
File "/Users/mac/PycharmProjects/scrapy/venv/lib/python3.7/site-packages/proxyscrape/scrapers.py", line 164, in get_proxy_daily_http_proxies
    return _get_proxy_daily_proxies_parse_inner(centers[0], 'http', 'proxy-daily-http')
IndexError: list index out of range
any help please?
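That IndexError means find_all matched nothing and centers came back empty, which usually means the site's markup changed again. A defensive variant of the lookup, sketched below with the same element names used in the fix earlier in this thread, would raise the library's InvalidHTMLError instead of an IndexError (InvalidHTMLError is stubbed locally here just so the sketch is self-contained):

```python
from bs4 import BeautifulSoup

class InvalidHTMLError(Exception):
    """Local stand-in for proxyscrape's InvalidHTMLError."""

def extract_proxy_div(html):
    # Mirrors the lookup in get_proxy_daily_http_proxies, but fails loudly
    # when the expected markup is missing instead of raising IndexError.
    soup = BeautifulSoup(html, 'html.parser')
    content = soup.find('div', {'id': 'free-proxy-list'})
    if content is None:
        raise InvalidHTMLError('free-proxy-list div not found')
    centers = content.find_all('div', {'class': 'centeredProxyList freeProxyStyle'})
    if not centers:
        raise InvalidHTMLError('no centeredProxyList blocks found')
    return centers[0]
```

This doesn't fix the scraper, but it turns the crash into the error type the rest of the library already expects to handle.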