scrapy / scrapely

A pure-python HTML screen-scraping library
Unable to pull in https #110

Open JohnMTrimbleIII opened 6 years ago

JohnMTrimbleIII commented 6 years ago

I'm trying to follow the intro documentation. I changed the training URL to an https one and got the following:

Traceback (most recent call last):
  File "", line 1, in <module>
  File "/usr/local/lib/python3.6/site-packages/scrapely/__init__.py", line 48, in train
    page = url_to_page(url, encoding)
  File "/usr/local/lib/python3.6/site-packages/scrapely/htmlpage.py", line 183, in url_to_page
    fh = urlopen(url)
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 532, in open
    response = meth(req, response)
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 642, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 570, in error
    return self._call_chain(*args)
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 650, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 503: Service Unavailable

I'm using Python 3.6.5.

URL was:

https://www.amazon.com/Xbox-One-X-1TB-Console/dp/B074WPGYRF/ref=sr_1_3?s=videogames&ie=UTF8&qid=1524486645&sr=1-3&keywords=xbox%2Bone%2Bx&th=1
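One way to narrow this down might be to call urllib directly, since the traceback ends inside `urlopen`. The sketch below is not part of scrapely. The names `probe` and `opener` are made up for illustration, and `opener` is a parameter only so the function can be tried offline. It separates "the server answered with an error status" (`HTTPError`, as in the 503 above) from "no HTTP response at all" (`URLError`, which would cover certificate or connection problems).

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def probe(url, opener=urlopen):
    """Fetch `url` and report how the request ended.

    `opener` defaults to urllib's urlopen, the call shown in the traceback.
    It is a hypothetical hook so the function can be exercised without a network.
    """
    try:
        response = opener(url)
    except HTTPError as exc:
        # The server was reached (including any TLS handshake) and replied
        # with an error status such as 503. HTTPError must be caught first
        # because it is a subclass of URLError.
        return ("http_error", exc.code)
    except URLError as exc:
        # No HTTP response at all: DNS failure, refused connection,
        # certificate verification problem, and so on.
        return ("connection_error", str(exc.reason))
    with response:
        return ("ok", response.getcode())
```

Calling `probe` on the URL above, and on an https page from another site, would show whether the failure follows the scheme or the particular server.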

Thanks!