Dear Dror Ayalon!
First of all, thank you for the app!
I am trying to make it work, but sadly without much success so far. After downloading the original files and following your usage instructions, I get this error:
File "scraper.py", line 89
SyntaxError: Non-ASCII character '\xf0' in file scraper.py on line 89, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details
If started with python3 sometimes it works, but if I add new domains to the list it seems to crash on some.
/Downloads/news_scraper-master$ python3 scraper.py
Traceback (most recent call last):
File "scraper.py", line 42, in <module>
web_content = requests.get(website)
File "/home/trinidad/.local/lib/python3.6/site-packages/requests/api.py", line 70, in get
return request('get', url, params=params, **kwargs)
File "/home/trinidad/.local/lib/python3.6/site-packages/requests/api.py", line 56, in request
return session.request(method=method, url=url, **kwargs)
File "/home/trinidad/.local/lib/python3.6/site-packages/requests/sessions.py", line 474, in request
prep = self.prepare_request(req)
File "/home/trinidad/.local/lib/python3.6/site-packages/requests/sessions.py", line 407, in prepare_request
hooks=merge_hooks(request.hooks, self.hooks),
File "/home/trinidad/.local/lib/python3.6/site-packages/requests/models.py", line 302, in prepare
self.prepare_url(url, params)
File "/home/trinidad/.local/lib/python3.6/site-packages/requests/models.py", line 382, in prepare_url
raise MissingSchema(error)
requests.exceptions.MissingSchema: Invalid URL '': No schema supplied. Perhaps you meant http://?
If I don't specify python3 (so the Python 2.7 from the env folder is used), I get the same error:
python scraper.py
Traceback (most recent call last):
File "scraper.py", line 45, in <module>
web_content = requests.get(website)
File "/home/trinidad/Downloads/news_scraper-master/env/lib/python2.7/site-packages/requests/api.py", line 70, in get
return request('get', url, params=params, **kwargs)
File "/home/trinidad/Downloads/news_scraper-master/env/lib/python2.7/site-packages/requests/api.py", line 56, in request
return session.request(method=method, url=url, **kwargs)
File "/home/trinidad/Downloads/news_scraper-master/env/lib/python2.7/site-packages/requests/sessions.py", line 474, in request
prep = self.prepare_request(req)
File "/home/trinidad/Downloads/news_scraper-master/env/lib/python2.7/site-packages/requests/sessions.py", line 407, in prepare_request
hooks=merge_hooks(request.hooks, self.hooks),
File "/home/trinidad/Downloads/news_scraper-master/env/lib/python2.7/site-packages/requests/models.py", line 302, in prepare
self.prepare_url(url, params)
File "/home/trinidad/Downloads/news_scraper-master/env/lib/python2.7/site-packages/requests/models.py", line 382, in prepare_url
raise MissingSchema(error)
requests.exceptions.MissingSchema: Invalid URL '': No schema supplied. Perhaps you meant http://?
Another error message with python3:
Traceback (most recent call last):
File "scraper.py", line 57, in <module>
'url': link.get('href').split('?')[0],
AttributeError: 'NoneType' object has no attribute 'split'
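In case it helps narrow things down, here is a minimal sketch of the kind of guards I would guess are missing around those two lines. The helper names is_fetchable and clean_href are my own invention and are not part of scraper.py; the sketch assumes Python 3 and only shows the checks, not the actual scraping. It skips empty or scheme-less entries (the MissingSchema case) and links without an href (the 'NoneType' case).

```python
from urllib.parse import urlparse


def is_fetchable(url):
    """Return True only for non-empty URLs with an http(s) scheme and a host.

    Hypothetical helper: an empty string or a bare domain such as
    'example.com' is what makes requests raise MissingSchema.
    """
    if not url or not url.strip():
        return False
    parsed = urlparse(url.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def clean_href(href):
    """Strip the query string from an href; return None when it is missing.

    Hypothetical helper: link.get('href') returns None for <a> tags
    without an href attribute, so calling .split() on it fails.
    """
    if href is None:
        return None
    return href.split('?')[0]
```

With something like this, the loop could skip any website for which is_fetchable(website) is false before calling requests.get, and skip any link for which clean_href(link.get('href')) is None.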
Could you please let me know if I'm missing something?
Thank you!
Greg