The scraper raises an error while trying to download http://www.bbc.co.uk/news/stories?print=true. The error doesn't seem to prevent the rest of the scrape from completing.
$ less /tmp/newsdiffs_logging_errs
2017-11-01 06:48:12.591:ERROR:Unknown exception when updating http://www.bbc.co.uk/news/stories
2017-11-01 06:48:12.592:ERROR:Traceback (most recent call last):
  File "/Users/tech/code/newsdiffs/website/frontend/management/commands/scraper.py", line 414, in update_versions
    update_article(article)
  File "/Users/tech/code/newsdiffs/website/frontend/management/commands/scraper.py", line 321, in update_article
    parsed_article = load_article(article.url)
  File "/Users/tech/code/newsdiffs/website/frontend/management/commands/scraper.py", line 306, in load_article
    parsed_article = parser(url)
  File "/Users/tech/code/newsdiffs/website/frontend/management/commands/parsers/baseparser.py", line 117, in __init__
    self.html = grab_url(self._printableurl())
  File "/Users/tech/code/newsdiffs/website/frontend/management/commands/parsers/baseparser.py", line 45, in grab_url
    return grab_url(url, max_depth-1, opener)
  File "/Users/tech/code/newsdiffs/website/frontend/management/commands/parsers/baseparser.py", line 45, in grab_url
    return grab_url(url, max_depth-1, opener)
  File "/Users/tech/code/newsdiffs/website/frontend/management/commands/parsers/baseparser.py", line 45, in grab_url
    return grab_url(url, max_depth-1, opener)
  File "/Users/tech/code/newsdiffs/website/frontend/management/commands/parsers/baseparser.py", line 45, in grab_url
    return grab_url(url, max_depth-1, opener)
  File "/Users/tech/code/newsdiffs/website/frontend/management/commands/parsers/baseparser.py", line 45, in grab_url
    return grab_url(url, max_depth-1, opener)
  File "/Users/tech/code/newsdiffs/website/frontend/management/commands/parsers/baseparser.py", line 43, in grab_url
    raise Exception('Too many attempts to download %s' % url)
Exception: Too many attempts to download http://www.bbc.co.uk/news/stories?print=true
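
For context, the repeated frames at baseparser.py line 45 suggest that grab_url retries by calling itself with max_depth-1 and gives up at line 43 once the attempts run out. Below is a minimal sketch of that pattern, not the actual newsdiffs code. The starting depth of 5 is inferred from the five repeated frames, and the `fetch` parameter and `failing_fetch` helper are hypothetical stand-ins so the example runs without a network connection (the real function takes an `opener` argument instead).

```python
def grab_url(url, max_depth=5, fetch=None):
    # Hypothetical reconstruction of the recursive retry seen in the
    # traceback; `fetch` replaces the real HTTP request for illustration.
    if max_depth == 0:
        raise Exception('Too many attempts to download %s' % url)
    try:
        return fetch(url)
    except IOError:
        # Retry by recursing with one fewer attempt remaining.
        return grab_url(url, max_depth - 1, fetch)


def failing_fetch(url):
    # Stand-in for a request that never succeeds.
    raise IOError('simulated download failure')


try:
    grab_url('http://www.bbc.co.uk/news/stories?print=true',
             fetch=failing_fetch)
except Exception as exc:
    print(exc)
```

If the real code follows this shape, every retry hits the same failing URL, which would explain why the error is logged once per run while the other articles continue to update.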