michaeljohnclancy / news_scraper

Failed parse management #8

Closed michaeljohnclancy closed 5 years ago

michaeljohnclancy commented 5 years ago

We need a better way of dealing with blacklisted URLs. Most sources publish their articles in a standardised layout, but some pages are one-offs (e.g. an 'article' with just a video on the page) that our parser does not handle properly.

We need a way of catching and logging the articles that fail parsing.
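
One possible shape for this, as a rough sketch only: wrap each parse call, log the failure, and collect the failing URL so it can be reviewed or blacklisted later. The names `parse_articles`, `parse_article`, and the logger name `news_scraper.failed_parse` are hypothetical and not taken from the repository. The sketch also assumes the project is written in Python and that the parser signals failure by raising an exception.

```python
import logging
from typing import Any, Callable, Dict, Iterable, List, Tuple

# Hypothetical logger name; the real project may configure logging differently.
logger = logging.getLogger("news_scraper.failed_parse")


def parse_articles(
    urls: Iterable[str],
    parse_article: Callable[[str], Any],
) -> Tuple[Dict[str, Any], List[Tuple[str, str]]]:
    """Parse each URL, catching and logging failures instead of stopping.

    `parse_article` is a stand-in for whatever function the scraper uses
    to parse one article; it is assumed to raise on a failed parse.
    Returns (parsed, failed): `parsed` maps URL -> parse result, and
    `failed` lists (URL, error description) pairs.
    """
    parsed: Dict[str, Any] = {}
    failed: List[Tuple[str, str]] = []
    for url in urls:
        try:
            parsed[url] = parse_article(url)
        except Exception as exc:  # broad on purpose: any parser error counts
            # Record the failure and carry on with the remaining URLs.
            logger.warning("Failed to parse %s: %r", url, exc)
            failed.append((url, repr(exc)))
    return parsed, failed
```

The `failed` list could then be written to a file or database table and used as the input for deciding which URLs to blacklist, rather than blacklisting them by hand.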