osome-iu / hoaxy-backend

Backend component for Hoaxy, a tool to visualize the spread of claims and fact checking
http://hoaxy.iuni.iu.edu/
GNU General Public License v3.0

Article parser title and date #34

Closed filmenczer closed 5 years ago

filmenczer commented 5 years ago

We need to check why newspaper3k and Mercury are failing to retrieve titles (and dates) when parsing articles such as https://www.snopes.com/fact-check/ocasio-cortez-price-is-right/

Then we will need to re-parse the articles that currently have a null title (and date?) in the DB.

chathuriw commented 5 years ago

If newspaper3k does not return a title, the parser now falls back to the Mercury parser. After making that change, I set status_code in the URL table to 40 for articles that have empty titles and re-parsed those articles.
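
For anyone reading later, here is a minimal sketch of the fallback idea described above. It is not the actual hoaxy-backend code. The function name `parse_with_fallback` and the stub parsers are hypothetical, and each parser is assumed to be a callable that takes a URL and returns a dict with `title` and `date` keys. In practice the two callables would wrap newspaper3k and Mercury.

```python
from typing import Callable, Dict, Iterable, Optional

# Assumed parser shape (hypothetical): takes a URL, returns {"title": ..., "date": ...}.
ParsedArticle = Dict[str, Optional[str]]
Parser = Callable[[str], ParsedArticle]


def parse_with_fallback(url: str, parsers: Iterable[Parser]) -> ParsedArticle:
    """Try each parser in order and return the first result with a non-empty title."""
    for parser in parsers:
        try:
            parsed = parser(url)
        except Exception:
            # A failing parser should not stop the chain; move on to the next one.
            continue
        title = (parsed.get("title") or "").strip()
        if title:
            return {"title": title, "date": parsed.get("date")}
    # No parser produced a title; the caller can flag the URL for re-parsing.
    return {"title": None, "date": None}


# Stub parsers standing in for the real newspaper3k and Mercury wrappers.
def empty_title_parser(url: str) -> ParsedArticle:
    return {"title": "", "date": None}


def working_parser(url: str) -> ParsedArticle:
    return {"title": "Example title", "date": "2019-01-01"}


result = parse_with_fallback(
    "https://example.org/article", [empty_title_parser, working_parser]
)
print(result)  # {'title': 'Example title', 'date': '2019-01-01'}
```

When every parser comes back without a title, the function returns a `None` title, which is the case where the URL row would be marked (status_code 40 in the comment above) so it can be picked up for re-parsing.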