Open simonb83 opened 7 years ago
We could also look at links in the text to see what other sources they cite.
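As a rough sketch of the link idea, assuming the article body is available as HTML: the snippet below collects the domains an article links out to, using only the Python standard library. The names `LinkCollector` and `cited_domains` are made up for illustration and are not part of any existing code here.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects the host part of every absolute <a href="..."> link."""

    def __init__(self):
        super().__init__()
        self.domains = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                host = urlparse(value).netloc.lower()
                # Relative links have no host, so they are skipped.
                if host:
                    self.domains.append(host)


def cited_domains(html):
    """Return the sorted, de-duplicated domains linked from an article."""
    parser = LinkCollector()
    parser.feed(html)
    return sorted(set(parser.domains))
```

The resulting domain list could then be compared against known sources, though how much weight cited links should carry is still an open question.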
Are you envisioning this kind of score as hard-coded, or would there also be an element of learning from an analyst who verifies the sources?
I'm not really sure yet, but likely some sort of combination.
Probably hard-coded rules at first, to generate a preliminary score that an analyst can then verify and update if need be.
If the 'rules' include some sort of whitelist or blacklist for certain sources, then this could definitely be updated automatically as analysts verify the sources.
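To make that a bit more concrete, here is a minimal sketch of how a rule-based preliminary score and analyst-updated lists might fit together. Everything in it is an assumption for illustration: the names `SourceLists`, `record_review` and `preliminary_score`, the 0 to 1 score range, and the specific weights are placeholders, not a proposed final design.

```python
from dataclasses import dataclass, field


@dataclass
class SourceLists:
    """Whitelist / blacklist of source domains, updated from analyst reviews."""
    trusted: set = field(default_factory=set)
    flagged: set = field(default_factory=set)

    def record_review(self, domain, reliable):
        # An analyst verdict moves the domain to one list and off the other.
        domain = domain.lower()
        if reliable:
            self.trusted.add(domain)
            self.flagged.discard(domain)
        else:
            self.flagged.add(domain)
            self.trusted.discard(domain)


def preliminary_score(domain, cited_count, lists):
    """Hard-coded rules giving a first-pass score between 0 and 1."""
    domain = domain.lower()
    if domain in lists.flagged:
        return 0.0
    if domain in lists.trusted:
        return 1.0
    # Unknown source: start in the middle, with a small capped bonus
    # per cited source (illustrative weights only).
    return min(0.5 + 0.1 * cited_count, 0.9)
```

With this shape, an analyst's review immediately changes the score of every later article from the same source, and the accumulated reviews could later serve as labelled data.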
Later down the line, once there are enough hand-reviewed articles, it would definitely be interesting to apply ML and see which features help distinguish reliable articles from unreliable ones.
It would be good to be able to score an article for reliability in order to help analysts as they analyze and interpret the extracted data. In some cases, news sources may be government-run, 'fake news', or have poor sources or a poor track record, so any data reported by and extracted from these sources should be identifiable as having potential issues.
On the front end, this could include a filter for analysts to use, whereby they can select either all articles or only those with a reliability score above a certain threshold.
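The filter itself could be very simple on the back end. A sketch, assuming each article is a dict with a hypothetical `reliability` key holding its score (or `None` if it has not been scored yet):

```python
def filter_by_reliability(articles, threshold=None):
    """Return all articles, or only those scoring above the threshold."""
    if threshold is None:
        # No threshold selected: the analyst sees everything.
        return list(articles)
    return [
        article
        for article in articles
        if article.get("reliability") is not None
        and article["reliability"] > threshold
    ]
```

Unscored articles are dropped whenever a threshold is set; whether they should be shown instead is a design choice still to be made.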
Some thoughts for implementation include: