Updates to Media Bias Fact Check labels

N2ITN commented 5 years ago

Status

Assigning to @meddulla

Issue

New bias labels have been added on mediabiasfactcheck.com that aren't handled in the scripts. For example, https://mediabiasfactcheck.com/breitbart/ now carries the label FAILED FACT CHECKS. Right now there is no code to deal with this label when adding the data to the label database, and there could be other new tags as well. These new labels should be incorporated into the current label scheme.

Background

The get_process_data/labels_MBFC.py script scrapes mediabiasfactcheck.com (MBFC) for news domains and their labels.

The get_process_data/join_source_lists.py script takes the labels and narrows them down to a smaller subset, then joins them with the labels from opensources.co. The result is around 3000 news domains and their associated labels.
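To make the join step concrete, here is a minimal sketch, assuming each source has already been reduced to a mapping of domain to a set of labels. The function name, the domains, and the sample labels are all hypothetical and are not taken from join_source_lists.py.

```python
def join_label_sources(mbfc, opensources):
    """Union the label sets of two {domain: set_of_labels} mappings."""
    joined = {}
    for source in (mbfc, opensources):
        for domain, labels in source.items():
            # Create the entry on first sight, then merge in the labels.
            joined.setdefault(domain, set()).update(labels)
    return joined


# Hypothetical sample data, for illustration only.
mbfc = {"example-a.com": {"right", "low"}, "example-b.com": {"center"}}
opensources = {"example-a.com": {"fake"}, "example-c.com": {"satire"}}

joined = join_label_sources(mbfc, opensources)
```

A domain that appears in both sources ends up with the union of its labels, and a domain that appears in only one source is kept as-is.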

Task List

1) Scrape MBFC using labels_MBFC.py and see which tags aren't handled in join_source_lists.py. Labels that are currently accounted for need to be translated to the 17 final labels on line 86 of join_source_lists.py.

2) Since we don't want new output labels, translating the new input labels into the existing scheme will take some interpretation. For example, FAILED FACT CHECKS should probably translate to low (accuracy) in the translation on line 86.
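The two steps above could be sketched roughly as follows. This is only an illustration under assumptions: the translation table, the input label names other than FAILED FACT CHECKS, the domain, and both helper functions are hypothetical and do not reflect the actual contents of line 86 in join_source_lists.py.

```python
# Hypothetical translation table: raw MBFC label -> existing output label.
translation = {
    "RIGHT": "right",
    "LOW": "low",
}


def find_unhandled(scraped, table):
    """Return the sorted raw labels that have no entry in the table."""
    seen = {label for labels in scraped.values() for label in labels}
    return sorted(seen - set(table))


def translate(labels, table):
    """Map raw labels to output labels, skipping anything unhandled."""
    return {table[label] for label in labels if label in table}


# Hypothetical scrape result: domain -> raw labels found on its MBFC page.
scraped = {"example.com": ["RIGHT", "FAILED FACT CHECKS"]}

# Step 1: list the tags the table does not handle yet.
unhandled = find_unhandled(scraped, translation)

# Step 2: fold the new input label into an existing output label,
# using the mapping suggested in the issue.
translation["FAILED FACT CHECKS"] = "low"
final_labels = translate(scraped["example.com"], translation)
```

Mapping new input labels onto existing output labels keeps the set of final labels unchanged, which is the constraint stated in step 2.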

Requirements

Python 3, MongoDB

Thanks

It's a pretty simple task, but much appreciated! Any pull request authors will have their names added to the upcoming contributors page on the site.

meddulla commented 5 years ago

Hi Zach :)

Do you still need help with this? If so, happy to pitch in!

N2ITN commented 5 years ago

@meddulla Yes, this is still useful!

Data4Democracy / are-you-fake-news