New bias labels have been added on mediabiasfactcheck.com that aren't in the scripts.
For example, https://mediabiasfactcheck.com/breitbart/ now carries the label FAILED FACT CHECKS.
Right now no code handles this label when the data is added to the label database, and there may be other new tags as well.
These new labels should be incorporated into the current label scheme.
Background
The get_process_data/labels_MBFC.py script scrapes mediabiasfactcheck.com (MBFC) for news domains and their labels.
The get_process_data/join_source_lists.py script takes the labels and narrows them down to a smaller subset, then joins them with the labels from opensources.co. The result is around 3000 news domains and their associated labels.
Task List
1) Scrape MBFC using labels_MBFC.py and see which tags aren't handled in join_source_lists.py. Labels that are currently accounted for need to be translated to the 17 final labels on line 86 of join_source_lists.py.
2) As we don't want new output labels, there will be some interpretation to translate and incorporate the new input labels into the existing scheme. For example, FAILED FACT CHECKS should probably translate to low (accuracy) in the translation on line 86.
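The two steps above could be sketched roughly as follows. This is a minimal illustration, not the code in join_source_lists.py: the function names, the sample domains, and the translation table are all hypothetical, and the only mapping taken from this issue is the suggested FAILED FACT CHECKS to low. The real table lives on line 86 of join_source_lists.py and may be shaped differently.

```python
from collections import Counter


def find_unhandled_labels(scraped, known_labels):
    """Count scraped labels that have no entry among the known input labels.

    `scraped` maps a domain to the list of label strings found for it.
    Labels are compared case-insensitively after stripping whitespace.
    """
    known = {label.strip().upper() for label in known_labels}
    counts = Counter()
    for labels in scraped.values():
        for label in labels:
            normalized = label.strip().upper()
            if normalized not in known:
                counts[normalized] += 1
    return counts


def translate_labels(labels, translation):
    """Map input labels to output labels, skipping any without a translation.

    Duplicate output labels are collapsed while preserving order.
    """
    table = {key.strip().upper(): value for key, value in translation.items()}
    translated = []
    for label in labels:
        target = table.get(label.strip().upper())
        if target is not None and target not in translated:
            translated.append(target)
    return translated


if __name__ == "__main__":
    # Hypothetical scrape result; real data would come from labels_MBFC.py.
    scraped = {
        "example-a.test": ["Right", "Failed Fact Checks"],
        "example-b.test": ["Left-Center"],
    }
    # Hypothetical stand-in for the translation on line 86.
    translation = {"RIGHT": "right", "LEFT-CENTER": "left-center"}

    # Step 1: report input labels that the translation does not cover yet.
    for label, count in find_unhandled_labels(scraped, translation).items():
        print(f"unhandled: {label} ({count} domain(s))")

    # Step 2: fold the new input label into an existing output label.
    translation["FAILED FACT CHECKS"] = "low"
    print(translate_labels(scraped["example-a.test"], translation))
```

Reporting unhandled labels with counts, rather than silently dropping them, makes it easier to spot any other tags MBFC has added since the scripts were written.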
Requirements:
Python 3, MongoDB
Thanks
It's a pretty simple task, but much appreciated! Any pull request authors will have their names added to the upcoming contributors page on the site.
Status
Assigning to @meddulla