Is Your Favorite News Source Biased? Using Machine Learning to Identify Bias in Media

[ ID ] a85619a4-393f-4033-aac3-80737ba5d43e

[ Submitter's Name ] Aram Baghdassarian

[ Space ] journalism [ Secondary Space ] science

[ Format ] demo, learning-lab, hands-on

Description

Our session will teach people to use sentiment analysis, an application of machine learning to text, to check whether supposedly neutral news sources are in fact biased toward a particular point of view. We will begin by using machine learning to classify articles from news sources as positive, neutral, or negative. If time remains after this first activity, we will move on to detecting whether articles carry emotional tones such as tension, depression, anger, hostility, vigor, and fatigue, all with open-source tools. The session will make participants aware of which media sources do not give fair accounts of the news while also teaching them valuable machine learning skills.
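To make the idea of classifying text as positive, neutral, or negative concrete, here is a minimal sketch of a word-counting classifier. It uses a Naive Bayes approach, chosen here only for illustration and not necessarily the method used in the session. The six training sentences and the names `TRAINING`, `train`, and `classify` are hypothetical; a real run would use thousands of labeled sentences.

```python
import math
from collections import Counter, defaultdict

# Hypothetical, hand-written training sentences (assumption for illustration).
TRAINING = [
    ("the policy is a great success", "positive"),
    ("voters love the excellent plan", "positive"),
    ("the policy is a terrible failure", "negative"),
    ("critics hate the awful plan", "negative"),
    ("the council met on tuesday", "neutral"),
    ("the vote is scheduled for march", "neutral"),
]

def train(samples):
    """Count words per label and how often each label occurs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for sentence, label in samples:
        words = sentence.lower().split()
        word_counts[label].update(words)
        label_counts[label] += 1
        vocab.update(words)
    return word_counts, label_counts, vocab

def classify(sentence, model):
    """Return the label with the highest Naive Bayes log-probability."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Start from the prior: how common this label is in the training data.
        score = math.log(label_counts[label] / total)
        n_words = sum(word_counts[label].values())
        for word in sentence.lower().split():
            # Add-one smoothing so an unseen word does not zero out the score.
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(TRAINING)
print(classify("an excellent success", model))
print(classify("a terrible plan that critics hate", model))
```

With so little training data the classifier only reacts to words it has already seen, which is one reason a large set of training sentences matters.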

Agenda

We will begin by asking the participants for a media source that they would like to scrutinize. We will then choose an article from that source that both interests the participants and has potential for bias (politics, social issues, etc.). After we explain how a computer can learn to identify patterns from the sample data it is fed, we will provide the participants with thousands of training sentences. We will explain what this training data consists of, how they can create it themselves, and why each sample consists of a single sentence rather than a whole article. Once all participants have the training data, we will discuss what Natural Language Text Processing (NLTP) is and how it relates to our goal. Using text-processing.com’s API for NLTP, we will guide the participants through creating a machine learning program in Python for our task.
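As a rough sketch of what such a Python program might look like, the code below splits an article into sentences, sends each one to a sentiment endpoint, and tallies the labels. The endpoint URL, the `text` form field, and the `label` field in the JSON reply are assumptions based on text-processing.com's public documentation and should be checked against the current docs. The helper names `split_sentences`, `fetch_label`, and `tally_article` are hypothetical.

```python
import json
import re
from collections import Counter
from urllib import parse, request

# Assumed endpoint; verify against text-processing.com's documentation.
API_URL = "http://text-processing.com/api/sentiment/"

def split_sentences(article):
    """Naively split an article on ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", article.strip())
    return [p for p in parts if p]

def fetch_label(sentence, url=API_URL):
    """POST one sentence as form data and return the reply's 'label' field.

    Assumes the service answers with JSON containing a 'label' key.
    """
    data = parse.urlencode({"text": sentence}).encode("utf-8")
    with request.urlopen(url, data=data, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))["label"]

def tally_article(article, classify=fetch_label):
    """Classify every sentence and count how many fall under each label."""
    return Counter(classify(s) for s in split_sentences(article))
```

Calling `tally_article(article_text)` would send one request per sentence, so a long article may run into the service's rate limits. Passing a different function as `classify` lets participants try the tally offline with their own classifier.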

Participants

With three or fewer participants, we will be able to work closely with everyone as a single group. With more than three, the participants will divide into small groups of 2-4 people. The main advantage of groups in an activity such as this is the help that teammates can offer when a team member encounters a problem. Without teams, it would be time-consuming and unproductive for the session leader to go to every individual and solve their problem. While all requests for help will be addressed no matter what, teams will reduce the number of requests made and consequently lead to a more productive and information-filled session.

Outcome

At the end of the session and festival, participants will have learned three things:

1. Whether or not a news source contains bias in its articles.
2. How Natural Language Text Processing can be used to dissect sentences and break them down into a form that computers can read.
3. How machine learning works, from analyzing the training data and performing backpropagation during training to making predictions with the trained model.

Outcome #1 is a fun and entertaining benefit that participants can apply to other news sources after the session. Outcome #2 can be used in a variety of other applications: Apple’s Siri, Microsoft’s Cortana, Facebook Messenger’s chat bots, and other services all use NLTP to interpret user data and respond like a human. Outcome #3 will prove to be an invaluable experience since machine learning is becoming more and more prominent, from recognizing handwritten addresses at mail centers to guiding Google’s self-driving car.

MozillaFoundation / mozfest-program-2016