Team 3 COVID-19 data challenge
This repo contains Team 3's work on the socqe data challenge.
Original dataset obtained through webhose.io: DATASET3: ENGLISH NEWS ARTICLES THAT MENTION "CORONA VIRUS" OR "CORONAVIRUS" OR "COVID" (BY WEBHOSE.IO) Link: https://webhose.io/free-datasets/news-articles-that-mention-corona-virus/
AllSides data (bias data) obtained using work by harry-wood and sautumn: https://github.com/harry-wood/AllSides-Scraper/tree/update-scraper
NOTE: you must update the paths, including the input and output paths, for the scripts to work.
Order of the scripts used to clean the data:
rename.rb
clean.rb
create_table.sql (run command by command in the psql UNIX utility)
To get access to the transformed main data, you should use the csv data.
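
The cleaning order listed above can be sketched as a small Ruby helper. This is only an illustration, not part of the repo: the `STEPS` constant and the `run_steps` method are hypothetical names, and the sketch assumes that rename.rb and clean.rb take no command-line arguments because their paths are edited inside the files. It defaults to a dry run, so nothing is executed unless you ask for it.

```ruby
# Hypothetical helper (not part of this repo): walks through the cleaning
# steps in the order given in this README.
STEPS = [
  # Assumption: both Ruby scripts take no arguments; their input and
  # output paths are edited in the files themselves.
  { name: "rename.rb",        command: ["ruby", "rename.rb"] },
  { name: "clean.rb",         command: ["ruby", "clean.rb"] },
  # create_table.sql is run by hand, command by command, inside psql,
  # so there is no command to execute here.
  { name: "create_table.sql", command: nil },
].freeze

# Returns one status line per step. With dry_run: true (the default)
# nothing is executed; the commands are only described.
def run_steps(steps, dry_run: true)
  steps.map do |step|
    if step[:command].nil?
      "manual: open psql and run #{step[:name]} command by command"
    elsif dry_run
      "would run: #{step[:command].join(' ')}"
    else
      # Kernel#system returns true only when the command exits with status 0.
      system(*step[:command]) or raise "#{step[:name]} failed"
      "ran: #{step[:command].join(' ')}"
    end
  end
end

puts run_steps(STEPS)
```

Calling `run_steps(STEPS, dry_run: false)` would run the two Ruby scripts for real and stop at the first failure, which keeps clean.rb from running on data that rename.rb did not finish preparing.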