MatthewChatham / headlines

headlines stuff

Incorporate news APIs #8

Open MatthewChatham opened 6 years ago

MatthewChatham commented 6 years ago

We need to investigate the APIs available to us.

NYTimes: https://developer.nytimes.com. The Article Search API lets you set date ranges and search terms, and it returns WAY more than I'm currently scraping: https://developer.nytimes.com/article_search_v2.json#/Documentation/GET/articlesearch.json

General API: https://newsapi.org/

News API looks like the best bet.
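
For reference, here is a rough sketch of what a date-range plus search-term query against the NYTimes Article Search API might look like. The endpoint and the parameter names (`q`, `begin_date`, `end_date`, `page`, `api-key`) are my reading of the NYTimes docs and should be checked against them; `build_nyt_search_url` is just a hypothetical helper name, and the API key is a placeholder. It only builds the URL, so it runs without a key or a network connection.

```python
from urllib.parse import urlencode

# Assumed endpoint for the NYTimes Article Search API (v2); verify against the docs.
NYT_SEARCH_URL = "https://api.nytimes.com/svc/search/v2/articlesearch.json"


def build_nyt_search_url(query, begin_date, end_date, api_key, page=0):
    """Build an Article Search URL for `query` between two YYYYMMDD dates.

    Hypothetical helper: parameter names are assumed from the NYTimes docs.
    """
    params = {
        "q": query,                # search terms
        "begin_date": begin_date,  # start of date range, formatted YYYYMMDD
        "end_date": end_date,      # end of date range, formatted YYYYMMDD
        "page": page,              # zero-based page of results
        "api-key": api_key,        # placeholder; a real key is required to fetch
    }
    return f"{NYT_SEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Print the URL only; fetching it would need a real API key.
    print(build_nyt_search_url("election", "20180101", "20180131", "YOUR_API_KEY"))
```

The resulting URL could then be fetched with any HTTP client and the JSON response stored as dataset rows.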

MatthewChatham commented 6 years ago

Is there a reason to use scraping over an API, other than the ability to practice and show off some skills?

Probably not, if the only goal is to produce a machine learning dataset. If I wanted to study some questions about the behavior of these publications' online outlets it would make sense to scrape them (e.g., to learn how long articles tend to stay up).