Open MatthewChatham opened 6 years ago
Is there a reason to use scraping over an API, other than the ability to practice and show off some skills?
Probably not, if the only goal is to produce a machine learning dataset. If I wanted to study some questions about the behavior of these publications' online outlets it would make sense to scrape them (e.g., to learn how long articles tend to stay up).
We need to investigate the APIs available to us.
NYTimes: https://developer.nytimes.com. It lets you set date ranges and search terms, and it returns WAY more than I'm scraping: https://developer.nytimes.com/article_search_v2.json#/Documentation/GET/articlesearch.json
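For reference, a rough sketch of what a date-ranged keyword query against the NYTimes Article Search API could look like. The endpoint URL and the parameter names (`q`, `begin_date`, `end_date`, `page`, `api-key`) are written from memory of the v2 docs and should be checked against the documentation linked above; `build_nyt_search_url` is a hypothetical helper, not part of any library. It only builds the URL, so it runs without an API key or network access.

```python
from datetime import date
from urllib.parse import urlencode

# Assumed endpoint for Article Search v2; verify against the NYTimes developer docs.
NYT_ARTICLE_SEARCH_URL = "https://api.nytimes.com/svc/search/v2/articlesearch.json"


def build_nyt_search_url(query, begin, end, api_key, page=0):
    """Build a request URL for a keyword search limited to a date range.

    Hypothetical helper. Parameter names are assumptions based on the v2 docs.
    """
    params = {
        "q": query,
        # The API is assumed to expect dates as YYYYMMDD strings.
        "begin_date": begin.strftime("%Y%m%d"),
        "end_date": end.strftime("%Y%m%d"),
        "page": page,
        "api-key": api_key,
    }
    return f"{NYT_ARTICLE_SEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # "YOUR_KEY" is a placeholder; a real key comes from developer.nytimes.com.
    print(build_nyt_search_url("election", date(2018, 1, 1), date(2018, 1, 31), "YOUR_KEY"))
```

The resulting URL can be passed to any HTTP client; the response should be JSON, so there is no HTML parsing step as with scraping.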
General API: https://newsapi.org/
News API looks like the best bet.
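A minimal sketch of the same kind of query against News API, to compare the two. This assumes the v2 `everything` endpoint and the parameter names `q`, `from`, `to`, `pageSize`, and `apiKey`, all recalled from the News API docs rather than confirmed here; `build_newsapi_url` is a hypothetical helper. As above, it only constructs the URL and makes no request.

```python
from datetime import date
from urllib.parse import urlencode

# Assumed endpoint; verify against the docs at newsapi.org.
NEWSAPI_EVERYTHING_URL = "https://newsapi.org/v2/everything"


def build_newsapi_url(query, start, end, api_key, page_size=100):
    """Build a request URL for a keyword search limited to a date range.

    Hypothetical helper. Parameter names are assumptions based on the v2 docs.
    """
    params = {
        "q": query,
        # Dates are assumed to be accepted in ISO 8601 form (YYYY-MM-DD).
        "from": start.isoformat(),
        "to": end.isoformat(),
        "pageSize": page_size,
        "apiKey": api_key,
    }
    return f"{NEWSAPI_EVERYTHING_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # "YOUR_KEY" is a placeholder for a real News API key.
    print(build_newsapi_url("election", date(2018, 1, 1), date(2018, 1, 31), "YOUR_KEY"))
```

One thing to check before committing to it: how far back the plan we'd be on allows searches, since that determines how large a dataset we can actually build.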