SJDunkelman / read-all-about-it

🗞 Scrape headline & article text data from major US/UK Newspaper sources for use in NLP projects
GNU General Public License v3.0

WSJ #10

Closed TimAAPL closed 4 years ago

TimAAPL commented 4 years ago

https://www.wsj.com/news/archive/20200416

The archive URL has a clear format: the date (YYYYMMDD) is at the end of the URL.
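As a rough illustration of that URL pattern, here is a minimal Python sketch that builds archive URLs for a date or a range of dates. The function names `archive_url` and `archive_urls` are hypothetical and not taken from the repository; only the base URL and the trailing date come from the link above.

```python
from datetime import date, timedelta
from typing import List

# Base path taken from the archive link in this issue.
WSJ_ARCHIVE_BASE = "https://www.wsj.com/news/archive/"


def archive_url(day: date) -> str:
    """Return the archive URL for one day (date formatted as YYYYMMDD)."""
    return f"{WSJ_ARCHIVE_BASE}{day:%Y%m%d}"


def archive_urls(start: date, end: date) -> List[str]:
    """Return archive URLs for every day from start to end, inclusive."""
    n_days = (end - start).days
    return [archive_url(start + timedelta(days=i)) for i in range(n_days + 1)]


if __name__ == "__main__":
    print(archive_url(date(2020, 4, 16)))
```

Generating the URLs from `date` objects rather than by string arithmetic avoids invalid dates such as 20200431.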

SJDunkelman commented 4 years ago

Which topics shall we restrict this to? We need to standardise what we extract from each news source so we don't double-count coverage. For example, you have 'U.S' but also 'Tri-state area' and 'U.S. Economy'.
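One possible way to avoid double counting, sketched in Python under stated assumptions: map each source-specific topic label to a shared canonical topic, drop topics that are not on the list, and keep each article URL only once. The mapping values, the `canonical_topic` and `dedupe` names, and the article dict shape (`url`, `topic`) are all hypothetical; the actual topic list would still need to be agreed.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical mapping from WSJ topic labels to shared canonical topics.
# The labels are the ones mentioned above; the canonical names are assumptions.
TOPIC_MAP = {
    "u.s": "US",
    "tri-state area": "US",
    "u.s. economy": "Economy",
}


def canonical_topic(raw: str) -> Optional[str]:
    """Normalise a raw label and look it up; None means 'not a topic we keep'."""
    key = raw.strip().lower().rstrip(".")
    return TOPIC_MAP.get(key)


def dedupe(articles: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first occurrence of each URL whose topic is on the list."""
    seen = set()
    kept = []
    for article in articles:
        topic = canonical_topic(article["topic"])
        if topic is None or article["url"] in seen:
            continue
        seen.add(article["url"])
        kept.append({**article, "topic": topic})
    return kept
```

With this approach, an article listed under both 'U.S' and 'U.S. Economy' is counted once, under whichever label is encountered first.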

SJDunkelman commented 4 years ago

This issue is now complete. Topics can be edited further by changing the topic list in the WSJ class.