Call-for-Code-for-Racial-Justice / TakeTwo-DataScience

Call for Code Diverse Representation Problem 3 media bias data science
Apache License 2.0

Add a Jupyter notebook for trying out scikit-learn text analytics (on 20 newsgroups data) #4

Closed: naokiabe closed this 3 years ago

naokiabe commented 3 years ago

This is a Jupyter notebook for trying out the text analytics capabilities of scikit-learn, in particular on the 20 newsgroups dataset. It follows the instructions given in the tutorial documentation: https://scikit-learn.org/stable/tutorial/text_analytics/working_with_text_data.html Some care was needed to make sure that the relevant data sets are downloaded into the appropriate directories, so that the code runs and produces the expected results.
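
For readers who have not opened the notebook, the tutorial workflow can be sketched roughly as below. This is a minimal sketch and not the notebook's actual code: the function names `build_text_classifier` and `run_newsgroups_demo` are made up for illustration, and using the `data_home` argument of `fetch_20newsgroups` to control the download directory is an assumption about how the directory issue could be handled.

```python
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Four of the twenty categories, as in the tutorial, to keep the run short.
CATEGORIES = ["alt.atheism", "soc.religion.christian", "comp.graphics", "sci.med"]


def build_text_classifier():
    """Bag-of-words counts -> TF-IDF weights -> multinomial naive Bayes."""
    return Pipeline([
        ("vect", CountVectorizer()),
        ("tfidf", TfidfTransformer()),
        ("clf", MultinomialNB()),
    ])


def run_newsgroups_demo(data_home=None):
    """Train on the 'train' split and return accuracy on the 'test' split.

    data_home is the directory used as the download cache (hypothetical
    choice here); None falls back to scikit-learn's default location.
    The first call downloads the dataset, so it needs network access.
    """
    train = fetch_20newsgroups(
        subset="train", categories=CATEGORIES,
        shuffle=True, random_state=42, data_home=data_home,
    )
    test = fetch_20newsgroups(
        subset="test", categories=CATEGORIES,
        shuffle=True, random_state=42, data_home=data_home,
    )
    clf = build_text_classifier().fit(train.data, train.target)
    return clf.score(test.data, test.target)
```

Calling `run_newsgroups_demo()` with no argument uses scikit-learn's default cache directory. Passing an explicit path, for example a folder next to the notebook, keeps the downloaded files in a known place, which is one way to avoid the directory problems mentioned above.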