UChicago-Computational-Content-Analysis / Frequently-Asked-Questions


Week 5 real data #13

Open thisspider opened 2 years ago

thisspider commented 2 years ago

Hi all,

In the homework notebook for week 5, we are asked to use real datasets that are said to be "available." I am speaking of the section "Now we do the same for real data" for Exercise 2.

I am unable to load the data and I get the following error:

      FileNotFoundError: [Errno 2] No such file or directory: '../data/reddit.csv'

How do I load the data? Does the notebook contain instructions about loading the data which I have missed?

Thanks!

jacyanthis commented 2 years ago

Hi Tytus, the ../data/reddit.csv file is in the GitHub repo. If you have not cloned that repo (in Drive or on your local machine), you can directly download it here: https://github.com/UChicago-Computational-Content-Analysis/Homework-Notebooks/blob/main/data/reddit.csv

You can also load it with the following line, though the notebook does not save the result to that CSV location:

      dfTrain, dfTest = sklearn.model_selection.train_test_split(lucem_illud.loadReddit(), test_size=.2)
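
If you would rather not clone the whole repo, here is a minimal sketch that downloads the file only when it is missing at the path the notebook expects. The helper name `ensure_local_copy` is hypothetical, and the raw-file URL is an assumption derived from the blob link above, so check that it resolves before relying on it.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Assumption: raw-file counterpart of the blob link posted in this thread.
REDDIT_CSV_URL = (
    "https://raw.githubusercontent.com/UChicago-Computational-Content-Analysis/"
    "Homework-Notebooks/main/data/reddit.csv"
)


def ensure_local_copy(path, url=REDDIT_CSV_URL, fetch=urlretrieve):
    """Return `path` as a Path, downloading it from `url` first if it is missing.

    `fetch` is any callable taking (url, destination); it defaults to
    urllib's urlretrieve and is a parameter only so it can be swapped out.
    """
    path = Path(path)
    if not path.exists():
        # Create ../data (or whatever the parent is) before downloading.
        path.parent.mkdir(parents=True, exist_ok=True)
        fetch(url, str(path))
    return path
```

Calling `ensure_local_copy("../data/reddit.csv")` before the failing cell should leave a file at that location. Note that the relative path resolves against the notebook's current working directory, so it lands one level above wherever the notebook is running.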