Computational-Content-Analysis-2020 / frequently-asked-questions

Repo to ask questions and see answers

lucem_illud_2020.loadNewsGroups() Timeout Error #10

Open laurenjli opened 4 years ago

laurenjli commented 4 years ago

Hi, when I try to run lucem_illud_2020.loadNewsGroups() on the RCC, I get a Timeout Error. Given how long each of the empirical datasets takes to load, would it make sense to only look at how classifiers perform on one of the datasets (instead of all) for the analysis in Exercise 1?

bhargavvader commented 4 years ago

Are you maybe running this on a compute node? Either way, you can get past this by running git pull base master, which fetches the scikit-learn data through git. I've posted an announcement about this on Canvas.

Try this and let me know if it works.

laurenjli commented 4 years ago

I thought the Canvas post was about the 1st cell in "Multinomial Naive Bayes", which loads the sklearn 20 NG dataset. Is this the same thing as lucem_illud_2020.loadNewsGroups()? Just want to confirm. Thanks!

laurenjli commented 4 years ago

Also, is there a package we should use to open the .pkz file in python?

bhargavvader commented 4 years ago

Hello Lauren, it is the same dataset. As for the .pkz file, the scikit-learn method handles opening it for you!
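
For anyone curious about what is inside a .pkz file, here is a minimal sketch. It assumes the file is a zlib-compressed pickle, which is how scikit-learn's 20 newsgroups cache appears to be written. The helper names read_pkz and write_pkz are hypothetical and not part of scikit-learn or lucem_illud_2020. In normal use you would not need this, since the scikit-learn loader reads the cache itself.

```python
import pickle
import zlib


def read_pkz(path):
    """Load an object from a zlib-compressed pickle file.

    Assumption: the .pkz file was written as zlib.compress(pickle.dumps(obj)).
    Only open files you trust, because unpickling can run arbitrary code.
    """
    with open(path, "rb") as f:
        compressed = f.read()
    return pickle.loads(zlib.decompress(compressed))


def write_pkz(obj, path):
    """Write an object in the same assumed layout (useful for a round-trip check)."""
    with open(path, "wb") as f:
        f.write(zlib.compress(pickle.dumps(obj)))
```

A quick round trip with write_pkz followed by read_pkz on a small dictionary is an easy way to check the helpers before pointing read_pkz at the course data.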