amueller / scipy-2016-sklearn

Scikit-learn tutorial at SciPy2016
Creative Commons Zero v1.0 Universal

!!! No titanic data #62

Closed: rasbt closed this issue 8 years ago

rasbt commented 8 years ago

The titanic dataset seems to be missing (notebook 10). Do you have it on your local drive? It's probably pretty small so we could just add it to the repo instead of fetching it online.

amueller commented 8 years ago

I'm adding this, but I think we should use pandas to read it.

rasbt commented 8 years ago

Okay, thanks! I am okay with pandas, but before adding another dependency: would numpy's loadtxt or genfromtxt have issues?

amueller commented 8 years ago

yeah they don't work on python3 basically ^^

amueller commented 8 years ago

OK, I really don't understand the titanic notebook right now. I thought it would deal with missing values and categorical variables, but it does not. wtf?

rasbt commented 8 years ago

Okay then! :) I think they would also have issues with the mixed types to some extent. pandas would be easier, I agree.
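
To illustrate the mixed-types point, here is a minimal sketch of reading such a file with pandas. The inline CSV, its column names, and its rows are hypothetical stand-ins, not the actual Titanic file that ended up in the repo.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the Titanic CSV: text and numeric
# columns, a quoted field containing a comma, and one missing age.
csv_text = """name,sex,age,fare,survived
"Doe, Jane",female,29,211.34,1
"Roe, John",male,,7.25,0
"Poe, Ann",female,2,151.55,0
"""

# read_csv infers a dtype per column, so mixed text/numeric data needs
# no explicit converters; the empty age field is parsed as NaN.
df = pd.read_csv(io.StringIO(csv_text))

print(df.dtypes)
print(df)
```

With a real file, the `io.StringIO(...)` argument would simply be replaced by the path to the CSV.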

rasbt commented 8 years ago

Hm, I can't remember if there was a section on these things ...

amueller commented 8 years ago

there is not
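
For reference, the missing-value and categorical handling mentioned above could look roughly like the sketch below. The tiny DataFrame and its column names are made up for illustration, and median imputation plus one-hot encoding is just one common choice, not what the notebook does.

```python
import pandas as pd

# Hypothetical miniature of Titanic-like data: one categorical column
# and one numeric column with a missing value.
df = pd.DataFrame({
    "sex": ["female", "male", "female", "male"],
    "age": [29.0, None, 2.0, 40.0],
    "survived": [1, 0, 0, 1],
})

X = df[["sex", "age"]].copy()

# Missing values: fill the gap with the median of the observed ages.
X["age"] = X["age"].fillna(X["age"].median())

# Categorical variables: one-hot encode "sex" into indicator columns.
X = pd.get_dummies(X, columns=["sex"])

y = df["survived"]
print(X)
```

The resulting `X` is fully numeric with no missing entries, so it can be passed to a scikit-learn estimator together with `y`.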