dssg / police-eis

DSaPP police early intervention system: using machine learning to predict adverse incidents

Categorical features #82

Open jonkeane opened 8 years ago

jonkeane commented 8 years ago

Categorical features can't be used by sklearn models without some kind of transformation, and there are a number of different methods for coding them (reference and an explanation using patsy).

We should encode categorical features at model time, for example using scikit-learn's OneHotEncoder.
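A minimal sketch of what encoding at model time could look like, assuming a recent scikit-learn in which `OneHotEncoder` accepts string categories directly. The `cat_train` column and its values are made up for illustration and are not taken from the police-eis data. The encoder sits inside a `Pipeline`, so it is fitted together with the model rather than when the features are generated:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical single categorical feature (one column, four rows) and labels.
cat_train = np.array([["north"], ["south"], ["east"], ["north"]])
y_train = np.array([1, 0, 1, 0])

# Fit the encoder on its own to inspect the coding it learns.
# handle_unknown="ignore" maps categories unseen at fit time to all zeros.
enc = OneHotEncoder(handle_unknown="ignore")
enc.fit(cat_train)
encoded_train = enc.transform(cat_train).toarray()

# Encoding as a pipeline step keeps the coding tied to the fitted model,
# so the same transformation is applied again at prediction time.
model = Pipeline([
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ("clf", LogisticRegression()),
])
model.fit(cat_train, y_train)

# "west" was never seen during fitting; it is encoded as an all-zero row.
cat_new = np.array([["west"]])
encoded_new = enc.transform(cat_new).toarray()
pred = model.predict(cat_new)
```

Swapping in a different coding scheme would then only mean replacing the `onehot` step, not regenerating the feature table.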

Even if no codings other than one-hot are used, implementing categorical features this way means we are not reimplementing by hand what already exists in sklearn/patsy.

jonkeane commented 8 years ago

This seems to have been largely implemented at https://github.com/dssg/police-eis/blob/develop/eis/dataset.py#L329-L357