TheDigitalFrontier / parallel-decision-trees

Semester project in CS205 Computing Foundations for Computational Science at Harvard School of Engineering and Applied Sciences, spring 2020.
MIT License

One-hot encoder #93

Open johannes-kk opened 4 years ago

johannes-kk commented 4 years ago

Add a one-hot encoder for categorical columns, whether they are predictors or the response.

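Multi-class prediction should presumably work with the existing LabelEncoder, which maps the response to a single multi-level class. For predictors, however, label encoding is not enough unless the categorical variable is in fact ordinal: the integer codes impose an arbitrary ordering on the categories, and a split on the encoded value would treat that ordering as meaningful.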
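
To illustrate the difference between the two encodings, here is a minimal Python sketch. It is not the project's implementation: the function names `label_encode` and `one_hot_encode` are hypothetical, the example data is made up, and the existing LabelEncoder may assign codes differently (here they follow order of first appearance).

```python
from typing import Dict, Hashable, List, Sequence, Tuple


def label_encode(values: Sequence[Hashable]) -> Tuple[List[int], List[Hashable]]:
    """Map each category to an integer code, in order of first appearance.

    Hypothetical helper for illustration; returns (codes, categories).
    """
    categories: List[Hashable] = []
    index: Dict[Hashable, int] = {}
    codes: List[int] = []
    for value in values:
        if value not in index:
            index[value] = len(categories)
            categories.append(value)
        codes.append(index[value])
    return codes, categories


def one_hot_encode(values: Sequence[Hashable]) -> Tuple[List[List[int]], List[Hashable]]:
    """Expand one categorical column into one 0/1 column per category.

    Hypothetical helper for illustration; returns (rows, categories),
    where rows[i][j] == 1 exactly when values[i] == categories[j].
    """
    codes, categories = label_encode(values)
    rows: List[List[int]] = []
    for code in codes:
        row = [0] * len(categories)
        row[code] = 1
        rows.append(row)
    return rows, categories


# Made-up example column.
colors = ["red", "green", "blue", "green"]

codes, categories = label_encode(colors)
# codes == [0, 1, 2, 1]: implies red < green < blue, which is meaningless here.

rows, _ = one_hot_encode(colors)
# rows == [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]: no ordering implied.
```

With the one-hot form, each new column is a binary indicator, so a threshold split on it only separates "is this category" from "is not this category". The cost is one extra column per category, which may matter for high-cardinality predictors.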

While looking for larger datasets for #92, we found none that consisted only of numerical predictors, so we will likely need a one-hot encoder if we are to use a larger dataset.