TheDigitalFrontier / parallel-decision-trees

Semester project in CS205 Computing Foundations for Computational Science at Harvard School of Engineering and Applied Sciences, spring 2020.
MIT License

Load CSV that cleans and converts #74

Closed: johannes-kk closed this issue 4 years ago

johannes-kk commented 4 years ago

Extend the existing CSV loading functions, or create new ones, to convert string columns to numerical values. For example, the Sonar dataset stores the class as "R" (rock) or "M" (mine), but it should be stored as 0 and 1 (not 1 and 2!).

Currently we have a "temp" version of the file in the /data/ directory. Let's wrap up this logic in a function instead so we can easily load other datasets.
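
A minimal sketch of what such a loader could look like, written in Python only for illustration since this issue does not show the project's own code. The names `encode_columns` and `load_csv_numeric` are hypothetical, not existing functions in the repo. It assumes every row has the same number of fields, and it assigns codes in order of first appearance, so a file whose first row is labelled "R" would get R = 0 and M = 1.

```python
import csv


def encode_columns(rows):
    """Convert rows of strings to rows of floats.

    Columns that parse as numbers are converted directly. Any column with a
    non-numeric value is treated as categorical and replaced with 0-based
    integer codes, assigned in order of first appearance (an assumption).
    Returns (data, mappings), where mappings[col_index] = {label: code}.
    """
    if not rows:
        return [], {}
    mappings = {}
    columns = []
    for j in range(len(rows[0])):
        raw = [row[j].strip() for row in rows]
        try:
            columns.append([float(v) for v in raw])
        except ValueError:
            codes = {}
            for v in raw:
                # First unseen label gets 0, the next gets 1, and so on.
                codes.setdefault(v, len(codes))
            mappings[j] = codes
            columns.append([float(codes[v]) for v in raw])
    # Transpose the per-column lists back into per-row lists.
    data = [list(r) for r in zip(*columns)]
    return data, mappings


def load_csv_numeric(path, has_header=False):
    """Hypothetical loader: read a CSV file and return fully numeric rows."""
    with open(path, newline="") as f:
        rows = [row for row in csv.reader(f) if row]  # skip blank lines
    if has_header:
        rows = rows[1:]
    return encode_columns(rows)
```

Returning the label-to-code mapping alongside the data would let callers translate predictions back to the original labels, and the same function should work for other datasets with string columns.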