TheDigitalFrontier / parallel-decision-trees

Semester project in CS205 Computing Foundations for Computational Science at Harvard School of Engineering and Applied Sciences, spring 2020.
MIT License

adding cancer and hmeq datasets #113

Closed by hgupta18 4 years ago

hgupta18 commented 4 years ago

Adding two datasets (cancer and hmeq), along with associated notes and tests. We should use the hmeq_clean.csv dataset (3445 observations, 11 predictors). It's a binary classification dataset where the label is whether a home equity loan defaulted.
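
For anyone who wants to sanity-check the file against the numbers above before using it, here is a rough sketch in Python (chosen only for illustration). It assumes hmeq_clean.csv has a header row and that every column except the label is a predictor. The function names `summarize_csv` and `check_dataset` and the label column name passed in are hypothetical, not names from this repo.

```python
import csv


def summarize_csv(path, target_column):
    """Return (n_rows, n_predictors, sorted target labels) for a CSV with a header row.

    Assumption: every column other than `target_column` counts as a predictor.
    """
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames or []
        if target_column not in fieldnames:
            raise ValueError(f"column {target_column!r} not found in {path}")
        n_predictors = len(fieldnames) - 1
        n_rows = 0
        labels = set()
        for row in reader:
            n_rows += 1
            labels.add(row[target_column])
    return n_rows, n_predictors, sorted(labels)


def check_dataset(path, target_column, expected_rows, expected_predictors):
    """Return a list of human-readable problems; an empty list means the file matches."""
    n_rows, n_predictors, labels = summarize_csv(path, target_column)
    problems = []
    if n_rows != expected_rows:
        problems.append(f"expected {expected_rows} rows, found {n_rows}")
    if n_predictors != expected_predictors:
        problems.append(f"expected {expected_predictors} predictors, found {n_predictors}")
    if len(labels) != 2:
        problems.append(f"expected a binary target, found labels {labels}")
    return problems
```

Something like `check_dataset("hmeq_clean.csv", "<label column>", 3445, 11)` should return an empty list if the file matches the description; the path and the label column name need to be filled in from the actual file.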

I sometimes get an assertion error when I run the random_forest script. I'm creating a new issue for that.