TheDigitalFrontier / parallel-decision-trees

Semester project in CS205 Computing Foundations for Computational Science at Harvard School of Engineering and Applied Sciences, spring 2020.
MIT License

Find big open source dataset #92

Closed. johannes-kk closed this 4 years ago

johannes-kk commented 4 years ago

Find a larger open source dataset, preferably with >1,000 observations and >20 features (so that mtry variations make a difference).

https://archive.ics.uci.edu/ml/datasets.php
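
To make the size criteria above concrete, here is a minimal Python sketch (not code from this repo) that checks a candidate dataset against the two thresholds and lists a few mtry values that would be worth comparing. The function names are hypothetical, and the candidate set (1, sqrt(p), p/3, p) is an assumption based on common random-forest defaults, not something this project prescribes.

```python
import math


def meets_size_criteria(n_obs, n_features, min_obs=1000, min_features=20):
    """Return True if the dataset is strictly larger than both thresholds."""
    return n_obs > min_obs and n_features > min_features


def mtry_candidates(n_features):
    """Return distinct mtry values to compare for a dataset with n_features.

    Assumed candidates: 1, floor(sqrt(p)), floor(p / 3), and p (all features).
    """
    p = n_features
    candidates = {1, max(1, math.isqrt(p)), max(1, p // 3), p}
    return sorted(candidates)


if __name__ == "__main__":
    # With few features the candidates sit close together;
    # with more features they spread out, so mtry has more room to matter.
    print(mtry_candidates(4))   # [1, 2, 4]
    print(mtry_candidates(54))  # [1, 7, 18, 54]
    print(meets_size_criteria(n_obs=1500, n_features=25))  # True
```

With only a handful of features the distinct mtry settings are nearly identical, which is why a dataset with more than 20 features is preferred here.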

johannes-kk commented 4 years ago

Binary classification: Adult dataset (Census income) https://archive.ics.uci.edu/ml/datasets/Census+Income

Multiclass classification: Covertype (forest cover type) dataset https://archive.ics.uci.edu/ml/datasets/Covertype