TheDigitalFrontier / parallel-decision-trees

Semester project in CS205 Computing Foundations for Computational Science at Harvard School of Engineering and Applied Sciences, spring 2020.
MIT License

Implement mostly working version of RF. #57

Closed (gpestre closed 4 years ago)

gpestre commented 4 years ago

The algorithm is in place, but it's not actually random yet (because it always uses a seed of -1 for the bootstrapping).
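To illustrate the problem, here is a minimal Python sketch. It is not the project's code: `bootstrap_indices` is a hypothetical helper, and Python's `random` module stands in for whatever generator the project uses. With a constant seed, every bootstrap draws the same rows, so every tree is trained on the same sample.

```python
import random


def bootstrap_indices(n_rows, seed):
    """Sample n_rows row indices with replacement (hypothetical helper)."""
    rng = random.Random(seed)
    return [rng.randrange(n_rows) for _ in range(n_rows)]


# A constant seed (here -1, as described above) reproduces the same bootstrap.
first = bootstrap_indices(10, seed=-1)
second = bootstrap_indices(10, seed=-1)
same_with_fixed_seed = first == second  # True: no randomness across trees
```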

I'm proposing a SeedGenerator object that is initialized within each RandomForest instance and provides a stream of random seeds. These seeds are used (a) for bootstrapping and (b) to pass to each tree, which has its own SeedGenerator to select a different random subset of features at each split.

Hope it's not overkill, but it seemed like a straightforward way to allow replicability for our debugging without ending up with the same bootstrap each time. (See more in issue #65.)
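A rough sketch of the idea in Python, for illustration only. This is not the project's actual implementation: the `Tree` class, the method names (`next_seed`, `features_for_split`), the constructor parameters, and the use of Python's `random` module are all assumptions made for this example.

```python
import random


class SeedGenerator:
    """Deterministic stream of seeds derived from one meta-seed (sketch)."""

    def __init__(self, meta_seed):
        self._rng = random.Random(meta_seed)

    def next_seed(self):
        # Draw a non-negative 31-bit integer to use as a downstream seed.
        return self._rng.randrange(2**31)


class Tree:
    """Hypothetical tree that only tracks its per-split feature subsets."""

    def __init__(self, seed, n_features, max_features):
        self.seeds = SeedGenerator(seed)  # each tree owns its own stream
        self.n_features = n_features
        self.max_features = max_features

    def features_for_split(self):
        # A fresh seed per split is used to draw that split's feature subset.
        rng = random.Random(self.seeds.next_seed())
        return sorted(rng.sample(range(self.n_features), self.max_features))


class RandomForest:
    """Hypothetical forest that only sets up bootstraps and tree seeds."""

    def __init__(self, meta_seed, n_trees, n_rows, n_features, max_features):
        seeds = SeedGenerator(meta_seed)
        self.bootstraps = []
        self.trees = []
        for _ in range(n_trees):
            # (a) one seed for this tree's bootstrap sample
            rng = random.Random(seeds.next_seed())
            self.bootstraps.append([rng.randrange(n_rows) for _ in range(n_rows)])
            # (b) one seed handed to the tree for its feature subsets
            self.trees.append(Tree(seeds.next_seed(), n_features, max_features))


# Same meta-seed, same forest: runs are replicable for debugging.
forest_a = RandomForest(meta_seed=42, n_trees=3, n_rows=20, n_features=8, max_features=3)
forest_b = RandomForest(meta_seed=42, n_trees=3, n_rows=20, n_features=8, max_features=3)
replicable = forest_a.bootstraps == forest_b.bootstraps
```

Under these assumptions, two forests built from the same meta-seed produce identical bootstraps and feature subsets, while each tree inside a forest receives its own seed from the stream instead of a shared constant.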