azmfaridee / mothur

This is a GSoC 2012 fork of 'Mothur'. We are trying to implement a number of 'Feature Selection' algorithms for microbial ecology data and incorporate them into mothur's main codebase.
https://github.com/mothur/mothur
GNU General Public License v3.0

Week 6: Implement the Core Part of Random Forest Algorithm #15

Closed azmfaridee closed 12 years ago

azmfaridee commented 12 years ago

Parent Issue: #3, #14

As per the initial proposal, code the core part of the Regularized Random Forest algorithm. The specific functions of interest are:

partitionRecursively()
createTree()

Also implement any remaining parts from previous weeks.

End of Week Deliverable:

Given a bootstrapped sample, the code must be able to generate a decision tree that can predict the outcome. Any additional datasets required by issue #2 should also be prepared if needed.
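For reference, a bootstrapped sample here means drawing as many rows as the dataset has, with replacement. The following is only a minimal sketch of that step, assuming the data is a plain list of rows; the name `bootstrap_sample` and its `seed` parameter are hypothetical and not taken from the mothur code.

```python
import random


def bootstrap_sample(rows, seed=None):
    """Draw len(rows) rows with replacement.

    Returns (in_bag, out_of_bag): the sampled rows and the rows that
    were never drawn. Hypothetical helper, for illustration only.
    """
    rng = random.Random(seed)
    n = len(rows)
    # Sample n indices with replacement; duplicates are expected.
    indices = [rng.randrange(n) for _ in range(n)]
    chosen = set(indices)
    in_bag = [rows[i] for i in indices]
    # Rows whose index was never drawn form the out-of-bag set.
    out_of_bag = [rows[i] for i in range(n) if i not in chosen]
    return in_bag, out_of_bag
```

The out-of-bag rows are returned as well because random forests commonly use them to estimate the error of the tree grown on the in-bag rows.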

azmfaridee commented 12 years ago

To cut down development time and speed up R&D, we decided to implement the prototype in Python first and then implement the full version directly in C++ within mothur. We have therefore moved to Python for the prototype implementation. All of the code resides in the file pyrrf.py.

We have implemented the buildDecisionTree() and splitRecursively() functions, so we now have a decision tree that can classify the data.
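To illustrate the general idea of building a tree by recursive splitting, here is a minimal sketch. It is not the code in pyrrf.py: the function names (`gini`, `best_split`, `split_recursively`, `predict`), the use of Gini impurity, and the dictionary-based tree layout are all assumptions made for this example, and it shows a plain decision tree without the regularization step or random feature subsets.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return the (feature, threshold) that lowers weighted Gini the most,
    or None if no split improves on the unsplit node."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted(set(r[f] for r in rows)):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best


def split_recursively(rows, labels):
    """Build a tree as nested dicts; leaves hold the majority label."""
    split = best_split(rows, labels)
    if split is None:
        return {"label": Counter(labels).most_common(1)[0][0]}
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": split_recursively([r for r, _ in left], [y for _, y in left]),
        "right": split_recursively([r for r, _ in right], [y for _, y in right]),
    }


def predict(tree, row):
    """Walk the tree from the root to a leaf and return its label."""
    while "label" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["label"]


# Toy data (made up for this example): feature 0 separates the classes.
rows = [[1.0, 5.0], [2.0, 4.0], [8.0, 5.0], [9.0, 4.0]]
labels = ["a", "a", "b", "b"]
tree = split_recursively(rows, labels)
print(predict(tree, [1.5, 4.5]))  # a
print(predict(tree, [8.5, 4.5]))  # b
```

A node becomes a leaf as soon as no candidate split reduces the weighted impurity, so the recursion always terminates. A real random forest tree would also restrict each split to a random subset of features.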