Closed azmfaridee closed 12 years ago
@kdiverson @mothur-westcott As I mentioned earlier, the evaluateSample() function seems to be our bottleneck. Optimizing it, or pruning the tree (which would cut down the traversal depth), would speed up the program.
int evaluateSample(vector<int> testSample) {
    TreeNode *node = rootNode;
    while (true) {
        if (node->checkIsLeaf() == true) { return node->getOutputClass(); }
        int sampleSplitFeatureValue = testSample[node->getSplitFeatureIndex()];
        if (sampleSplitFeatureValue < node->getSplitFeatureValue()) { node = node->getLeftChildNode(); }
        else { node = node->getRightChildNode(); }
    }
}
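One thing that may be worth checking before any pruning work: as posted, evaluateSample() takes testSample by value, so the whole feature vector is copied on every call. Below is a minimal sketch of the same traversal taking a const reference instead. The TreeNode struct here is a hypothetical stand-in with plain fields (the real class uses getters), and the root is passed in explicitly only to keep the example self-contained. Whether the copy actually matters would need to be confirmed with a profiler.

```cpp
#include <vector>

// Hypothetical stand-in for the project's TreeNode; the real class uses getters.
struct TreeNode {
    bool isLeaf = false;
    int outputClass = -1;
    int splitFeatureIndex = 0;
    int splitFeatureValue = 0;
    TreeNode* left = nullptr;
    TreeNode* right = nullptr;
};

// Same walk as the posted function, but the sample is passed by const
// reference, so no copy of the vector is made per call.
int evaluateSample(const TreeNode* root, const std::vector<int>& testSample) {
    const TreeNode* node = root;
    while (!node->isLeaf) {
        int value = testSample[node->splitFeatureIndex];
        // Values below the split go left, everything else goes right.
        node = (value < node->splitFeatureValue) ? node->left : node->right;
    }
    return node->outputClass;
}
```

The loop does one comparison per tree level, so the per-call cost is proportional to the depth of the path taken; the signature change only removes the vector copy on top of that.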
@kdiverson You said earlier that you have the book C4.5: Programs for Machine Learning by J. Ross Quinlan in your library. Could you check whether it has anything on tree pruning and let me know? I think tree pruning would be one of the ways to optimize the implementation.
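To make the pruning idea concrete, here is a rough sketch of the simplest safe form: collapsing any split whose two children are leaves with the same output class. This is not the error-based pruning described in the C4.5 book, only an illustration of how a bottom-up pass can shorten paths without changing any prediction. PNode and collapseRedundantSplits are hypothetical names, not part of the current code.

```cpp
#include <memory>

// Hypothetical node layout for illustration; not the project's TreeNode.
struct PNode {
    bool isLeaf = false;
    int outputClass = -1;
    int splitFeatureIndex = 0;
    int splitFeatureValue = 0;
    std::unique_ptr<PNode> left;
    std::unique_ptr<PNode> right;
};

// Bottom-up pass: children are processed first, so a collapse low in the
// tree can enable another collapse higher up.
void collapseRedundantSplits(PNode* node) {
    if (node == nullptr || node->isLeaf) return;
    collapseRedundantSplits(node->left.get());
    collapseRedundantSplits(node->right.get());
    // If both children are leaves with the same class, the split cannot
    // change the prediction, so turn this node into a leaf.
    if (node->left && node->right &&
        node->left->isLeaf && node->right->isLeaf &&
        node->left->outputClass == node->right->outputClass) {
        node->outputClass = node->left->outputClass;
        node->isLeaf = true;
        node->left.reset();
        node->right.reset();
    }
}
```

A pass like this would run once after training. Real pruning strategies go further and also remove splits that do change predictions, trading a little training accuracy for a smaller tree.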
@darthxaher I'll try to head over to the engineering school this week and find that book, sorry I didn't get to this before.
EDIT: actually it looks like I can have chapters of the book emailed to me. Can you look at the table of contents [0] and tell me what pages (inclusive) you want?
@kdiverson I would like to have a look at Chapter 4 (Pruning Decision Trees). From what I've read of Chapter 4 on Google Books, it refers to Chapters 3 and 8, where the author discusses pruning in more detail. Interestingly, Chapter 8 isn't in the TOC. So, basically, Chapter 4 is the most important thing we need now.
Weekly Update Issues: #3, #14, #15, #16, #17, #19, #20, #21 and #23
The title says it all. By this time we should have a mock class that we can call from mothur's command line. We also have a C++ implementation of the random forest algorithm. We now need to make them work together.
Other Issues Related to Mothur Integration: #4, #5, #6 and #7