azmfaridee / mothur

This is a GSoC 2012 fork of 'Mothur'. We are trying to implement a number of 'Feature Selection' algorithms for microbial ecology data and incorporate them into mothur's main codebase.
https://github.com/mothur/mothur
GNU General Public License v3.0
3 stars 1 forks source link

Create Code Schematics/Pseudocode for the Random Forest Implementation #9

Closed: azmfaridee closed this issue 11 years ago

azmfaridee commented 12 years ago

Here we want to discuss the code-level details of the Random Forest implementation. I'm copy-pasting our initial code schematics from the application. In the coming days, we'll try to elaborate on each function with discussion and pseudocode.

There will be three basic classes that will encompass most of the code. They are:

Of course, additional helper classes will be needed as we progress through the project schedule, as well as linking classes that tie the implementation into mothur. They will be created as necessary. The major functions will be:

mothur-westcott commented 12 years ago

This looks like a good start to the design. Thinking about the issue of parallelization, where do you see that coming in? Maybe in several places? Without knowing the complexity and time required for each task, it's hard to see where it would be most valuable. Possible places include creating the trees, calculating the error rates, calculating each attribute's importance, and possibly the bootstrapping. What are your thoughts?
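To make the first of those candidates concrete, here is a minimal C++ sketch of building trees in parallel. It assumes each tree is trained independently on its own bootstrap sample, which is what makes tree creation easy to split across threads. All names here (`TreeStub`, `bootstrapSample`, `buildTree`, `buildForest`) are hypothetical and are not part of mothur, and the tree itself is a stub that only records its sampled rows.

```cpp
#include <cstddef>
#include <future>
#include <random>
#include <vector>

// Hypothetical stand-in for a trained tree: it only records which rows
// were drawn for its bootstrap sample.
struct TreeStub {
    std::vector<std::size_t> sampleRows;
};

// Draw numRows row indices with replacement (one bootstrap sample).
std::vector<std::size_t> bootstrapSample(std::size_t numRows, unsigned seed) {
    std::vector<std::size_t> rows(numRows);
    if (numRows == 0) return rows;  // nothing to sample from
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, numRows - 1);
    for (std::size_t& r : rows) r = pick(rng);
    return rows;
}

// Build one tree. A real implementation would grow a decision tree on the
// sampled rows; this stub just keeps the sample.
TreeStub buildTree(std::size_t numRows, unsigned seed) {
    return TreeStub{bootstrapSample(numRows, seed)};
}

// Build the forest with one asynchronous task per tree. Each task gets its
// own seed, so no random-number state is shared between threads.
// Assumption: one task per tree is acceptable; a real implementation would
// more likely hand batches of trees to a fixed number of workers.
std::vector<TreeStub> buildForest(std::size_t numRows, std::size_t numTrees,
                                  unsigned baseSeed) {
    std::vector<std::future<TreeStub>> tasks;
    tasks.reserve(numTrees);
    for (std::size_t t = 0; t < numTrees; ++t) {
        tasks.push_back(std::async(std::launch::async, buildTree, numRows,
                                   baseSeed + static_cast<unsigned>(t)));
    }
    std::vector<TreeStub> forest;
    forest.reserve(numTrees);
    for (auto& task : tasks) forest.push_back(task.get());  // wait for each tree
    return forest;
}
```

Calling `buildForest(50, 8, 42u)` would return eight stubs, each holding 50 row indices. Because the bootstrapping happens inside each task, it is parallelized along with tree creation. Whether the error-rate and importance calculations are worth parallelizing separately depends on timings this sketch does not measure.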

azmfaridee commented 12 years ago

@mothur-westcott: I've created a separate issue, #8, to address this. I'll be adding my thoughts there :)