This is a GSoC 2012 fork of 'Mothur'. We are implementing a number of 'Feature Selection' algorithms for microbial ecology data and incorporating them into mothur's main codebase.
I've already implemented the F-Score system, but I still need to integrate it with RandomForest to see whether it improves performance.
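The F-Score based system is essentially a refined version of the standard deviation (or variance) based feature discarding system that we already have: rather than looking only at a feature's overall spread, it weighs how far apart the class means are against the variance within each class.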
@kdiverson I've been reading the paper "Combining SVMs with Various Feature Selection Strategies", which Rafi suggested. It describes a combined SVM+F-Score+RandomForest system that looks quite interesting.
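
For reference, here is a minimal sketch of the two-class F-score as I understand it from that paper. It is plain Python rather than the C++ in this fork, and the names `f_score` and `rank_features` are made up for illustration, so treat it as an approximation of the idea and not as the code that lives in the repository. It assumes two classes encoded as 0/1 labels and at least two samples per class.

```python
def f_score(values, labels):
    """Two-class F-score of a single feature.

    values: one number per sample for this feature.
    labels: one 0/1 class label per sample (assumed encoding).
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    if len(pos) < 2 or len(neg) < 2:
        raise ValueError("need at least two samples in each class")

    mean_all = sum(values) / len(values)
    mean_pos = sum(pos) / len(pos)
    mean_neg = sum(neg) / len(neg)

    # Numerator: how far each class mean sits from the overall mean.
    between = (mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2

    # Denominator: sum of the two per-class sample variances.
    var_pos = sum((v - mean_pos) ** 2 for v in pos) / (len(pos) - 1)
    var_neg = sum((v - mean_neg) ** 2 for v in neg) / (len(neg) - 1)
    within = var_pos + var_neg

    if within == 0:
        # Constant within both classes: perfectly separating or useless.
        return float("inf") if between > 0 else 0.0
    return between / within


def rank_features(rows, labels):
    """Return feature indices ordered from highest to lowest F-score.

    rows: list of samples, each a list of feature values.
    """
    n_features = len(rows[0])
    scores = [f_score([row[j] for row in rows], labels) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)
```

Features at the tail of the ranking are the candidates for discarding before the data is handed to RandomForest; where to put the cut-off is a separate decision that this sketch does not make.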