Investigate the Best Practices for Parameter Tuning/Estimation for Random Forest

Once the code has been implemented, we'll need to tune its parameters to find the sweet spot between speed and predictive performance. The major parameters that we might need to tune are:

Number of trees
Number of variables in the subspace: if our training data has N variables/attributes, each split of a tree considers only a random subset of them, of size P. We'd need to tune P.
Number of nodes in trees
Entropy (information gain) criteria: Tan-Steinbach-Kumar’s Data Mining textbook has a decent guide for setting this. We'd want to follow that lead.
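
As a starting point for the discussion, here is a minimal Python sketch of how a tuning run over the first three parameters could be organized. It is not mothur code: the function names, the `evaluate` callback, and the result fields are all hypothetical. The candidate values for P are common heuristics from the random forest literature (the square root of N, log2(N) + 1, and half and double the square root), not settings we have validated on our data.

```python
import itertools
import math
import time


def candidate_subspace_sizes(n_variables):
    """Return common starting values for P, the variables tried per split."""
    sqrt_n = max(1, round(math.sqrt(n_variables)))
    log_n = max(1, int(math.log2(n_variables)) + 1)
    # Heuristic candidates: sqrt(N), log2(N) + 1, half and double sqrt(N).
    candidates = {sqrt_n, log_n, max(1, sqrt_n // 2), min(n_variables, 2 * sqrt_n)}
    return sorted(candidates)


def grid_search(evaluate, n_variables, tree_counts, max_node_counts):
    """Try every parameter combination, recording score and wall-clock time.

    `evaluate` is a hypothetical callback that trains a forest with the
    given parameters and returns a score where higher is better
    (for example, 1 - out-of-bag error).
    """
    results = []
    for n_trees, p, max_nodes in itertools.product(
        tree_counts, candidate_subspace_sizes(n_variables), max_node_counts
    ):
        start = time.perf_counter()
        score = evaluate(n_trees=n_trees, subspace_size=p, max_nodes=max_nodes)
        elapsed = time.perf_counter() - start
        results.append(
            {
                "n_trees": n_trees,
                "subspace_size": p,
                "max_nodes": max_nodes,
                "score": score,
                "seconds": elapsed,
            }
        )
    # Best score first; ties are broken in favour of the faster run.
    return sorted(results, key=lambda r: (-r["score"], r["seconds"]))
```

Recording the elapsed time next to the score lets us compare speed against performance directly, instead of picking the best-scoring combination regardless of cost.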

We'd like to gather more resources from the web and investigate the best practices out there for tuning these parameters.

azmfaridee / mothur, issue #10