Lab 06 - Q1&3 #77

Closed harrisonpopp closed 4 years ago

harrisonpopp commented 4 years ago

For question one, I think I'm close to tuning the tree depth, but I can't get past one error, and even when I do I don't understand how to examine the results. My code is attached. Another question I have for Q3: how do we incorporate tree depth? There is no prompt for tree depth in your code for random forests, or maybe I missed it.

[Screenshot of attached code: Screen Shot 2020-04-16 at 1.51.49 PM]
LucyMcGowan commented 4 years ago

The first question is actually asking for a bagged tree; you need to use the rand_forest() function to do that (double-check the slides here: https://sta-363-s20.lucymcgowan.com/slides/18-random-forest-in-r.html#6)

LucyMcGowan commented 4 years ago

You don't need to specify tree depth for random forests or bagged trees.
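
For reference, a bagged-tree specification might look something like this (a minimal sketch assuming the tidymodels approach from the slides; the mtry value, mode, and engine are assumptions, not taken from the lab):

```r
library(tidymodels)

# Bagging is a random forest in which mtry equals the number of predictors,
# so every tree considers all predictors; note there is no tree_depth argument.
bag_spec <- rand_forest(
  mode = "regression",  # assumption: use "classification" if the outcome is categorical
  mtry = 10             # placeholder: set to the number of predictors in the lab data
) %>%
  set_engine("randomForest")  # engine is an assumption; "ranger" would also work
```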

harrisonpopp commented 4 years ago

Oh okay, I see what I did; I misread that. How do I tune the overall number of trees? Is that done in the bagging process, or when I am collecting the data in Q2?

LucyMcGowan commented 4 years ago

The parameter you are tuning is trees (so in your code above, you are using decision_tree() when it should be rand_forest(), and you are tuning tree_depth when you should be tuning trees).
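
Putting that together, the tuning setup might look roughly like this (a sketch only; `df`, the formula, the mtry value, the engine, and the grid range are placeholders or assumptions rather than the lab's actual values):

```r
library(tidymodels)

# Bagged-tree spec with trees marked for tuning (not tree_depth)
bag_spec <- rand_forest(
  mode  = "regression",       # assumption: adjust to the lab's outcome type
  mtry  = 10,                 # placeholder: number of predictors in the data
  trees = tune()
) %>%
  set_engine("randomForest")  # engine choice is an assumption

wf <- workflow() %>%
  add_formula(y ~ .) %>%      # placeholder formula
  add_model(bag_spec)

folds     <- vfold_cv(df, v = 5)                                    # `df` is a placeholder
tree_grid <- grid_regular(trees(range = c(10L, 500L)), levels = 5)  # assumed grid of tree counts

# Fit the spec across the grid using cross-validation
tuned <- tune_grid(wf, resamples = folds, grid = tree_grid)
```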

nuripark10 commented 4 years ago

Do we need to split the data into testing and training?

LucyMcGowan commented 4 years ago

You do not need to split the data this time; just use cross-validation (and report the cross-validation error).
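
Examining and reporting that cross-validated error might look like this (a sketch reusing the `tuned` object from the tuning sketch above; the rmse metric assumes a regression outcome):

```r
collect_metrics(tuned)                    # cross-validated error for each number of trees

show_best(tuned, metric = "rmse", n = 1)  # the best-performing trees value and its CV error
```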

LucyMcGowan commented 4 years ago

You can split the data, in which case you'd want the final statistics you report to be from the test part of the split, but since cross validation is estimating this testing error, you can just use that as well.
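
If you do go the splitting route, a sketch of that alternative (reusing `df`, `wf`, and `tuned` from the earlier sketches; the seed, split proportion, and metric are assumptions, and in this route you would build the cross-validation folds from the training portion rather than the full data):

```r
library(tidymodels)

set.seed(363)
data_split <- initial_split(df, prop = 3/4)  # 75/25 split is an assumption

# Finalize the workflow with the number of trees chosen by cross-validation,
# then fit on the training portion and evaluate on the held-out test portion.
final_wf  <- finalize_workflow(wf, select_best(tuned, metric = "rmse"))
final_fit <- last_fit(final_wf, data_split)

collect_metrics(final_fit)  # these are the test-set statistics you would report
```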