datalass1 / fastai

This repo contains code and notes covered during the fastai course.

ML1 - Random Forest Deep Dive Lesson 2 #16

Open datalass1 opened 5 years ago

datalass1 commented 5 years ago

Learning about metrics, loss functions, and (perhaps the most important machine learning concept) overfitting. We discuss using validation and test sets to help us measure overfitting.

Then we’ll learn how random forests work - first, by looking at the individual trees that make them up, then by learning about “bagging”, the simple trick that lets a random forest be much more accurate than any individual tree.

Next up, we look at some helpful tricks that random forests support for making them faster and more accurate.
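The bagging idea described above can be sketched directly: train several decision trees, each on a bootstrap sample (rows drawn with replacement), then average their predictions. This is a minimal illustration using scikit-learn's `DecisionTreeRegressor` on synthetic data, not the course's exact code.

```python
# Minimal bagging sketch: each tree sees a different bootstrap sample,
# and the ensemble prediction is the average over all trees.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(42)
X = rng.uniform(-3, 3, size=(200, 1))            # synthetic feature
y = np.sin(X[:, 0]) + rng.normal(0, 0.3, 200)    # noisy target

n_trees = 10
trees = []
for _ in range(n_trees):
    idx = rng.integers(0, len(X), size=len(X))   # bootstrap: sample rows with replacement
    trees.append(DecisionTreeRegressor().fit(X[idx], y[idx]))

# Averaging many decorrelated, overfit trees reduces variance,
# which is why the forest beats any individual tree.
preds = np.mean([t.predict(X) for t in trees], axis=0)
```

Each tree on its own overfits its bootstrap sample; because the trees' errors are only weakly correlated, averaging cancels much of that noise.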


Blue Book for Bulldozers Kaggle dataset

14 mins: R^2 is the default score for a random forest regressor: `score` returns the coefficient of determination R^2 of the prediction. In statistics, the coefficient of determination, denoted R^2 or r^2 and pronounced "R squared", is the proportion of the variance in the dependent variable that is predictable from the independent variable(s) (source: Coefficient_of_determination). R^2 can be any value up to 1, with 1 being a perfect score; an R^2 of 0 means the model is no better than predicting the mean, and an R^2 below 0 means the model is worse than predicting the mean.
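These three cases (perfect model, mean-predicting model, worse-than-mean model) can be checked with `sklearn.metrics.r2_score`; the values below are for this small illustrative array, not course data.

```python
# R^2 = 1 - SS_res / SS_tot: 1 is perfect, 0 matches predicting the mean,
# and negative values mean the model is worse than predicting the mean.
import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([3.0, 5.0, 7.0, 9.0])

r2_perfect = r2_score(y_true, y_true)                            # -> 1.0
r2_mean = r2_score(y_true, np.full_like(y_true, y_true.mean()))  # -> 0.0
r2_bad = r2_score(y_true, y_true[::-1])                          # -> -3.0
```

Reversing the targets gives residuals four times the total variance here, hence R^2 = 1 - 80/20 = -3.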

A great github resource: https://github.com/cedrickchee/knowledge/blob/master/courses/fast.ai/machine-learning/2017-edition/lesson-2-random-forest-deep-dive.md