cs109 / a-2017

Public Repository for cs109a, 2017 edition
http://cs109.github.io/a-2017

Simpler models should be selected in Forward/Backward Selection in Lecture_6_Notebook.ipynb #7

Open covuworie opened 6 years ago

covuworie commented 6 years ago

I've spotted two small bugs in Lecture_6_Notebook.ipynb, in the Forward Selection and Backward Selection code. In each case there are 3 models whose feature sets have exactly the same value of the selection criterion: R squared for Forward Selection and AIC for Backward Selection.

In both cases, the model with the largest number of features is selected. In accordance with Occam's razor, we should instead favor the simplest model and select the one with the smallest number of features.

  1. Forward Selection code should read:
     best_predictor_set = sorted(predictors, key=lambda t: t[1], reverse=True)[0]
  2. Backward Selection code should read:
     best_predictor_set = sorted(predictors, key=lambda t: t[1], reverse=True)[-1]
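Both one-line fixes rely on Python's sorted() being stable (also with reverse=True), so which of the tied entries ends up at [0] or [-1] depends on the order in which the candidates were appended. A minimal sketch of a more explicit alternative follows. It assumes each entry in predictors is a (feature_list, score) tuple, as the t[1] indexing suggests; the helper names and the sample data are made up for illustration and are not taken from the notebook.

```python
# Hypothetical candidates: (feature_list, score) tuples, mirroring the
# t[1] indexing used in the notebook. The values are invented.
forward_candidates = [
    (["x1"], 0.90),
    (["x1", "x2"], 0.90),
    (["x1", "x2", "x3"], 0.90),
    (["x4"], 0.75),
]
backward_candidates = [
    (["x1", "x2", "x3"], 100.0),
    (["x1", "x2"], 100.0),
    (["x1"], 100.0),
    (["x1", "x4"], 120.0),
]


def best_by_r_squared(candidates):
    """Highest R squared wins; ties go to the set with the fewest features."""
    return min(candidates, key=lambda t: (-t[1], len(t[0])))


def best_by_aic(candidates):
    """Lowest AIC wins; ties go to the set with the fewest features."""
    return min(candidates, key=lambda t: (t[1], len(t[0])))


print(best_by_r_squared(forward_candidates))
print(best_by_aic(backward_candidates))
```

Putting the feature count into the sort key makes the preference for the simpler model independent of the order in which the candidate sets were generated.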

PS: I would have submitted a pull request, but wasn't sure you would want it as the output in the notebook would change.