Open DSP137 opened 9 years ago
Regarding your first question: yes, you are right that underfitting brings more bias but less variance, while overfitting brings less bias but more variance. Remember that one changes at the expense of the other because of the bias-variance tradeoff. Regarding the k-1 and k+1 blocks, I believe any k less than k* can be used to illustrate underfitting, while any k greater than k* can be used to illustrate overfitting. k-1 and k+1 are just the simplest cases of this.
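If it helps to see the tradeoff numerically, here is a minimal simulation sketch. It is not the model from the talk: it assumes a toy setting where each observation's mean depends only on its block, the underfit case merges two true blocks (k*-1), and the overfit case splits one true block in two (k*+1). The names `fit_block_means` and `bias_variance`, the block sizes, the means, and the noise level are all made up for illustration.

```python
import numpy as np


def fit_block_means(y, labels):
    """Estimate each observation's mean by the average of its block."""
    fitted = np.empty_like(y, dtype=float)
    for block in np.unique(labels):
        mask = labels == block
        fitted[mask] = y[mask].mean()
    return fitted


def bias_variance(labels, true_mean, sigma=1.0, n_rep=2000, seed=0):
    """Monte Carlo estimate of average squared bias and average variance
    of the block-mean fit under a given (possibly wrong) block assignment."""
    rng = np.random.default_rng(seed)
    fits = np.empty((n_rep, true_mean.size))
    for r in range(n_rep):
        # Fresh noisy data around the same true means on every repetition.
        y = true_mean + sigma * rng.standard_normal(true_mean.size)
        fits[r] = fit_block_means(y, labels)
    bias_sq = np.mean((fits.mean(axis=0) - true_mean) ** 2)
    variance = np.mean(fits.var(axis=0))
    return bias_sq, variance


# Hypothetical toy truth: k* = 3 blocks of 20 observations, means 0, 2, 4.
true_labels = np.repeat([0, 1, 2], 20)
true_mean = np.array([0.0, 2.0, 4.0])[true_labels]

# Underfit (k* - 1 blocks): merge the last two true blocks into one.
under_labels = np.where(true_labels == 2, 1, true_labels)

# Overfit (k* + 1 blocks): split the last true block into two halves.
over_labels = true_labels.copy()
over_labels[50:] = 3

results = {
    "underfit": bias_variance(under_labels, true_mean),
    "correct": bias_variance(true_labels, true_mean),
    "overfit": bias_variance(over_labels, true_mean),
}

for name, (bias_sq, variance) in results.items():
    print(f"{name:9s} bias^2 = {bias_sq:.4f}   variance = {variance:.4f}")
```

In this toy setup the merged blocks share one mean that matches neither true value, which is where the bias comes from, while the split blocks stay unbiased but each mean is averaged over fewer observations, which is where the extra variance comes from.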
So the true parameter k* is the true number of clusters, and we say the model is underfit if we use k-1 blocks and overfit if we use k+1 blocks. Underfitting introduces more bias because we lose valuable information, while overfitting introduces less bias (and more variance?) by splitting up the nodes more than necessary. Is this correct? Also, we talked about oracle risk and empirical risk, both of which require k*; do we actually know this value? If so, is it the minimizer we got on the graph at the beginning of the talk (the one that plotted the number of clusters against the AICc)?