UBC-DSCI / introduction-to-datascience

Open Source Textbook for DSCI100: Introduction to Data Science in R
https://datasciencebook.ca/

Matching narrative to the plot in Section 7.7 #440

Closed GloriaWYY closed 1 year ago

GloriaWYY commented 2 years ago

In Section 7.7, "Underfitting and overfitting", the textbook says to fit a KNN regression model with neighbors=932 (the size of the entire dataset). However, in the code, the model is fit on the training split only (size=699). Actually using neighbors=932 would cause an error, since neighbors cannot be larger than the number of observations being fit. (The code avoids this error because it uses an if/else statement and does not actually fit neighbors=932; instead, it takes the mean of all training samples, which is equivalent to using neighbors=699, where 699 is the training sample size.)
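
To illustrate the equivalence described above, here is a minimal sketch in plain Python. It is not the textbook's R code: the `knn_regress` helper, the toy data, and the choice to raise `ValueError` when `k` exceeds the training size are all assumptions made for illustration. It only shows that, once `k` equals the number of training samples, the KNN regression prediction is the training mean for every query point.

```python
from statistics import mean


def knn_regress(train_x, train_y, query, k):
    """Predict y at `query` as the mean of the k nearest training responses.

    Hypothetical helper for illustration only; a real library may handle
    an oversized k differently than raising ValueError.
    """
    if k > len(train_x):
        raise ValueError("k cannot exceed the number of training samples")
    # Rank training points by distance to the query (1-D predictor).
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    return mean(train_y[i] for i in order[:k])


# Toy training split (made-up values, not the textbook's data).
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [10.0, 20.0, 30.0, 60.0]
n_train = len(train_x)

# With k equal to the training size, every query gets the same prediction:
# the mean of all training responses.
pred_low = knn_regress(train_x, train_y, query=0.0, k=n_train)
pred_high = knn_regress(train_x, train_y, query=100.0, k=n_train)
train_mean = mean(train_y)
```

In this sketch, asking for more neighbors than there are training samples fails, which mirrors the mismatch between the narrative (932) and the training split (699).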

trevorcampbell commented 1 year ago

I think this should have been closed by #452, but I need to double-check.

trevorcampbell commented 1 year ago

Yep, this should have been closed. Closing now.