The great ( 🐛 ) hunt: Ch8

[ ] We write: "finding coefficients b0 and b1 that parametrize (correspond to) the line of best fit". Wouldn't it be more accurate to change the parenthetical to "define" or maybe "describe"?
[ ] For Fig 8.4 I have the same comment as in the previous chapter: the solid black error distances dominate the figure visually. It would be worth trying to make them dashed or dotted, or changing their color to gray.
[ ] A general comment on most chapters is that we don't say which color corresponds to which element in the figures. We usually just say "lines", "dots", etc. instead of e.g. "black vertical lines", "blue dots". Not sure whether this is needed.
[ ] In many figure legends we say just "size" instead of "house size", but we say "sale price" instead of just "price". These should probably be consistent.
[ ] "Our coefficients are (intercept) 15642 and (slope) 137." Maybe format with a thousands separator: 15,642 (also in the sentence below this one).
[ ] "In these cases the prediction model from a simple linear regression will underfit (have high bias)". I don't think we have mentioned bias or variance previously, so maybe it's unnecessary to introduce it here?
[ ] We call it "multivariable" linear regression, but isn't it more common to use just "multiple" linear regression? "Multivariable" sounds closer to "multivariate" which might be confusing since they are not the same thing.
[ ] The link to Fig 8.7 jumps to the bottom of the figure instead of the top.
[ ] "But if we have already decided on a small number (e.g., 2 or 3) of tuned candidate models and we want to make a final comparison, we can do so by comparing the prediction error of the methods on the test data." Hmm, is this really sound advice? In this case we would be using the test data to make a choice of models, which we shouldn't do, even if we just have 2-3 models, right?
[ ] Should we fix the y-range across figs 8.11 and 8.12 to make it easier to compare and see that the y-values are indeed the same?
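
Regarding the thousands-separator item above: a minimal sketch, assuming the coefficients are interpolated into the text from Python values. The variable names are hypothetical, and the numbers are the ones quoted in the chapter sentence, not recomputed from the model.

```python
# Hypothetical names; the values are the ones quoted in the chapter text.
intercept = 15642
slope = 137

# The "," option in the format spec inserts thousands separators.
intercept_text = f"{intercept:,}"
slope_text = f"{slope:,}"

print(f"Our coefficients are (intercept) {intercept_text} and (slope) {slope_text}.")
```

If the fitted values are floats, something like `f"{value:,.0f}"` should round to a whole number and add the separator in one step.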

UBC-DSCI / introduction-to-datascience-python, issue #328