asadoughi / stat-learning

Notes and exercise attempts for "An Introduction to Statistical Learning"
http://asadoughi.github.io/stat-learning

Question 9e (applied.html) Chapter 03 #67

Open faridcher opened 8 years ago

faridcher commented 8 years ago

Q: Use the * and : symbols to fit linear regression models with interaction effects. Do any interactions appear to be statistically significant?

Your answer: From the correlation matrix, I obtained the two most highly correlated pairs and used them in picking my interaction effects. From the p-values, we can see that the interaction between displacement and weight is statistically significant, while the interaction between cylinders and displacement is not.

Interaction is not related to correlation. An interaction describes how the effect of one predictor on the response changes with the level of another predictor, which is different from correlation between the two predictors. There are cases where there is no correlation but there is an interaction. For example, in the Advertising dataset:

a = read.csv("data/Advertising.csv")
pairs(a)  # scatterplot matrix of every pair of variables

There is no strong correlation between TV and Radio, but there is an interaction!
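To make the point without depending on the Advertising file, here is a minimal sketch on simulated data. Everything in it (the names x1, x2, y, the coefficients, the sample size) is made up for illustration and is not taken from the book or the repo. The two predictors are drawn independently, so they are uncorrelated by construction, yet the response is generated with an x1:x2 term.

```r
# Simulated example: uncorrelated predictors with a real interaction.
set.seed(1)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)  # drawn independently of x1, so no correlation by design

# The response includes a product term, i.e. the effect of x1 depends on x2
y <- 1 + 2 * x1 + 3 * x2 + 1.5 * x1 * x2 + rnorm(n)

r   <- cor(x1, x2)        # sample correlation between the predictors
fit <- lm(y ~ x1 * x2)    # x1 * x2 expands to x1 + x2 + x1:x2

# p-value of the interaction coefficient
p_int <- summary(fit)$coefficients["x1:x2", "Pr(>|t|)"]

print(r)
print(p_int)
```

The correlation between x1 and x2 should come out close to zero while the x1:x2 term is clearly significant, which is why the correlation matrix is not a good guide for choosing interaction terms.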

In addition, a better way to evaluate interaction effects is to use .*. in the lm formula. It includes all the predictors plus every pairwise (two-way) interaction between them:

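library(ISLR)  # assumption: Auto comes from the ISLR package; skip if it is already loaded
sset = subset(Auto, select=-name)  # drop the non-numeric name column
fit = lm(mpg ~ .*., data=sset)  # all main effects plus all pairwise interactions
summary(fit)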
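The summary of such a fit is long, so it can help to pull out only the interaction rows and sort them by p-value. The sketch below uses the built-in mtcars data as a stand-in so that it runs without the ISLR package; the choice of columns is arbitrary and only for illustration. The same steps should apply to the Auto fit above.

```r
# Stand-in data: a few numeric columns from the built-in mtcars data set
d <- mtcars[, c("mpg", "cyl", "disp", "hp", "wt")]

# All main effects plus all pairwise interactions
fit <- lm(mpg ~ .*., data = d)

# Coefficient table: estimate, std. error, t value, p-value
coefs <- summary(fit)$coefficients

# Interaction terms are the rows whose names contain ":"
inter <- coefs[grepl(":", rownames(coefs)), , drop = FALSE]

# Sort the interaction terms by p-value, smallest first
inter <- inter[order(inter[, "Pr(>|t|)"]), , drop = FALSE]
print(inter)
```

Keep in mind that with many terms tested at once, a few small p-values can appear by chance, so this table is a starting point rather than a final answer.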

Note that the decision to include any interaction term in the final model is under the topic of model selection.
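As one possible way to approach that decision, a model with and without a given interaction can be compared directly. This is only a sketch: it uses the built-in mtcars data as a stand-in, and the disp and wt pair is chosen arbitrarily for illustration. A partial F-test via anova() and AIC are two common criteria, not the only ones.

```r
# Stand-in data from the built-in mtcars data set
d <- mtcars[, c("mpg", "disp", "wt")]

fit_main <- lm(mpg ~ disp + wt, data = d)  # main effects only
fit_int  <- lm(mpg ~ disp * wt, data = d)  # adds the disp:wt interaction

# Partial F-test: does adding the interaction significantly reduce the RSS?
cmp     <- anova(fit_main, fit_int)
p_value <- cmp[2, "Pr(>F)"]

# AIC for both models (lower is better)
aic <- AIC(fit_main, fit_int)

print(cmp)
print(aic)
```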

Thanks