oldoc63 / learningDS

Learning DS with Codecademy and Books

Interpreting a Regression Model #470

Open oldoc63 opened 1 year ago

oldoc63 commented 1 year ago

Let's again inspect the output of a regression that predicts weight from height. The regression line looks something like this:

Image

oldoc63 commented 1 year ago

Note that the units of the intercept and slope of a regression line come from the units of the original variables: the intercept of this line is measured in kg (the units of weight), and the slope is measured in kg/cm (units of weight per unit of height). To make sense of the intercept (which we calculated previously as -21.67 kg), let's zoom out on this plot:

Image

oldoc63 commented 1 year ago

We see that the intercept is the predicted value of the outcome variable (weight) when the predictor variable (height) is equal to zero. In this case, the intercept says that a person who is 0 cm tall is expected to weigh about -21.67 kg. This is nonsensical: nobody can be 0 cm tall, and a weight cannot be negative.

However, in other cases, this value does make sense and is useful to interpret. For example, if we were predicting ice cream sales based on temperature, the intercept would be the expected sales when the temperature is 0 degrees.
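To see the intercept interpretation concretely, here is a minimal sketch that plugs a height of 0 cm into the fitted line. The coefficients are the ones quoted in this thread; the function and constant names are hypothetical and chosen only for illustration.

```python
# Coefficients quoted in this thread (intercept in kg, slope in kg/cm).
INTERCEPT_KG = -21.67
SLOPE_KG_PER_CM = 0.50


def predict_weight(height_cm: float) -> float:
    """Return the predicted weight (kg) for a height given in cm."""
    return INTERCEPT_KG + SLOPE_KG_PER_CM * height_cm


# At height 0 the slope term vanishes, so the prediction is the intercept.
weight_at_zero = predict_weight(0)
print(weight_at_zero)
```

The prediction at zero is just the intercept, which is why the intercept is only meaningful when a predictor value of zero is itself meaningful.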

To visualize the slope, let's zoom in on our plot:

Image

oldoc63 commented 1 year ago

Remember that slope can be thought of as $rise/run$: the ratio between the vertical and horizontal distances between any two points on the line. Therefore, the slope (which we previously calculated to be 0.50 kg/cm) is the expected difference in the outcome variable (weight) for a one-unit difference in the predictor variable (height). In other words, we expect that a one-centimeter difference in height is associated with 0.5 additional kilograms of weight.
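The rise/run reading of the slope can be checked numerically with a short sketch. It uses the coefficients quoted in this thread; the two heights (170 cm and 171 cm) and the helper names are hypothetical, picked only to illustrate a one-unit difference.

```python
import math

# Coefficients quoted in this thread (intercept in kg, slope in kg/cm).
INTERCEPT_KG = -21.67
SLOPE_KG_PER_CM = 0.50


def predict_weight(height_cm: float) -> float:
    """Return the predicted weight (kg) for a height given in cm."""
    return INTERCEPT_KG + SLOPE_KG_PER_CM * height_cm


# Two illustrative heights that differ by exactly one unit (1 cm).
rise = predict_weight(171) - predict_weight(170)  # vertical distance, kg
run = 171 - 170                                   # horizontal distance, cm

# rise / run recovers the slope, up to floating-point rounding.
print(math.isclose(rise / run, SLOPE_KG_PER_CM))
```

Any other pair of heights would give the same ratio, because the line has a constant slope.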

Note that the slope gives us two pieces of information: the magnitude AND the direction of the relationship between the $x$ and $y$ variables. For example, suppose we had instead fit a regression of weight with minutes of exercise per day as the predictor, and calculated a slope of $-0.1$. We would interpret this to mean that people who exercise for one additional minute per day are expected to weigh 0.1 kg LESS.
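As a worked version of that arithmetic: the expected difference in the outcome is the slope times the difference in the predictor. The symbols $b_1$ (slope) and $\hat{y}$ (predicted weight) are notation assumed here, and the 10-minute difference is a hypothetical value used only for illustration.

$$\Delta \hat{y} = b_1 \, \Delta x = (-0.1\ \text{kg/min}) \times (10\ \text{min}) = -1\ \text{kg}$$

The size of the result (1 kg) is the magnitude of the expected difference, and the negative sign is the direction: more exercise is associated with lower weight.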

oldoc63 commented 1 year ago
  1. The intercept for the OLS regression model predicting score based on hours_studied is 43. That means a student who studied for 0 hours is expected to score a 43 on the test.
oldoc63 commented 1 year ago
  1. The slope for the OLS regression model predicting score based on hours_studied is $9.8$. That means for every additional hour of studying, students are expected to score 9.8 points higher on the test.
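Both statements about the test-score model can be checked with a small sketch. The intercept (43) and slope (9.8) are the values given above; the function name and the example of 2 hours are hypothetical, used only for illustration.

```python
import math

# Coefficients stated above for score ~ hours_studied.
INTERCEPT_POINTS = 43.0
SLOPE_POINTS_PER_HOUR = 9.8


def predict_score(hours_studied: float) -> float:
    """Return the predicted test score for a number of hours studied."""
    return INTERCEPT_POINTS + SLOPE_POINTS_PER_HOUR * hours_studied


# Intercept: the expected score for a student who studied 0 hours.
score_at_zero = predict_score(0)

# Slope: the expected gain from one additional hour (here, from 2 to 3 hours).
gain_per_hour = predict_score(3) - predict_score(2)

print(score_at_zero)
print(math.isclose(gain_per_hour, SLOPE_POINTS_PER_HOUR))
```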