Fitting a linear regression model in Python

oldoc63 commented 1 year ago

There are a number of Python libraries that can be used to fit a linear regression, but in this course we will use the OLS.from_formula() function from statsmodels.api because it uses simple syntax and provides comprehensive model summaries.

Suppose we have a dataset named body_measurements with columns height and weight. If we want to fit a model that predicts weight based on height, we can create the model as follows:

import statsmodels.api as sm

model = sm.OLS.from_formula('weight ~ height', data=body_measurements)

We used the formula 'weight ~ height' because we want to predict weight (it is the outcome variable) using height as a predictor. Then, we can fit the model using .fit():

results = model.fit()

Finally, we can inspect a summary of the results using print(results.summary()). The full summary table is useful because it contains other important diagnostic information, but for now we'll only look at the coefficients using results.params:

print(results.params)

Intercept   -21.67
height        0.50
dtype: float64

This tells us that the best-fit intercept is -21.67 and the best-fit slope is 0.50, so the fitted line is weight = -21.67 + 0.50 * height.
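To make the intercept and slope concrete, here is a minimal sketch that plugs them into the equation of a line to get a predicted weight. The function name predict_weight and the example height of 160 are hypothetical and chosen only for illustration; the two coefficients are the ones reported above.

```python
def predict_weight(height, intercept=-21.67, slope=0.50):
    """Predicted weight from the fitted line: intercept + slope * height."""
    return intercept + slope * height


# A height of 160 gives -21.67 + 0.50 * 160, which is about 58.33.
print(round(predict_weight(160), 2))

# Each one-unit increase in height raises the prediction by the slope (0.50).
print(round(predict_weight(161) - predict_weight(160), 2))
```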

oldoc63 commented 1 year ago

Using the students dataset that has been loaded in script.py, create a linear regression model that predicts student score using hours_studied as a predictor, and save the result as a variable named model.

oldoc63 commented 1 year ago

Fit the model using the .fit() method on model, and save the fitted model as results.

oldoc63 / learningDS

Fitting a linear regression model in Python #468