tegge-classroom / STAT2984-2018

STAT 2984: Statistical Programming I - Spring 2018

Line of best fit #32

Open bdonovan7 opened 6 years ago

bdonovan7 commented 6 years ago

Does anyone know how to make a line of best fit over a scatter plot?

matter698no2 commented 6 years ago

It took a little bit of googling, but I think I got something working. It should look like the attached picture. I'm not really sure how to change colors yet, but it's something.

You'll need matplotlib and numpy, so import those however you want and get your data in the format you need. So everything should look a bit like this:

import matplotlib.pyplot as plt
import numpy as np

#use loops to put the data into lists
pressure = [......]
windspeed = [......]

Numpy has polyfit for making a line of best fit. It returns an array of coefficients rather than y values, so you can't pass it straight into the plt.plot() function. To turn the coefficients into something you can evaluate, use another numpy function, poly1d.

best_fit = np.polyfit(pressure, windspeed, 1) 
fit_func = np.poly1d(best_fit)

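polyfit takes the x values (in this case, pressure), the y values (wind speed), and the degree of the polynomial (1 here, for a straight line). It returns an array of coefficients, highest power first, so for degree 1 that's [slope, intercept].

poly1d takes the array that polyfit spits out and turns it into a function: give it x values and it returns the matching y values on the line, which are easy to put on a graph.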
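To see what polyfit and poly1d hand back, here is a minimal sketch. The data is made up (it lies exactly on y = 2x + 1) and the names x, y, and coeffs are only for illustration, not from the assignment.

```python
import numpy as np

# Made-up data that lies exactly on y = 2x + 1, so the fit is easy to check
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2 * x + 1

# Degree 1 gives two coefficients, highest power first: [slope, intercept]
coeffs = np.polyfit(x, y, 1)
slope, intercept = coeffs

# poly1d wraps the coefficients in a callable: fit_func(x) = slope * x + intercept
fit_func = np.poly1d(coeffs)

print(coeffs)         # slope and intercept, up to floating-point rounding
print(fit_func(4.0))  # the fitted line evaluated at x = 4
```

Because fit_func accepts a whole list or array of x values, fit_func(pressure) gives one y value per pressure reading.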

So now you've got the regression line, you just gotta put it on a graph.

plt.plot(pressure, windspeed, '.', pressure, fit_func(pressure), '-')
plt.show()

That'll make the graph and overlay a regression line. "pressure" and "windspeed" are your x and y values, and the '.' makes the points display as dots. The next three arguments describe the line: "pressure" gives the x values again, fit_func(pressure) turns them into y values like I said above, and '-' draws them as one solid connected line.

It should look like the picture at the top.

ategge commented 6 years ago

Do you have the scatter plot, and you know the formula of the line of best fit?

If so, you can estimate two points on the line of best fit, plot those two points, and connect the two points with a line.
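The two-point idea can be sketched like this, assuming the slope and intercept come from np.polyfit. The helper name line_endpoints and the data values are made up for illustration.

```python
import numpy as np

def line_endpoints(x, y):
    """Return the two end points of the least-squares line over the range of x."""
    slope, intercept = np.polyfit(x, y, 1)
    # Evaluate the line only at the smallest and largest x value
    x_ends = np.array([np.min(x), np.max(x)])
    y_ends = slope * x_ends + intercept
    return x_ends, y_ends

# Made-up readings that lie exactly on windspeed = -2 * pressure + 2060
pressure = [990.0, 1000.0, 1010.0]
windspeed = [80.0, 60.0, 40.0]

x_ends, y_ends = line_endpoints(pressure, windspeed)
print(x_ends, y_ends)
```

Passing x_ends and y_ends to plt.plot with the '-' format then draws the line segment on top of the scatter plot.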

bdonovan7 commented 6 years ago

Thanks, that worked. To change the color, just add c="green" (or whatever color you want) to the end of the plt.plot call. It colors both the points and the line:

plt.plot(pressure_list1, wind_list, '.', pressure_list1, fit_func(pressure_list1), '-', c="green")

mbecker3 commented 6 years ago

I had the same code as Matt, but for some reason only one point shows up on my graph.

maxh95 commented 6 years ago

If you want the line of best fit to be a different color from your points, you can do:

points, line = plt.plot(pressure, windspeed, '.', pressure, fit_func(pressure), '-', c='red')
line.set_color('blue')

plt.plot returns one line object per x/y pair, so unpack them and recolor just the fit line. That way your points and line are different colors.
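Another way to get two colors is to draw the points and the line in separate plot calls, each with its own color. This is a rough sketch with made-up data. The Agg backend and the best_fit.png file name are assumptions chosen so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np

# Made-up data, only for illustration
pressure = np.array([990.0, 995.0, 1000.0, 1005.0, 1010.0])
windspeed = np.array([80.0, 70.0, 62.0, 49.0, 40.0])

fit_func = np.poly1d(np.polyfit(pressure, windspeed, 1))

fig, ax = plt.subplots()
# Each plot call returns a list with one line object, so unpack the single item
(points,) = ax.plot(pressure, windspeed, '.', color='red')
(line,) = ax.plot(pressure, fit_func(pressure), '-', color='blue')

fig.savefig("best_fit.png")  # hypothetical output file name
```

In an interactive session you can drop the matplotlib.use line and call plt.show() instead of saving to a file.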