oldoc63 / learningDS

Learning DS with Codecademy and Books

Covariance #405

Open oldoc63 opened 1 year ago

oldoc63 commented 1 year ago

Beyond visualizing relationships, we can also use summary statistics to quantify the strength of certain associations. Covariance is a summary statistic that describes the direction and strength of a linear relationship between two variables. Its magnitude depends on the units of the variables, so it is hard to compare across differently scaled data. A linear relationship is one where a straight line would best describe the pattern of points in a scatter plot.

Covariance can range from negative infinity to positive infinity. A positive covariance indicates that a larger value of one variable is associated with a larger value of the other. A negative covariance indicates that a larger value of one variable is associated with a smaller value of the other. A covariance of 0 indicates no linear relationship.
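To make the sign rules concrete, here is a minimal sketch that computes the sample covariance by hand. The helper name sample_covariance and the small arrays are made up for illustration and are not part of the original exercise. The last case shows that a covariance of 0 only rules out a linear relationship: the points follow a clear curve.

```python
import numpy as np

def sample_covariance(x, y):
    """Sample covariance: sum of products of deviations, divided by n - 1."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.sum((x - x.mean()) * (y - y.mean())) / (len(x) - 1)

# Illustrative data only (not from the original dataset)
x = [1, 2, 3, 4, 5]
up = [2, 4, 6, 8, 10]     # rises as x rises
down = [10, 8, 6, 4, 2]   # falls as x rises
curve = [4, 1, 0, 1, 4]   # (x - 3) ** 2: related to x, but not linearly

cov_up = sample_covariance(x, up)        # positive
cov_down = sample_covariance(x, down)    # negative
cov_curve = sample_covariance(x, curve)  # zero

print(cov_up, cov_down, cov_curve)
```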

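To calculate covariance, we can use the cov() function from NumPy, which produces a covariance matrix for two or more variables. For two variables this is a 2x2 matrix: the diagonal entries are the variances of each variable, and the covariance of the pair appears twice, in the two off-diagonal positions, because the matrix is symmetric.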
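As a rough illustration of the matrix layout, the sketch below calls np.cov() on two small arrays. The arrays x and y are invented for this example.

```python
import numpy as np

# Illustrative data only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 6.0, 8.0, 10.0])

# np.cov treats each argument as one variable and returns a 2x2 matrix:
# [[var(x),    cov(x, y)],
#  [cov(x, y), var(y)   ]]
cov_mat = np.cov(x, y)
print(cov_mat)

# The covariance of x and y sits in both off-diagonal positions
cov_xy = cov_mat[0, 1]
```

Either off-diagonal index, [0, 1] or [1, 0], gives the same covariance.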

oldoc63 commented 1 year ago

Use the cov() function from NumPy to calculate the covariance matrix for the sqfeet variable and the beds variable. Save the covariance matrix as cov_mat_sqfeet_beds. Print out the value stored in the variable cov_mat_sqfeet_beds. Look at the covariance matrix and find the covariance of sqfeet and beds.
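One possible way to work through this exercise is sketched below. The exercise's actual dataset is not shown in this thread, so the sqfeet and beds arrays here are made-up stand-ins; with the real data they would most likely be two columns of the housing dataset. The name cov_sqfeet_beds is also just an illustrative choice.

```python
import numpy as np

# Stand-in values, NOT the real exercise data (assumption for illustration)
sqfeet = np.array([600, 850, 1100, 1400, 2000])
beds = np.array([1, 2, 2, 3, 4])

# Covariance matrix for the two variables
cov_mat_sqfeet_beds = np.cov(sqfeet, beds)
print(cov_mat_sqfeet_beds)

# The covariance of sqfeet and beds is the off-diagonal entry
cov_sqfeet_beds = cov_mat_sqfeet_beds[0, 1]
print(cov_sqfeet_beds)
```

With these stand-in numbers the covariance is positive, which matches the intuition that larger homes tend to have more bedrooms. The value from the real dataset will differ.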