Open knageswara78 opened 5 years ago
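import numpy as np
threshold = 0.9
# Absolute pairwise correlations between the columns of df.
corr_matrix = df.corr().abs()
# Keep only the upper triangle (k=1 excludes the diagonal) so each pair is counted once.
# np.bool was removed from NumPy; the built-in bool is used instead.
upper = corr_matrix.where(np.triu(np.ones(corr_matrix.shape), k=1).astype(bool))
# Columns correlated above the threshold with at least one earlier column.
to_drop = [column for column in upper.columns if any(upper[column] > threshold)]
print(to_drop)  # These variables are correlated.
# to_drop holds column names, so it is passed to drop() directly and the result is assigned back.
df = df.drop(columns=to_drop)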
Correlation among 2 variables.
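
For anyone who wants to try the approach above in isolation, here is a minimal self-contained sketch. The helper name `drop_highly_correlated` and the small `demo` frame are made up for illustration and are not part of the original post; the sketch also restricts the correlation to numeric columns, which is an assumption about the intended data.

```python
import numpy as np
import pandas as pd


def drop_highly_correlated(df, threshold=0.9):
    """Hypothetical helper: drop one column from each highly correlated pair."""
    # Correlate numeric columns only (assumption: non-numeric columns are kept as-is).
    corr_matrix = df.select_dtypes(include="number").corr().abs()
    # Boolean mask of the strict upper triangle, so each pair is inspected once.
    mask = np.triu(np.ones(corr_matrix.shape, dtype=bool), k=1)
    upper = corr_matrix.where(mask)
    # A column is dropped if it correlates above the threshold with any earlier column.
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=to_drop), to_drop


# Illustrative data: "b" is an exact linear function of "a", "c" is unrelated.
a = np.arange(10.0)
demo = pd.DataFrame({
    "a": a,
    "b": 2 * a + 1,
    "c": [3, 1, 4, 1, 5, 9, 2, 6, 5, 3],
})

reduced, dropped = drop_highly_correlated(demo, threshold=0.9)
print(dropped)
print(list(reduced.columns))
```

Because only the upper triangle is inspected, the later column of each correlated pair is the one removed, so the outcome depends on column order.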