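I ran into this situation: if my data has a feature with 100% missing values, or one just under the missing threshold (for example 98% missing), calling identify_collinear() flags more features as having a correlation magnitude greater than correlation_threshold.
I checked the result of pd.DataFrame.corr(), and there were high correlations between some features and the feature with 98% missing values. pd.DataFrame.corr() excludes missing values pairwise, so the correlation with that feature is computed on only the few rows where it is present, and that can easily come out close to 1 or -1. So when we call identify_all(), we end up removing more features than we should. I think we should first remove the features whose fraction of missing values is above the missing threshold, and only then identify collinear features. Maybe there are better strategies.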
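Here is a minimal sketch of what I mean, using plain pandas rather than the library itself. The column names, the helper collinear_pairs(), and the threshold values (0.6 for missing, 0.9 for correlation) are all made up for illustration and are not the library's implementation. It only shows that a mostly-missing column can look highly correlated with unrelated features, and that dropping it first avoids this.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 100

# Three independent random features (hypothetical names).
df = pd.DataFrame({
    "a": rng.normal(size=n),
    "b": rng.normal(size=n),
    "c": rng.normal(size=n),
})

# A feature with 98% missing values: only 2 of 100 rows are present.
sparse = np.full(n, np.nan)
sparse[:2] = [1.0, 2.0]
df["sparse"] = sparse


def collinear_pairs(frame, correlation_threshold):
    """Return column pairs whose absolute correlation exceeds the threshold.

    Hypothetical helper for illustration only.
    """
    corr = frame.corr().abs()
    # Keep only the upper triangle so each pair is reported once.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    upper = corr.where(mask)
    return [
        (row, col)
        for row in upper.index
        for col in upper.columns
        if upper.loc[row, col] > correlation_threshold
    ]


# Without filtering: every feature pairs up with "sparse", because the
# correlation is computed on just the 2 rows where "sparse" is present.
before = collinear_pairs(df, correlation_threshold=0.9)

# Drop features above the missing threshold first, then check collinearity.
missing_fraction = df.isnull().mean()
keep = missing_fraction[missing_fraction <= 0.6].index
after = collinear_pairs(df[keep], correlation_threshold=0.9)

print("before filtering:", before)
print("after filtering: ", after)
```

With only two non-missing rows, any other numeric feature is perfectly correlated (positively or negatively) with the sparse one, so the order of the two steps matters.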