Clustering always returns a two-cluster plot with one giant cluster and one small one. This may be caused by outliers in our data, which end up grouped together in the small cluster.
We should remove outliers iteratively, re-running PCA after each pass, until pca.explained_variance_ratio_ looks reasonable.
This should help the clustering process.
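
A minimal sketch of what the iterative removal could look like, assuming scikit-learn's `PCA` and a NumPy feature matrix. The function name `remove_outliers_iteratively`, the z-score rule, and the parameters `z_thresh`, `tol` and `max_iter` are hypothetical choices for illustration. "Looks reasonable" is interpreted here as "the explained variance ratios stop changing between passes", which is an assumption and not something the project has settled on.

```python
import numpy as np
from sklearn.decomposition import PCA


def remove_outliers_iteratively(X, n_components=2, z_thresh=3.0, tol=0.01, max_iter=10):
    """Repeatedly drop rows that lie far from the centre in PCA space.

    z_thresh, tol and max_iter are illustrative defaults, not tuned values.
    Returns the indices of the retained rows and the final explained variance ratios.
    """
    X = np.asarray(X, dtype=float)
    keep = np.arange(len(X))  # indices of rows still retained
    prev_ratio = None
    for _ in range(max_iter):
        pca = PCA(n_components=n_components)
        scores = pca.fit_transform(X[keep])
        ratio = pca.explained_variance_ratio_
        # Standardise each component and flag rows with any |z| above the threshold.
        z = np.abs(scores / scores.std(axis=0))
        outlier = (z > z_thresh).any(axis=1)
        # Assumed stopping rule: nothing flagged, or the ratios have stabilised.
        stable = prev_ratio is not None and np.abs(ratio - prev_ratio).max() < tol
        if not outlier.any() or stable:
            break
        keep = keep[~outlier]
        prev_ratio = ratio
    # Refit once so the returned ratios describe exactly the retained rows.
    final_ratio = PCA(n_components=n_components).fit(X[keep]).explained_variance_ratio_
    return keep, final_ratio
```

Clustering would then be run on `X[keep]` only. Returning indices instead of a filtered array makes it easy to inspect which rows were dropped before deciding whether the threshold is too aggressive.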