LillyanPan / AttritionAnalysis

Midterm Peer Review (my463) #5

Open PatrickYuanMingcan opened 6 years ago

PatrickYuanMingcan commented 6 years ago

This is a very good report, with clear ideas and informative charts. However, the correlation matrix shows that two of the features are highly correlated, so you should drop one of them before running the analysis. In the Model Selection section, the Random Forest is deep enough to reach 100% training accuracy, which is much higher than the test accuracy and suggests overfitting. I suggest decreasing the depth and selecting m again. Finally, PCA and LASSO are two good ways to reduce overfitting in regression: PCA reduces the dimensionality, and LASSO selects features. You could try them and compare the results with logistic regression.
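
To illustrate the first suggestion, here is a minimal sketch of how one feature from each highly correlated pair could be dropped. It is only an example under stated assumptions: the column names and values are made up (they are not taken from the report), the 0.9 cutoff is an arbitrary choice, and the helper names `pearson` and `drop_correlated` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / (spread_x * spread_y)


def drop_correlated(columns, threshold=0.9):
    """Keep a feature only if |r| with every already-kept feature is below threshold.

    `threshold` is an assumed cutoff; tune it for the actual data.
    """
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy data with hypothetical column names, for illustration only.
data = {
    "monthly_income": [2000, 3500, 5000, 6500, 8000],
    "job_level": [1, 2, 3, 4, 5],
    "distance_from_home": [3, 12, 1, 9, 5],
}

kept = drop_correlated(data)
print(kept)  # "job_level" is dropped because it tracks "monthly_income" exactly
```

Which feature of a correlated pair gets kept depends only on column order here; in practice you may prefer to keep the one that is easier to interpret.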