Closed Jen-uis closed 3 months ago
Completed Customer-Segmentation-Prediction using three methods: Logistic Regression, Random Forest Classifier, and Gradient Boosting Classifier.
For more information, check the latest updated `Project-Code.ipynb`.
The results are:

| Segment | LogReg_Segmentation | RFC_Segmentation | GBC_Segmentation |
|---|---|---|---|
| A | 713 | 629 | 718 |
| B | 244 | 568 | 472 |
| C | 783 | 638 | 609 |
| D | 887 | 792 | 828 |
The consistency across the three methods seems very low (segment B alone ranges from 244 to 568 customers), so I am looking for ways to test which method can be trusted.
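One way to put a number on that consistency is to compare the predictions customer by customer rather than by segment totals, since two methods can produce similar counts while still disagreeing on individual customers. Below is a minimal sketch, assuming each method's predictions are available as a list of labels in the same customer order. The helper `pairwise_agreement` and the five sample labels are made up for illustration and are not taken from the notebook.

```python
from itertools import combinations


def pairwise_agreement(predictions):
    """Return, for each pair of methods, the share of customers given the same label."""
    agreement = {}
    for (name_a, labels_a), (name_b, labels_b) in combinations(predictions.items(), 2):
        matches = sum(a == b for a, b in zip(labels_a, labels_b))
        agreement[(name_a, name_b)] = matches / len(labels_a)
    return agreement


# Made-up labels for five customers, for illustration only (assumption).
predictions = {
    "LogReg_Segmentation": ["A", "B", "C", "D", "A"],
    "RFC_Segmentation": ["A", "C", "C", "D", "B"],
    "GBC_Segmentation": ["A", "B", "C", "D", "B"],
}

agreement = pairwise_agreement(predictions)
for pair, share in agreement.items():
    print(pair, f"{share:.2f}")
```

A low agreement share only says the methods disagree with each other; it does not say which one is right, which is why a score against known labels is still needed.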
Update: I used cross-validation to test which method achieves a higher score in predicting the target variable. Results are:

| Logistic Regression CV | Random Forest Classifier CV | Gradient Boosting Classifier CV |
|---|---|---|
| 0.4954 | 0.4857 | 0.534 |
It seems that Gradient Boosting Classifier, with the highest cross-validation score of the three (0.534), gives the most reliable predictions for the segment of a new customer.
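For anyone wanting to reproduce this kind of comparison, here is a rough sketch using scikit-learn's `cross_val_score`. It is not the notebook's code. The synthetic data from `make_classification`, the 5-fold split, the accuracy metric, and the hyperparameters are all assumptions standing in for the real training features and settings, so the printed scores will not match the table above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the training features and the four segments (assumption).
X, y = make_classification(
    n_samples=300,
    n_features=8,
    n_informative=5,
    n_redundant=0,
    n_classes=4,
    n_clusters_per_class=1,
    random_state=0,
)

# Hyperparameters are illustrative, not the ones used in the notebook.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest Classifier": RandomForestClassifier(n_estimators=50, random_state=0),
    "Gradient Boosting Classifier": GradientBoostingClassifier(n_estimators=50, random_state=0),
}

# Mean accuracy over 5 folds for each model (fold count and metric are assumptions).
cv_scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}

for name, score in cv_scores.items():
    print(f"{name}: {score:.4f}")

best_model = max(cv_scores, key=cv_scores.get)
print("Highest mean CV score:", best_model)
```

Because every model is scored on the same folds of labelled training data, the means are directly comparable, which is what the unlabelled `test_data` cannot offer.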
done with final review
We used regression and regressors in the previous example. In this analysis, we are going to use classification to predict the customer segments listed in `train_data`. However, this approach cannot be given an accuracy score, as `test_data` does not provide any segments to verify the predictions against. We are going to create a new column in `test_data` to see which customers are categorized into which segments. Let's go:

Use of techniques: Logistic Regression / Random Forest Classifier / Gradient Boosting Classifier (a Support Vector Machine is also possible, but we will first verify how these three techniques perform)
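The plan described here (fit each classifier on `train_data`, then write its predictions into a new column of `test_data`) might look roughly like the sketch below. The tiny frames, the feature columns `Age` and `Spending`, and the target column name `Segmentation` are hypothetical placeholders, since the real columns are not shown here. Only the output column names follow the results reported in this thread.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression

# Tiny made-up frames; the real train_data / test_data columns are assumptions.
train_data = pd.DataFrame({
    "Age": [22, 35, 47, 58, 25, 39, 51, 63],
    "Spending": [1, 2, 3, 4, 1, 2, 3, 4],
    "Segmentation": ["A", "B", "C", "D", "A", "B", "C", "D"],
})
test_data = pd.DataFrame({"Age": [24, 60], "Spending": [1, 4]})

features = ["Age", "Spending"]

# One new prediction column per technique.
models = {
    "LogReg_Segmentation": LogisticRegression(max_iter=1000),
    "RFC_Segmentation": RandomForestClassifier(n_estimators=50, random_state=0),
    "GBC_Segmentation": GradientBoostingClassifier(n_estimators=50, random_state=0),
}

for column, model in models.items():
    model.fit(train_data[features], train_data["Segmentation"])
    test_data[column] = model.predict(test_data[features])

# Number of test customers assigned to each segment, per method.
counts = (
    test_data[list(models)]
    .apply(lambda col: col.value_counts())
    .fillna(0)
    .astype(int)
)
print(counts)
```

Since `test_data` has no true segments, the `counts` table can only show how the methods distribute customers, not how accurate any of them is.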