Open gtmx23 opened 12 months ago
Please provide more detailed feedback here on what was done particularly well, and what could be improved. It is especially important to elaborate on items that you were not able to check off in the list above. This is a great project! Here is some feedback:
You did a great job of explaining the importance of considering false positives vs false negatives when evaluating model performance in the context of the problem.
I think some extra background knowledge could be included. The README.md contains a lot of information about the data, and including it in the final report would give better context for your analysis.
I think a more concrete conclusion would be appropriate. The report ends with details about the model's performance; discussing the model in the context of the problem would make the report more cohesive and effective. For example, you could restate why false positives are low-stakes for your problem when presenting your conclusion.
You do a good job of explaining the models and the results. It would also be good to briefly explain in your report what you did to preprocess the data, or any other steps involved in your pipelines.
This was derived from the JOSE review checklist and the ROpenSci review checklist.
It is good to have a set of candidate models compete against each other. The presentation of results is also very detailed and comprehensive, with different plots showing the different models.
Good job noting and handling the class imbalance.
There is only a Creative Commons license, which covers fair use of the report. You need another license to cover the source code (e.g., the MIT license).
The instructions in your repository, specifically the console commands, contain a $ sign at the start of each command. If someone copied a command into the console using the copy button, it would throw an error. Removing the $ would be a quality-of-life improvement.
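To illustrate the issue (using a stand-in `echo` command rather than the repository's actual instructions): when the `$` prompt is included, the shell tries to run a command literally named `$` and fails; without the prompt prefix, the same line copies and runs cleanly.

```shell
# Copied verbatim with the prompt, the shell looks for a command named "$"
# and reports "$: command not found" (suppressed here so the script continues):
$ echo "setting up" 2>/dev/null || true

# Without the prompt prefix, the same command runs as intended:
echo "setting up"
```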
This is really well written and easy to follow along!
Submitting authors: @gtmx23 @riyaeliza123 @Owl64901 @charlesxch
Repository: https://github.com/UBC-MDS/Group_7_Project.git Report link: https://ubc-mds.github.io/Group_7_Project/bank_marketing_prediction.html Abstract/executive summary: In this project, we aimed to use customer information from a phone-call-based direct marketing campaign of a Portuguese banking institution to predict whether customers would subscribe to the product offered, a term deposit. We applied several classification models (k-NN, SVM, logistic regression, and random forest) to our dataset to find the model that best fit our data, eventually settling on the random forest model, which performed best among all the models tested, with an F-beta score (beta = 5) of 0.82 and an accuracy of 0.677 on the test data.
While this was the best-performing model of those tested, its accuracy still left much to be desired, which suggests that more data may be needed to accurately predict whether customers would subscribe to the term deposit. Future studies may also consider using more features, a different set of features more relevant to subscription behaviour, or feature engineering to derive features with more predictive value.
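As a quick illustration of the F-beta metric cited in the abstract (beta = 5 weights recall beta² = 25 times more heavily than precision, which suits a problem where false positives are low-stakes), here is a pure-Python sketch on toy labels — these are not the project's data, and the project itself presumably used a library implementation such as scikit-learn's `fbeta_score`:

```python
def fbeta(y_true, y_pred, beta):
    """F-beta score for binary labels: recall is weighted beta**2
    times as much as precision."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Toy labels for illustration only -- not the project's test data.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]

# Here precision = recall = 0.8, so the score is 0.8 for any beta.
print(round(fbeta(y_true, y_pred, beta=5), 3))  # -> 0.8
```

With a large beta, a model that misses few true subscribers (high recall) scores well even if it flags some non-subscribers, which matches the report's framing that false positives are cheap in this marketing context.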
Editor: @ttimbers Reviewer: Scout McKee, Rafe Chang, Koray Tecimer, Hongyang Zhang