UBC-MDS / data-analysis-review-2023

Submission: <Group 5: Bank Marketing Analysis> #26

Open nassimgha opened 12 months ago

nassimgha commented 12 months ago

Submitting authors: <zhang-shizhe> <Marcony1> <celestezhao> <nassimgha>

Repository:

Report link:

Abstract/executive summary:

In this analysis, we attempt to build a predictive model to determine whether a client will subscribe to a term deposit, using data from the direct marketing campaigns (phone calls) of a Portuguese banking institution.

After exploring several models (logistic regression, KNN, decision tree, naive Bayes), we selected logistic regression as our primary predictive tool. The final model performs fairly well when tested on an unseen dataset, achieving the highest AUC (Area Under the Curve), 0.899. This high AUC score indicates that the model can effectively differentiate between positive and negative outcomes. Notably, factors such as last contact duration, last contact month of the year, and clients' job types play a significant role in the classification decision.
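For readers less familiar with the metric, the AUC quoted above can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The following is a minimal sketch of that pairwise definition in plain Python. The function name `roc_auc` and the toy labels and scores are made up for illustration; they are not taken from the authors' code or from bank-full.csv.

```python
from itertools import product


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    correct = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return correct / (len(pos) * len(neg))


# Toy data, invented for this example (1 = subscribed, 0 = did not subscribe).
labels = [0, 0, 1, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70]
print(roc_auc(labels, scores))
```

In this toy case, 8 of the 9 positive/negative pairs are ordered correctly, so the AUC is 8/9 (about 0.889). A value of 0.5 would correspond to random ranking and 1.0 to perfect separation.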

The dataset used in this project is the Bank Marketing dataset created by S. Moro, P. Rita and P. Cortez at Iscte - University Institute of Lisbon. It is available through the UCI Machine Learning Repository. Among the four available datasets, we used bank-full.csv, which contains all examples and 17 inputs. Each row represents an individual client's data, including personal details (e.g., age, occupation, loan status), information about their response to the marketing campaign (e.g., outcome of the previous marketing campaign, number of contacts made during the current campaign), and the eventual subscription outcome for the term deposit.

Editor: @ttimbers

Reviewers: Beth Ou Yang, Michelle Hunn, Gretel Tan, Julia Everitt

beth-ouyang commented 12 months ago

Data analysis review checklist

Reviewer: @beth-ouyang

Conflict of interest

Code of Conduct

General checks

Documentation

Code quality

Reproducibility

Analysis report

Estimated hours spent reviewing: 1 hr

Review Comments:

Attribution

This was derived from the JOSE review checklist and the ROpenSci review checklist.

gtmx23 commented 12 months ago

Data analysis review checklist

Reviewer: @gtmx23

Conflict of interest

Code of Conduct

General checks

Documentation

Code quality

Reproducibility

Analysis report

Estimated hours spent reviewing: 1.5 h

Review Comments:

Minor Issues with Scripts

Attribution

This was derived from the JOSE review checklist and the ROpenSci review checklist.

mishelly-h commented 12 months ago

Data analysis review checklist

Reviewer: @mishelly-h

Conflict of interest

Code of Conduct

General checks

Documentation

Code quality

Reproducibility

Analysis report

Estimated hours spent reviewing: 2h

Review Comments:

Overall, I think you did a great job with the report, and I enjoyed reading your analysis. Here are some suggestions for how you could improve your report even further:

In summary, you have shown that you have conducted thorough research and discovered interesting findings.

Attribution

This was derived from the JOSE review checklist and the ROpenSci review checklist.

juliaeveritt commented 12 months ago

Data analysis review checklist

Reviewer: @juliaeveritt

Conflict of interest

Code of Conduct

General checks

Documentation

Code quality

Reproducibility

Analysis report

Estimated hours spent reviewing: 1

Review Comments:

Please provide more detailed feedback here on what was done particularly well, and what could be improved. It is especially important to elaborate on items that you were not able to check off in the list above.

Great work! Your report is nicely formatted and professional (I like the color scheme you chose for the plots :D ), and your analysis methods look great. I was able to reproduce your analysis with minimal errors.

Small issues I bumped into/suggestions for improvement:

Attribution

This was derived from the JOSE review checklist and the ROpenSci review checklist.