Open mozhao0331 opened 1 year ago
This was derived from the JOSE review checklist and the ROpenSci review checklist.
Overall, your project is quite comprehensive, with a clear question and detailed analysis. I really like your scripts where each step of the analysis and modeling is divided into different functions, making the code readable.
The usage section could be improved to ensure better reproducibility. First, the download command should have --file_type set to "csv". Second, the command for running the EDA is not the same as the example command in the script.
In the usage section, you might also want to put each step's command into its own code cell so that it is easy to copy and paste if one step fails to execute. You might also want to drop the square brackets around the optional argument: either leave the argument out or include it without brackets.
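To illustrate the point about --file_type and optional arguments, here is a minimal sketch. It uses Python's argparse, which may not be the parser your scripts use, and everything except the --file_type flag itself (the --url and --out_path options, the "xlsx" alternative, and the example values) is hypothetical.

```python
import argparse


def build_parser():
    # Hypothetical parser for a download script; only --file_type is
    # taken from the review, the other options are made up.
    parser = argparse.ArgumentParser(description="Download a data file (sketch).")
    parser.add_argument("--url", required=True, help="hypothetical source URL")
    parser.add_argument("--out_path", required=True, help="hypothetical output path")
    # An optional argument with a default, so the README command can
    # either omit it or state it explicitly without square brackets.
    parser.add_argument(
        "--file_type",
        choices=["csv", "xlsx"],  # assumed set of choices
        default="csv",
        help="format of the file to download",
    )
    return parser


# Parse an explicit argument list, as a README command would supply it.
args = build_parser().parse_args(
    ["--url", "https://example.com/data.csv",
     "--out_path", "data/raw/data.csv",
     "--file_type", "csv"]
)
print(args.file_type)
```

Writing the flag out in the README command, as in the last call, removes any doubt about which value the reader should pass.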
It would be better to add a conclusion section that briefly restates the research question and gives an overall comment on the models' performance (which you already did in Score Analysis). This would make it easier for readers to find the take-home message of your project.
One more thing you might consider discussing is the inclusion of the sex feature, especially in the context of your project. Do you think including this feature will introduce gender bias into your model?
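One possible way to start that discussion is to compare a metric such as recall across the groups defined by the sex feature. The sketch below is only an illustration: the function name, the group labels, and the toy data are all made up and are not results from your project.

```python
def recall_by_group(y_true, y_pred, groups):
    """Return {group: recall}, where recall = TP / (TP + FN) within each group."""
    counts = {}  # group -> [true positives, actual positives]
    for truth, pred, group in zip(y_true, y_pred, groups):
        tally = counts.setdefault(group, [0, 0])
        if truth == 1:
            tally[1] += 1
            if pred == 1:
                tally[0] += 1
    # A group with no actual positives has undefined recall; report NaN.
    return {g: (tp / pos if pos else float("nan")) for g, (tp, pos) in counts.items()}


# Toy, made-up labels purely to show the call (1 = default, 0 = no default).
y_true = [1, 1, 0, 1, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 0, 0]
sex = ["F", "F", "F", "F", "M", "M", "M", "M"]

group_recall = recall_by_group(y_true, y_pred, sex)
print(group_recall)
```

A large gap between the groups would be worth reporting, although a single metric on its own is not enough to conclude that the model is or is not biased.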
Your EDA report is well organized. It introduces the variables and the project's purpose very clearly through the introduction and the various plots.
The scripts in the src directory include the whole process of the analysis from the data download, EDA, and model training to the model summary. All the results can be located easily in the results directory.
The analysis pipeline is well organized, with various fitting methods, models, and validation scores. It would be better to add the pros and cons of each method you used (based on overfitting, CV scores, etc.) and to explain why you chose each one by relating your research question to the characteristics of the model.
In the Analysis Report, you clearly explained the main goal of your analysis (lowering the Type I and Type II errors) and gave a clear reason for your choice of scoring metric. The metric suits the trade-off between precision and recall in this real-life question.
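For readers who want to see how the two error types relate to precision and recall, here is a small worked sketch. It assumes F1 as the balancing metric purely for illustration (your report is the authority on which metric you actually chose), and the counts are made up.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion-matrix counts.

    fp counts Type I errors (false positives) and fn counts Type II
    errors (false negatives).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up counts, not taken from the project's results.
precision, recall, f1 = precision_recall_f1(tp=30, fp=10, fn=20)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Because F1 is the harmonic mean of precision and recall, it drops when either kind of error grows, which is why it is a common choice when both errors matter.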
One small suggestion: consider trying some feature engineering and feature selection to discover more potential features and combinations that could improve the overall scores of your models.
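As a rough idea of what such engineered features could look like, the sketch below derives two ratio features from a single record. The column names (credit_limit, bill_amount, payment_amount) are hypothetical stand-ins and may not match your dataset, and the fallback values for zero denominators are just one possible choice.

```python
def add_ratio_features(row):
    """Return a copy of row with two derived ratio features added."""
    out = dict(row)
    limit = row["credit_limit"]
    bill = row["bill_amount"]
    # Share of the credit limit currently billed.
    out["utilization"] = bill / limit if limit > 0 else 0.0
    # Share of the bill that was paid; treat "nothing owed" as fully paid.
    out["paid_fraction"] = row["payment_amount"] / bill if bill > 0 else 1.0
    return out


# Made-up record using the hypothetical column names.
example = {"credit_limit": 10000, "bill_amount": 2500, "payment_amount": 500}
engineered = add_ratio_features(example)
print(engineered)
```

Whether features like these help would need to be checked against your cross-validation scores.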
Thank you for taking the time to review our project. Based on the feedback received, we have tried to improve the overall presentation of the report and the workflow used to generate it.
Some of the key feedback items and the commits that resolved them are:
Commits that fix the environment.yaml:
Commit that removed the old code cells:
Commits that add contributors and affiliations in both the readme and the final report:
Commits that specify a change in the final report:
Commits that fixed this:
Submitting authors: @mozhao0331 @kenuiuc @Althrun-sun @rkrishnan-arjun
Repository: https://github.com/UBC-MDS/credit_default_prediction_group_20
Report link: https://github.com/UBC-MDS/credit_default_prediction_group_20/blob/main/doc/credit_default_analysis_report.md
Abstract/executive summary: For this project we are trying to answer the question:
"Given a credit card customer's payment history and demographic information like gender, age, and education level, would the customer default on the next bill payment?"
Answering this question is important because, with an effective predictive model, financial institutions can evaluate a customer's credit level and grant appropriate credit amount limits. This analysis would be crucial in credit score calculation and risk management.
Editor: @flor14
Reviewer: Li Sam, Ganacheva Elena, Feng Yurui, Wijngaarden Renzo