INFO523-S24 / project-final-MiningMinds

https://info523-s24.github.io/project-final-MiningMinds/

Proposal peer review by KG Competitors. #2

Closed KommareddyMonicaTejaswi closed 5 months ago

KommareddyMonicaTejaswi commented 5 months ago

The following is the peer review of the project proposal by [KG Competitors]. The team members who participated in this review are:

The data consists of 550,000 credit card transactions made by European cardholders in 2023. It contains transaction details such as Transaction ID, time, location, amount, several deidentified variables, and a class label associated with the type of transaction.

  1. Explore the effectiveness of Random Forest, XGBoost, and other models, and compare them with ensemble techniques such as stacking, bagging, and boosting.
  2. Build a meta-classifier to check whether it improves fraud detection over the base classifiers.

1: Model Comparison Metrics: Clarify which performance metrics will be used to compare the models and justify the selection of these metrics. Explain how these metrics effectively identify this task's 'best model'. While the ROC Curve is mentioned, its limitations in the context of imbalanced classification should be addressed by incorporating additional, more suitable metrics.
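
To illustrate the concern about ROC curves, here is a minimal sketch using only the Python standard library. The scores are made up for illustration and are not from the project's data; `roc_auc` and `average_precision` are hypothetical helper names, and the average-precision helper assumes all scores are distinct.

```python
def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Mean of the precision values at the rank of each true positive.

    Assumes all scores are distinct, so the ranking is unambiguous.
    """
    ranked = sorted(zip(scores, y_true), reverse=True)
    hits, precisions = 0, []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions)


# Made-up scores: 990 legitimate transactions spread over [0, 0.989],
# 10 frauds packed just above 0.95 (39 legitimate transactions score higher).
y_true = [0] * 990 + [1] * 10
scores = [i / 1000 for i in range(990)] + [0.95 + k * 0.00005 for k in range(1, 11)]

auc = roc_auc(y_true, scores)
ap = average_precision(y_true, scores)
print(f"ROC AUC = {auc:.3f}, average precision = {ap:.3f}")
```

On this toy ranking the ROC AUC comes out near 0.96 while the average precision is only about 0.12, because the 39 false alarms ranked above the frauds barely move the false positive rate but dominate precision. That gap is the reason we suggest reporting a precision-recall based metric alongside the ROC curve.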

2: Class Imbalance Strategy: The inherent class imbalance in credit card transaction datasets is not addressed. Specify the strategies that will be employed to manage this imbalance so that the models can detect anomalies more accurately.
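
One possible strategy, offered only as an example, is cost-sensitive learning with inverse-frequency class weights. The sketch below uses the standard library and made-up labels; `balanced_class_weights` and `weighted_log_loss` are hypothetical helper names, and the weighting formula `n_samples / (n_classes * class_count)` is one common convention, not something taken from the proposal.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def weighted_log_loss(y_true, p_fraud, weights):
    """Class-weighted binary cross-entropy; p_fraud is the predicted P(fraud)."""
    total = weight_sum = 0.0
    for y, p in zip(y_true, p_fraud):
        w = weights[y]
        total += -w * math.log(p if y == 1 else 1.0 - p)
        weight_sum += w
    return total / weight_sum


# Made-up labels: 990 legitimate (0) and 10 fraudulent (1) transactions.
y_true = [0] * 990 + [1] * 10
weights = balanced_class_weights(y_true)

# A model that always predicts a 1% fraud probability.
p_low = [0.01] * len(y_true)
plain = weighted_log_loss(y_true, p_low, {0: 1.0, 1: 1.0})
weighted = weighted_log_loss(y_true, p_low, weights)
print(weights, round(plain, 4), round(weighted, 4))
```

With these weights both classes contribute the same total weight to the loss, so a model that ignores the fraud class is penalized far more heavily than under the unweighted loss.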

3: Stacked Generalization Considerations: Detail the criteria for including models in the stacked generalization ensemble. Address the possibility that some models degrade overall performance, and explain how this will be evaluated and mitigated.
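
One way to evaluate whether a base model hurts the ensemble is a leave-one-model-out ablation on held-out predictions. The sketch below is a simplification under stated assumptions: a plain probability average stands in for the meta-classifier, the model names (`rf`, `xgb`, `noisy`) and all probabilities are made up, and in a real stacking setup the same comparison would be run on out-of-fold predictions with the actual meta-learner.

```python
def f1_score(y_true, y_pred):
    """F1 for the positive (fraud) class; 0.0 when there are no positives at all."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def average_ensemble(prob_lists, threshold=0.5):
    """Average the fraud probabilities per sample and threshold the mean."""
    return [int(sum(ps) / len(ps) > threshold) for ps in zip(*prob_lists)]


def leave_one_out_deltas(y_true, base_probs):
    """F1 change when each base model is removed (positive = removal helps)."""
    full = f1_score(y_true, average_ensemble(list(base_probs.values())))
    deltas = {}
    for name in base_probs:
        rest = [p for other, p in base_probs.items() if other != name]
        deltas[name] = f1_score(y_true, average_ensemble(rest)) - full
    return deltas


# Made-up held-out labels and per-model fraud probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
base_probs = {
    "rf":    [0.95, 0.8, 0.6, 0.2, 0.1, 0.3, 0.2, 0.1],
    "xgb":   [0.8, 0.9, 0.7, 0.1, 0.2, 0.2, 0.3, 0.1],
    "noisy": [0.1, 0.3, 0.1, 0.7, 0.6, 0.2, 0.1, 0.7],
}

deltas = leave_one_out_deltas(y_true, base_probs)
print(deltas)
```

A positive delta means the ensemble scores better without that model, which makes it a candidate for exclusion; here only the deliberately poor `noisy` model falls in that category.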

4: Clarity on the Second Research Question: The description of the second research question lacks clarity. Provide a concise and clear statement of the question, outlining the objectives and how it contributes to the research goals.

  1. The Synthetic Minority Over-sampling Technique (SMOTE) is used to enhance the model's sensitivity to minority classes.
  2. The evaluation of models for imbalanced binary classification incorporates metrics like AUC-PR, MCC, and the F1-MCC plot, offering a comprehensive view of performance beyond conventional accuracy.
  3. SHapley Additive exPlanations (SHAP) are employed for model interpretation, identifying key features that significantly influence the detection of credit card fraud, thereby improving model transparency and effectiveness.
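
To make the point about metrics beyond accuracy concrete, here is a small sketch comparing accuracy, F1, and MCC on made-up predictions. It uses only the standard library; the helper names and the two toy classifiers are assumptions for illustration.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 marks fraud."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    tp, tn, _, _ = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def f1(y_true, y_pred):
    tp, _, fp, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; defined as 0.0 when a margin is empty."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Made-up data: 990 legitimate and 10 fraudulent transactions.
y_true = [0] * 990 + [1] * 10
always_legit = [0] * 1000                          # never flags anything
detector = [1] * 20 + [0] * 970 + [1] * 8 + [0] * 2  # 20 false alarms, 8 of 10 frauds caught

for name, pred in [("always_legit", always_legit), ("detector", detector)]:
    print(name, accuracy(y_true, pred), round(f1(y_true, pred), 3), round(mcc(y_true, pred), 3))
```

The classifier that never flags fraud has the higher accuracy (0.99 versus 0.978) yet an F1 and MCC of zero, while the detector that actually catches frauds is rewarded only by F1 and MCC.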
  1. Strategies for Managing Class Imbalance.
  2. How model ensembling improves the classification task.
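
For readers unfamiliar with SMOTE, the core idea can be sketched in a few lines: each synthetic point is placed on the segment between a minority sample and one of its nearest minority neighbours. This is a simplified illustration using the standard library, not a replacement for a maintained implementation; the function name `smote_sketch`, its parameters, and the sample points are all assumptions.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points by neighbour interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of base (excluding itself).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Made-up two-feature minority (fraud) samples.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_new=6, k=2)
print(new_points)
```

Because every synthetic point is an interpolation between two real minority samples, it stays inside the region those samples span. Oversampling of this kind is normally applied to the training split only, so that evaluation still reflects the original class ratio.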