The-Data-Alchemists-Manipal / MindWave

MindWave is an open-source project designed for beginners to learn about data science, machine learning, deep learning, and reinforcement learning algorithms using Python. The project offers a platform for implementing relevant algorithms, with open-source tools and libraries.
MIT License
97 stars, 144 forks

Credit Card Fraud Detection #132

Closed: sujanrupu closed this issue 1 year ago

sujanrupu commented 1 year ago

💥 Proposal (GSSOC 23)

A machine learning algorithm to recognize fraudulent credit card transactions so that the customers of credit card companies are not charged for items that they did not purchase.

The main challenges involved in credit card fraud detection are:

  1. Enormous volumes of data are processed every day, and the model must be fast enough to respond to a scam in time.
  2. Imbalanced data, i.e., most transactions (99.8%) are not fraudulent, which makes the fraudulent ones really hard to detect.
  3. Data availability, as the data is mostly private.
  4. Misclassified data can be another major issue, as not every fraudulent transaction is caught and reported.
  5. Adaptive techniques used by scammers against the model.
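
To make the imbalance point concrete, here is a minimal sketch in plain Python. The labels are synthetic (an assumption for illustration, not taken from any real dataset) and use the 99.8% / 0.2% split mentioned above. It shows that a "model" that never flags fraud still reaches 99.8% accuracy while catching nothing, which is why recall and precision are usually more informative here than accuracy.

```python
# Synthetic labels (assumption): 1 = fraud, 0 = legitimate, 0.2% fraud.
n_total = 100_000
n_fraud = 200
y_true = [1] * n_fraud + [0] * (n_total - n_fraud)

# A trivial baseline that predicts "legitimate" for every transaction.
y_pred = [0] * n_total

# Confusion-matrix counts.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

accuracy = (tp + tn) / n_total
# Guard against division by zero when nothing is predicted as fraud.
recall = tp / (tp + fn) if (tp + fn) else 0.0
precision = tp / (tp + fp) if (tp + fp) else 0.0

print(f"accuracy={accuracy:.3f} recall={recall:.3f} precision={precision:.3f}")
```

The baseline misses every fraudulent transaction, so any accuracy figure quoted for this problem is best read alongside recall on the fraud class.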

tanujbordikar commented 1 year ago

I would like to work on this issue, but could you elaborate on what I need to do? I know that the XGBoost algorithm works very well on this kind of dataset, with 97 percent accuracy, and I hope it will remove the oversampling problem.

sujanrupu commented 1 year ago

@theyashwanthsai @khusheekapoor As a GSSOC '23 contributor, I want to work on this project.

theyashwanthsai commented 1 year ago

@tanujbordikar Since we are following a first-come, first-served policy, we will not be able to assign you this issue. However, you can create another issue and use the same algorithm on a different dataset.

theyashwanthsai commented 1 year ago

@sujanrupu Please specify the ML model you are going to train.

sujanrupu commented 1 year ago

@theyashwanthsai I will use a random forest classifier on the dataset available on Kaggle. Dataset link: https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud
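
A rough sketch of what such a random forest baseline might look like, assuming scikit-learn is installed. To keep it runnable without downloading anything, it uses synthetic imbalanced data from `make_classification` as a stand-in for the Kaggle file. The hyperparameters, the `class_weight="balanced"` setting, and the choice of metrics are illustrative assumptions, not the approach taken in the actual submission.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import average_precision_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Assumption: synthetic stand-in for the Kaggle data, with roughly 1% positives.
X, y = make_classification(
    n_samples=5000,
    n_features=20,
    n_informative=5,
    weights=[0.99, 0.01],
    flip_y=0.0,
    random_state=0,
)

# Stratify so the rare class appears in both the train and test splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# class_weight="balanced" reweights samples inversely to class frequency.
model = RandomForestClassifier(
    n_estimators=50, class_weight="balanced", random_state=0
)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
fraud_scores = model.predict_proba(X_test)[:, 1]  # probability of class 1

# Report metrics that focus on the rare class rather than overall accuracy.
precision = precision_score(y_test, y_pred, zero_division=0)
recall = recall_score(y_test, y_pred, zero_division=0)
pr_auc = average_precision_score(y_test, fraud_scores)
print(f"precision={precision:.3f} recall={recall:.3f} PR-AUC={pr_auc:.3f}")
```

With the real dataset, `X` and `y` would instead come from the downloaded CSV (for example via `pandas.read_csv`), with the label taken from its `Class` column.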

sujanrupu commented 1 year ago

@theyashwanthsai May I start working on the issue?

theyashwanthsai commented 1 year ago

@sujanrupu You can go ahead! We are assigning you 21 days for this project, after which it will be assigned to someone else if not completed. All the best! Name the file algorithm_dataset.ipynb and link it in the README of the labeled directory as algorithm - dataset.