akashlogics commented 1 week ago

Project Overview

• Created a machine learning model that classifies an SMS as SPAM or HAM (normal) from its text using Natural Language Processing.
• Engineered features such as word_count, contains_currency_symbol, and contains_number from the SMS text.
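
The three engineered features named above could be computed roughly as follows. This is a minimal sketch in plain Python, not the project's actual code: the set of currency symbols and the helper `extract_features` are assumptions made for illustration, since the issue does not say how the features were derived.

```python
import re

# Assumed symbol set; the issue does not say which currency symbols were checked.
CURRENCY_SYMBOLS = ("$", "£", "€", "¥", "₹")


def word_count(message: str) -> int:
    """Number of whitespace-separated tokens in the message."""
    return len(message.split())


def contains_currency_symbol(message: str) -> int:
    """1 if any of the assumed currency symbols appears in the message, else 0."""
    return int(any(symbol in message for symbol in CURRENCY_SYMBOLS))


def contains_number(message: str) -> int:
    """1 if the message contains at least one digit, else 0."""
    return int(re.search(r"\d", message) is not None)


def extract_features(message: str) -> dict:
    """Bundle the three hand-crafted features for one SMS (hypothetical helper)."""
    return {
        "word_count": word_count(message),
        "contains_currency_symbol": contains_currency_symbol(message),
        "contains_number": contains_number(message),
    }
```

For example, `extract_features("WIN £1000 cash now")` yields a word count of 4 with both flags set to 1, while an ordinary message such as "see you at lunch" sets both flags to 0.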

How will this project help?

• This project helps filter spam SMS messages out of a phone's inbox.

Resources Used

• Packages: pandas, numpy, sklearn, matplotlib, seaborn, nltk.
• Dataset by UCI Machine Learning on Kaggle: https://www.kaggle.com/uciml/sms-spam-collection-dataset

Exploratory Data Analysis (EDA)

• Checked the dataset for NaN values
• Plotted a countplot of the SMS labels (Spam vs. Ham)
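
The two EDA checks above amount to counting missing values per column and counting messages per label. The sketch below shows that logic in plain Python on a few made-up rows; the column names `label` and `message` and the sample data are assumptions, and the project itself would more likely do this with pandas and seaborn.

```python
from collections import Counter

# Hypothetical sample rows; None marks a missing value.
rows = [
    {"label": "ham", "message": "Are we still on for lunch?"},
    {"label": "spam", "message": "WIN £1000 cash now"},
    {"label": "ham", "message": None},
    {"label": "ham", "message": "Call me when you get home"},
]


def count_missing(rows, columns=("label", "message")):
    """Count missing (None) entries in each column."""
    return {col: sum(1 for row in rows if row.get(col) is None) for col in columns}


def label_counts(rows):
    """Count messages per label: the numbers a countplot would draw as bars."""
    return Counter(row["label"] for row in rows)


missing = count_missing(rows)
counts = label_counts(rows)
```

A large gap between the ham and spam counts is the class imbalance that the oversampling step is meant to address.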

Feature Engineering

• Handled the imbalanced dataset using oversampling
• Created new features from existing ones, e.g. word_count, contains_currency_symbol, contains_numbers, etc.
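
The issue does not say which oversampling method was used. One common option is random oversampling, which duplicates minority-class samples until the classes are the same size. The sketch below illustrates that idea in plain Python; the function name `random_oversample` and the fixed seed are assumptions for illustration.

```python
import random


def random_oversample(messages, labels, seed=0):
    """Duplicate randomly chosen samples of the smaller classes until every
    class has as many samples as the largest class."""
    rng = random.Random(seed)

    # Group messages by their label.
    by_label = {}
    for message, label in zip(messages, labels):
        by_label.setdefault(label, []).append(message)

    # Every class is grown to the size of the largest one.
    target = max(len(group) for group in by_label.values())

    out_messages, out_labels = [], []
    for label, group in by_label.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for message in group + extra:
            out_messages.append(message)
            out_labels.append(label)
    return out_messages, out_labels
```

With four ham messages and one spam message, the result holds four of each. Oversampling is usually applied to the training split only, so that duplicated messages do not leak into the evaluation data.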

github-actions[bot] commented 1 week ago

Thank you for creating this issue! 🎉 We'll look into it as soon as possible. In the meantime, please make sure to provide all the necessary details and context. Your contributions are highly appreciated! 😊

github-actions[bot] commented 1 week ago

Thanks for raising this issue! However, we believe a similar issue already exists. Kindly go through all the open issues and ask to be assigned to that issue.

UppuluriKalyani / ML-Nexus