Closed akashlogics closed 1 week ago
Thanks for raising this issue! However, we believe a similar issue already exists. Kindly go through all the open issues and ask to be assigned to that issue.
Project Overview
• Created a machine learning model that classifies an SMS as SPAM or HAM (normal) from its text using Natural Language Processing.
• Engineered features like word_count, contains_currency_symbol, and contains_number from the SMS text.
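As a rough illustration of the three features named above, here is a minimal sketch in plain Python. The feature names come from the description; the function bodies, the `CURRENCY_SYMBOLS` tuple, and the `extract_features` helper are assumptions, since the issue does not show how the project computes them.

```python
import re

# Assumed set of currency symbols; the project's actual list is not given.
CURRENCY_SYMBOLS = ("$", "€", "£", "¥", "₹")


def word_count(message: str) -> int:
    """Number of whitespace-separated tokens in the message."""
    return len(message.split())


def contains_currency_symbol(message: str) -> int:
    """1 if any assumed currency symbol appears in the message, else 0."""
    return int(any(symbol in message for symbol in CURRENCY_SYMBOLS))


def contains_number(message: str) -> int:
    """1 if the message contains at least one digit, else 0."""
    return int(re.search(r"\d", message) is not None)


def extract_features(message: str) -> dict:
    """Hypothetical helper that bundles the three features for one SMS."""
    return {
        "word_count": word_count(message),
        "contains_currency_symbol": contains_currency_symbol(message),
        "contains_number": contains_number(message),
    }


spam_features = extract_features("WIN £1000 cash now! Call 09061701461")
ham_features = extract_features("see you at lunch")
```

Encoding the two flags as 0/1 integers keeps them directly usable as numeric model inputs alongside the word count.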
How will this project help?
• This project helps filter spam SMS messages out of a phone's inbox.
Resources Used
• Packages: pandas, numpy, sklearn, matplotlib, seaborn, nltk.
• Dataset by UCI Machine Learning on Kaggle: https://www.kaggle.com/uciml/sms-spam-collection-dataset
Exploratory Data Analysis (EDA)
• Checked the dataset for NaN values
• Plotted a countplot of the SMS labels (spam vs. ham)
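The two EDA steps above could look roughly like the following pandas sketch. The tiny DataFrame and its column names (`label`, `message`) are made up for illustration; the real Kaggle file may use different column names.

```python
import pandas as pd

# Tiny made-up sample standing in for the real dataset (assumed column names).
df = pd.DataFrame(
    {
        "label": ["ham", "ham", "spam", "ham", None],
        "message": ["See you at lunch", "Ok", "WIN cash now", None, "Call me"],
    }
)

# Count missing values per column.
nan_counts = df.isna().sum()

# Count messages per label; NaN labels are excluded by default.
label_counts = df["label"].value_counts()
```

The same per-label counts are what a seaborn countplot of the `label` column would draw as bars, which makes the spam/ham imbalance easy to see.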
Feature Engineering
• Handled the imbalanced dataset using oversampling
• Created new features from existing ones, e.g. word_count, contains_currency_symbol, contains_numbers, etc.
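The issue does not say which oversampling method was used. One common option is random oversampling, which duplicates minority-class rows (sampled with replacement) until every class matches the largest one. The sketch below assumes that approach; `random_oversample` and the row layout are hypothetical, not taken from the project.

```python
import random
from collections import Counter


def random_oversample(rows, label_key="label", seed=0):
    """Return the rows plus resampled copies so all classes have equal size.

    Assumption: plain random oversampling with replacement; the project
    may use a different method.
    """
    if not rows:
        return []
    rng = random.Random(seed)

    # Group rows by their class label.
    by_label = {}
    for row in rows:
        by_label.setdefault(row[label_key], []).append(row)

    # Every class is grown to the size of the largest class.
    target = max(len(group) for group in by_label.values())

    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Draw the missing number of rows with replacement from this class.
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


# Made-up, imbalanced sample: four ham messages and one spam message.
rows = [
    {"label": "ham", "message": "See you at lunch"},
    {"label": "ham", "message": "Ok"},
    {"label": "ham", "message": "Running late"},
    {"label": "ham", "message": "Call me"},
    {"label": "spam", "message": "WIN cash now"},
]
balanced_rows = random_oversample(rows)
balanced_counts = Counter(row["label"] for row in balanced_rows)
```

Oversampling is normally applied to the training split only, so that duplicated spam messages do not leak into the evaluation data.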