UppuluriKalyani / ML-Nexus

ML Nexus is an open-source collection of machine learning projects, covering topics like neural networks, computer vision, and NLP. Whether you're a beginner or expert, contribute, collaborate, and grow together in the world of AI. Join us to shape the future of machine learning!
https://discord.gg/n2D4RqnU
MIT License
47 stars, 82 forks

HAM (normal) based on the textual data using Natural Language Processing. #164

Closed akashlogics closed 1 week ago

akashlogics commented 1 week ago

Project Overview

• Created a machine learning model that classifies an SMS as SPAM or HAM (normal) from its text using Natural Language Processing.
• Engineered features such as word_count, contains_currency_symbol, and contains_number from the SMS text.
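
The three engineered features named above could be computed roughly as follows. This is a minimal sketch, not the author's code: the currency symbol set and the helper name extract_features are assumptions, since the issue does not say how each feature was defined.

```python
import re

# Assumed symbol set; the issue does not list which currency symbols were checked.
CURRENCY_SYMBOLS = ("€", "$", "¥", "£", "₹")


def word_count(text: str) -> int:
    """Count whitespace-separated tokens in the message."""
    return len(text.split())


def contains_currency_symbol(text: str) -> int:
    """Return 1 if any assumed currency symbol appears in the message, else 0."""
    return int(any(symbol in text for symbol in CURRENCY_SYMBOLS))


def contains_number(text: str) -> int:
    """Return 1 if the message contains at least one digit, else 0."""
    return int(re.search(r"\d", text) is not None)


def extract_features(text: str) -> dict:
    """Hypothetical helper that bundles the three features for one SMS."""
    return {
        "word_count": word_count(text),
        "contains_currency_symbol": contains_currency_symbol(text),
        "contains_number": contains_number(text),
    }
```

For example, extract_features("WINNER!! Claim your £1000 prize now") yields a word_count of 6 and sets both flags to 1, while an ordinary message such as "see you at lunch" sets both flags to 0.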

How will this project help?

• This project helps filter spam SMS messages out of a phone's inbox.

Resources Used

• Packages: pandas, numpy, sklearn, matplotlib, seaborn, nltk.
• Dataset by UCI Machine Learning on Kaggle: https://www.kaggle.com/uciml/sms-spam-collection-dataset

Exploratory Data Analysis (EDA)

• Checked the dataset for NaN values.
• Plotted a countplot of the SMS labels (Spam vs. Ham).
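
A rough sketch of these two EDA checks with pandas is shown below. The column names label and message, the helper summarize_labels, and the toy rows are all hypothetical; the real dataset's columns may be named differently.

```python
import pandas as pd


def summarize_labels(df: pd.DataFrame, label_col: str = "label"):
    """Return per-column NaN counts and per-class label counts."""
    nan_counts = df.isna().sum()                  # missing values in each column
    label_counts = df[label_col].value_counts()   # the numbers a countplot would draw
    return nan_counts, label_counts


# Toy frame standing in for the SMS dataset (made-up rows, assumed column names).
df = pd.DataFrame(
    {
        "label": ["ham", "ham", "spam", "ham"],
        "message": ["ok", "see you", None, "call me"],
    }
)

nan_counts, label_counts = summarize_labels(df)
print(nan_counts)
print(label_counts)
```

The same class counts can be drawn as a bar chart with seaborn.countplot(data=df, x="label"), which is presumably what the countplot step refers to.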

Feature Engineering

• Handled the imbalanced dataset using oversampling (figure: SpamVsHam).
• Created new features from existing ones, e.g. word_count, contains_currency_symbol, contains_numbers, etc. (figures: word_count, currency_numbers).
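
The issue does not say which oversampling method was used. One common option is random oversampling, which duplicates randomly chosen minority-class rows until every class matches the size of the largest one. The sketch below assumes that approach; the function name random_oversample and the row layout (a list of dicts with a label key) are illustrative only.

```python
import random
from collections import Counter


def random_oversample(rows, label_key="label", seed=0):
    """Duplicate random minority-class rows until all classes are equally sized."""
    if not rows:
        return []
    rng = random.Random(seed)  # fixed seed so the result is reproducible

    # Group rows by their class label.
    by_label = {}
    for row in rows:
        by_label.setdefault(row[label_key], []).append(row)

    # Every class is grown to the size of the largest class.
    target = max(len(group) for group in by_label.values())

    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Sample with replacement to fill the gap (k is 0 for the largest class).
        balanced.extend(rng.choices(group, k=target - len(group)))

    rng.shuffle(balanced)
    return balanced


# Toy data: 6 ham rows and 2 spam rows (made-up messages).
rows = [{"label": "ham", "text": f"h{i}"} for i in range(6)]
rows += [{"label": "spam", "text": f"s{i}"} for i in range(2)]

balanced = random_oversample(rows)
counts = Counter(row["label"] for row in balanced)
print(counts)
```

Oversampling is usually applied to the training split only, after the train/test split, so that duplicated rows do not leak into the evaluation data.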

github-actions[bot] commented 1 week ago

Thank you for creating this issue! 🎉 We'll look into it as soon as possible. In the meantime, please make sure to provide all the necessary details and context. Your contributions are highly appreciated! 😊

github-actions[bot] commented 1 week ago

Thanks for raising this issue! However, we believe a similar issue already exists. Kindly go through all the open issues and ask to be assigned to that issue.