UppuluriKalyani / ML-Nexus

ML Nexus is an open-source collection of machine learning projects, covering topics like neural networks, computer vision, and NLP. Whether you're a beginner or expert, contribute, collaborate, and grow together in the world of AI. Join us to shape the future of machine learning!
https://ml-nexus.vercel.app/
MIT License
69 stars, 122 forks

Feature request: Enhance and merge email spam detection notebook with EDA and NLP improvements #371

Closed Niraj1608 closed 1 month ago

Niraj1608 commented 1 month ago

Feature Description

Hi, I am Niraj. I've been reviewing the Email Spam Detection with Machine Learning project and noticed several areas that could be improved. Specifically, I propose:

Enhanced EDA: Add more detailed charts and visualizations using Python libraries such as Seaborn or Matplotlib to show the data distribution and the correlations between features. This could include heatmaps, pair plots, and distribution plots.

Advanced NLP Techniques: Incorporate more Natural Language Processing (NLP) techniques, such as advanced tokenization, lemmatization, and more sophisticated vectorization like TF-IDF.

Data Cleaning: Introduce robust data cleaning methods to remove noisy data, handle missing values, and preprocess text more efficiently, which should improve the model's accuracy. Together with the better visualizations, this would make the spam detection model more interpretable and efficient.

Dataset Issue: The project is missing the dataset required to run the notebook. I propose including a well-structured dataset to ensure reproducibility and ease of use for others.

If you like this idea, please assign this task to me, and I will add the corresponding improvements and charts to it.

Thank you for your time and consideration!

Use Case

Incorporating the enhanced EDA and advanced NLP techniques from the Spam Mail Predictor notebook will provide better insights into the dataset, leading to more accurate model training and predictions. This is crucial for users looking for deeper analysis and improved model performance.

Benefits

Improved EDA: The Spam Mail Predictor notebook features more detailed EDA, including additional visualizations and insights.

Enhanced NLP: It also includes more advanced NLP techniques, such as TF-IDF and more extensive text preprocessing steps.

Dataset Integration: Adding a clear, usable dataset to the notebook will ensure reproducibility and ease of use for others.
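To make the proposed preprocessing and TF-IDF steps concrete, here is a minimal sketch using only the Python standard library. It is not the code from the attached notebook: the sample messages, the `clean_text` and `tfidf` helper names, and the cleaning rules are all assumptions for illustration. A real notebook would more likely use scikit-learn's `TfidfVectorizer`, which also normalizes the vectors.

```python
import math
import re
from collections import Counter

# Hypothetical toy messages; the project's real dataset is not part of this thread.
MESSAGES = [
    "WIN a FREE prize now!!! Click http://example.com",
    "Are we still meeting for lunch tomorrow?",
    "Free entry: claim your prize today",
]


def clean_text(text):
    """Lowercase, drop URLs and non-letter characters, and split into tokens."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    text = re.sub(r"[^a-z\s]", " ", text)      # keep letters and whitespace only
    return text.split()


def tfidf(docs):
    """Return one {term: weight} dict per document, using a smoothed IDF."""
    tokenized = [clean_text(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty messages
        weights.append({
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return weights


if __name__ == "__main__":
    for message, vector in zip(MESSAGES, tfidf(MESSAGES)):
        top_terms = sorted(vector.items(), key=lambda kv: kv[1], reverse=True)[:3]
        print(message[:30], "->", top_terms)
```

Terms that appear in fewer messages get a larger weight, so a word like "win" that occurs in only one message scores higher than "free", which occurs in two. Lemmatization is left out here because it needs an external library such as NLTK or spaCy.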

github-actions[bot] commented 1 month ago

Thanks for creating the issue in ML-Nexus!🎉 Before you start working on your PR, please make sure to:

github-actions[bot] commented 1 month ago

Hello @Niraj1608! Your issue #371 has been closed. Thank you for your contribution!

Niraj1608 commented 1 month ago

@UppuluriKalyani @Neilblaze I see that the issue has been closed without an explanation. Please let me know why. I plan to add some functionality to this existing repository.

UppuluriKalyani commented 1 month ago

@Niraj1608 can you come up with other good ideas? I don't think an email spam detector would be a good project.

Niraj1608 commented 1 month ago

@UppuluriKalyani I think there's been a misunderstanding. I'm not creating a new project; the email spam detector already exists. I'm simply proposing to update it with new features like advanced NLP techniques and enhanced data visualization. I've attached my code for review: spam_mail_pridector.ipynb - Colab.pdf

UppuluriKalyani commented 1 month ago

@Niraj1608 I understand your concern, but we still don't need it. You can raise other good issues (ideas).

Niraj1608 commented 1 month ago

@UppuluriKalyani can you reopen https://github.com/UppuluriKalyani/ML-Nexus/pull/344? I want to add a deployment feature to it :)

Niraj1608 commented 1 month ago

@UppuluriKalyani hey, you opened the spam mail detection PR, which you said not to work on. I was talking about COVID-19 Medical Face Mask Detection #315.

github-actions[bot] commented 1 month ago

Hello @Niraj1608! Your issue #371 has been closed. Thank you for your contribution!

UppuluriKalyani commented 1 month ago

> @UppuluriKalyani hey, you opened the spam mail detection PR, which you said not to work on. I was talking about COVID-19 Medical Face Mask Detection #315.

It's already done, right?

Niraj1608 commented 1 month ago

> > @UppuluriKalyani hey, you opened the spam mail detection PR, which you said not to work on. I was talking about COVID-19 Medical Face Mask Detection #315.
>
> It's already done, right?

Yes, it's already done, but I think we could still take it further by adding a Streamlit deployment to improve visualization and user interaction.