The-Data-Alchemists-Manipal / MindWave

MindWave is an open-source project designed for beginners to learn about data science, machine learning, deep learning, and reinforcement learning algorithms using Python. The project offers a platform for implementing relevant algorithms, with open-source tools and libraries.
MIT License
95 stars, 144 forks

GSSoC'23 Hate Speech and Offensive Language Detection #531

Closed: AnkitaBarbora closed this issue 1 year ago

AnkitaBarbora commented 1 year ago

Is your feature request related to a problem? Please describe.

Yes. This request addresses the problem of hate speech and offensive language online, which can harm individuals and communities by contributing to discrimination, harassment, and the spread of negativity in online spaces. A machine learning-based solution for hate speech detection aims to mitigate these issues and help create a safer, more inclusive online environment.

Describe the solution you'd like

The ideal solution for hate speech and offensive language detection using machine learning would involve the following components:

Describe alternatives you've considered

Rule-based systems: Instead of machine learning algorithms, a rule-based system can be used to detect hate speech and offensive language. This approach matches text against a hand-written set of rules and patterns that indicate such language. However, rule-based systems can struggle to capture nuanced or context-dependent expressions, and they can be less adaptable to evolving language patterns.
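For illustration, the rule-based alternative could look roughly like the sketch below. The pattern lists, the `RULES` table, and the `classify` helper are hypothetical placeholders made up for this example, not part of the project; a real rule set would need careful curation.

```python
import re

# Hypothetical placeholder patterns (assumption): mild stand-ins for a curated rule set.
RULES = {
    "hate": [r"\ball \w+ are (vermin|trash)\b"],
    "offensive": [r"\bidiots?\b", r"\bstupid\b"],
}

# Compile once, case-insensitively, so matching is cheap per message.
COMPILED = {
    label: [re.compile(pattern, re.IGNORECASE) for pattern in patterns]
    for label, patterns in RULES.items()
}


def classify(text):
    """Return 'hate' or 'offensive' if any rule for that label matches, else 'neither'.

    'hate' rules are checked before 'offensive' rules.
    """
    for label in ("hate", "offensive"):
        if any(pattern.search(text) for pattern in COMPILED[label]):
            return label
    return "neither"


print(classify("You are an IDIOT"))
print(classify("Have a nice day"))
# A message that merely quotes a flagged word is still matched,
# which is the context problem mentioned above.
print(classify("Please don't call people stupid"))
```

The last call shows why such rules tend to misfire: the pattern has no notion of context, so a message discouraging an insult is flagged the same way as the insult itself.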

Additional context

Dataset to be used: https://www.kaggle.com/datasets/mrmorj/hate-speech-and-offensive-language-dataset
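As a rough starting point, a baseline on this dataset might be sketched as below, using only the standard library. It assumes the CSV exposes a `tweet` text column and an integer `class` column (believed to be 0 = hate speech, 1 = offensive language, 2 = neither, but please verify against the downloaded file). The `SAMPLE` rows are invented placeholders, not taken from the dataset, and the helper names are hypothetical.

```python
import csv
import io
import math
import re
from collections import Counter, defaultdict


def load_rows(csv_file):
    """Yield (text, label) pairs; assumes 'tweet' and 'class' columns exist."""
    for row in csv.DictReader(csv_file):
        yield row["tweet"], int(row["class"])


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_nb(rows):
    """Collect the counts needed for multinomial Naive Bayes."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in rows:
        class_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocab.add(token)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability (Laplace smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, doc_count in class_counts.items():
        score = math.log(doc_count / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in tokenize(text):
            if token in vocab:  # tokens never seen in training are skipped
                score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented placeholder rows (assumption), standing in for the real CSV file.
SAMPLE = """tweet,class
you are trash,1
what a lovely morning,2
lovely weather today,2
you people are trash,1
"""

model = train_nb(load_rows(io.StringIO(SAMPLE)))
print(predict(model, "lovely day"))
print(predict(model, "you are trash"))
```

To try it on the real data, `io.StringIO(SAMPLE)` could be swapped for `open(path, newline="", encoding="utf-8")`, where `path` points at the downloaded CSV. A library such as scikit-learn would be the more usual choice for the actual notebook; this version only shows the idea.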


khusheekapoor commented 1 year ago

@AnkitaBarbora - you can go ahead! We are assigning you 21 days for this project, after which it will be assigned to someone else if not completed. All the best! Name the file as: algorithm_dataset.ipynb and link it in the readme of the labeled directory as algorithm - dataset.

AnkitaBarbora commented 1 year ago

@khusheekapoor Hello ma'am, I would like to work on both the ML model and a deployment project for this issue, and they will go in separate folders. Should I create a separate PR for each, or will both be considered under the same PR?

khusheekapoor commented 1 year ago

You have to create two separate PRs.

AnkitaBarbora commented 1 year ago

Okay ma'am, thank you.