JensBender / hate-speech-detection

Employing deep learning techniques to train and deploy a hate speech detection model for social media comments.
MIT License


Deep learning for hate speech detection in social media comments.


Table of Contents

  1. About The Project
  2. Motivation
  3. Data
  4. Model Building
  5. Model Performance
  6. Model Deployment
  7. Getting Started
  8. Appendix

About The Project

Summary

Built With

(back to top)

Motivation

(back to top)

Data

[Figure: Histogram of comment lengths]

(back to top)

Model Building

Benchmark models (Mollas, Chrysopoulou, Karlos, & Tsoumakas, 2022):

Comparison of three deep learning models: SimpleRNN, LSTM, and fine-tuned BERT.

(back to top)

Model Performance

Accuracy

| | SimpleRNN | LSTM | Fine-Tuned BERT |
|---|---|---|---|
| Training Accuracy | 91.8% | 100% | 99.9% |
| Test Accuracy | 66.3% | 70.7% | 78.0% |

(back to top)

Classification Report

The following classification reports present the performance metrics of the trained models on the test data.

| SimpleRNN | Precision | Recall | F1 Score |
|---|---|---|---|
| No Hate Speech | 0.69 | 0.71 | 0.70 |
| Hate Speech | 0.63 | 0.61 | 0.62 |
| Accuracy | | | 0.66 |

| LSTM | Precision | Recall | F1 Score |
|---|---|---|---|
| No Hate Speech | 0.73 | 0.75 | 0.74 |
| Hate Speech | 0.68 | 0.66 | 0.67 |
| Accuracy | | | 0.71 |

| Fine-Tuned BERT | Precision | Recall | F1 Score |
|---|---|---|---|
| No Hate Speech | 0.84 | 0.74 | 0.79 |
| Hate Speech | 0.72 | 0.83 | 0.77 |
| Accuracy | | | 0.78 |
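
Reports of this shape can be produced with scikit-learn's classification_report; a minimal sketch with placeholder labels (not the project's actual predictions):

```python
# Minimal sketch: generating a classification report with scikit-learn.
# y_true and y_pred are illustrative placeholders.
from sklearn.metrics import classification_report

y_true = [0, 1, 1, 0, 1]  # 0 = No Hate Speech, 1 = Hate Speech
y_pred = [0, 1, 0, 0, 1]
print(classification_report(y_true, y_pred, target_names=["No Hate Speech", "Hate Speech"]))
```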

(back to top)

Confusion Matrix

Confusion matrices of the three models based on the test data.

[Figures: SimpleRNN, LSTM, and Fine-Tuned BERT confusion matrices]

(back to top)

Illustrative Examples

Here are some example comments from the test data along with their true labels and model predictions.

| Comment | True Label | SimpleRNN | LSTM | Fine-Tuned BERT |
|---|---|---|---|---|
| Women deserve to be abused, I guess. | Hate | Hate | Hate | Hate |
| Christians are all demonic. | Hate | No Hate | Hate | Hate |
| Muslims want war! So take it to them first | Hate | Hate | No Hate | Hate |
| Fat lesbians | Hate | Hate | Hate | Hate |
| What imaginary gender are you? | Hate | No Hate | No Hate | Hate |

(back to top)

Model Deployment

I deployed the fine-tuned BERT model through both a web application and an API.

Web Application

Using the Flask framework, HTML, and CSS, I created a web application with a user-friendly interface where users can enter text and receive a prediction of whether it contains hate speech. The core flow is sketched below.
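
A minimal sketch of how such a route could look; the template name and the predict_hate_speech helper are illustrative stand-ins, not the repository's actual code:

```python
# Minimal Flask sketch of the web application flow. The index.html template
# and the predict_hate_speech helper are illustrative assumptions.
from flask import Flask, render_template, request

app = Flask(__name__)

def predict_hate_speech(text: str) -> str:
    """Stand-in for the fine-tuned BERT model; see the Appendix for the architecture."""
    score = 0.0  # in the real app: the model's predicted probability for `text`
    return "Hate Speech" if score >= 0.5 else "No Hate Speech"

@app.route("/", methods=["GET", "POST"])
def index():
    prediction = None
    if request.method == "POST":
        prediction = predict_hate_speech(request.form["comment"])
    return render_template("index.html", prediction=prediction)

if __name__ == "__main__":
    app.run(debug=True)
```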

[Screenshots: web application, deployment examples 1 and 2]

API

Using the Flask framework, I developed an API endpoint that enables integration with other applications and services, and I used Postman to test and document the API.
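
For illustration, a hypothetical client call; the endpoint path and JSON schema below are assumptions, not the documented contract:

```python
# Hypothetical client call to the prediction endpoint. The route
# ("/api/predict") and the JSON schema are assumptions for illustration.
import requests

response = requests.post(
    "http://localhost:5000/api/predict",
    json={"text": "An example social media comment."},
)
print(response.json())  # e.g. {"prediction": "No Hate Speech"}
```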

API documentation: See here

[Screenshot: model deployment API]

(back to top)

Getting Started

Prerequisites for Model Training

The following Python packages are required for model training.

Prerequisites for Model Deployment

The following Python packages are required for model deployment.

To enhance security, create a secret key for the Flask application, store it in a .env file, and use the python-dotenv library to retrieve it.

  SECRET_KEY = "Your_secret_key_here"
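
A minimal sketch of retrieving the key at application startup:

```python
# Minimal sketch: load the secret key from .env with python-dotenv.
import os

from dotenv import load_dotenv
from flask import Flask

load_dotenv()  # reads key-value pairs from .env into environment variables

app = Flask(__name__)
app.secret_key = os.getenv("SECRET_KEY")
```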

(back to top)

Appendix

SimpleRNN: Preprocessing, Model Architecture and Hyperparameters

Preprocessing
Tokenizer vocabulary size: 5000
Padded sequence length: 15
Embedding dimension: 50
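
A minimal sketch of this preprocessing with the Keras Tokenizer; train_texts stands in for the actual training comments:

```python
# Minimal sketch of the preprocessing above with the Keras Tokenizer.
# train_texts is a placeholder for the actual training comments.
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.preprocessing.text import Tokenizer

train_texts = ["an example comment", "another example comment"]

tokenizer = Tokenizer(num_words=5000)          # tokenizer vocabulary size
tokenizer.fit_on_texts(train_texts)
sequences = tokenizer.texts_to_sequences(train_texts)
padded = pad_sequences(sequences, maxlen=15)   # padded sequence length
```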

Model Architecture

| Layer (type) | Output Shape | Param # | Activation |
|---|---|---|---|
| Embedding | (None, 15, 50) | 250050 | |
| SimpleRNN | (None, 15, 128) | 22912 | tanh |
| SimpleRNN | (None, 128) | 32896 | tanh |
| Dense | (None, 64) | 8256 | relu |
| Dense | (None, 1) | 65 | sigmoid |

Total params: 314,179
Trainable params: 314,179
Non-trainable params: 0

Hyperparameters
Optimizer: Adam
Learning rate: 0.001
Loss: Binary Crossentropy
Epochs: 100
Batch size: 8
Dropout rate: 50%
Early stopping metric: Accuracy
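
Putting this together, a minimal Keras sketch of the model; since the layer table lists no separate Dropout layers, applying the 50% dropout inside the recurrent layers is an assumption:

```python
# Minimal Keras sketch of the SimpleRNN model described above. Layer sizes
# and hyperparameters are taken from this appendix; applying the 50% dropout
# inside the recurrent layers is an assumption.
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE = 5000   # tokenizer vocabulary size
SEQ_LENGTH = 15     # padded sequence length
EMBED_DIM = 50      # embedding dimension

model = models.Sequential([
    layers.Embedding(VOCAB_SIZE + 1, EMBED_DIM, input_length=SEQ_LENGTH),
    layers.SimpleRNN(128, return_sequences=True, dropout=0.5),
    layers.SimpleRNN(128, dropout=0.5),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)

# Training with early stopping on accuracy (the patience value is an
# assumption); x_train / y_train are the padded sequences and binary labels.
early_stopping = tf.keras.callbacks.EarlyStopping(monitor="accuracy", patience=5)
# model.fit(x_train, y_train, epochs=100, batch_size=8, callbacks=[early_stopping])
```

The LSTM model in the next appendix follows the same pattern, with layers.LSTM in place of layers.SimpleRNN, a padded sequence length of 150, and a batch size of 32.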

(back to top)

LSTM: Preprocessing, Model Architecture and Hyperparameters

Preprocessing
Tokenizer vocabulary size: 5000
Padded sequence length: 150
Embedding dimension: 50

Model Architecture

| Layer (type) | Output Shape | Param # | Activation |
|---|---|---|---|
| Embedding | (None, 150, 50) | 250050 | |
| LSTM | (None, 150, 128) | 91648 | tanh |
| LSTM | (None, 128) | 131584 | tanh |
| Dense | (None, 64) | 8256 | relu |
| Dense | (None, 1) | 65 | sigmoid |

Total params: 481,603
Trainable params: 481,603
Non-trainable params: 0

Hyperparameters
Optimizer: Adam
Learning rate: 0.001
Loss: Binary Crossentropy
Epochs: 100
Batch size: 32
Dropout rate: 50%
Early stopping metric: Accuracy

(back to top)

Fine-Tuned BERT: Preprocessing, Model Architecture and Hyperparameters

Preprocessing
Text preprocessing for BERT models: https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3

Model Architecture

| Layer (type) | Output Shape | Param # | Activation |
|---|---|---|---|
| Text Input | [(None,)] | 0 | |
| Preprocessing | input_type_ids: (None, 128)<br>input_mask: (None, 128)<br>input_word_ids: (None, 128) | 0 | |
| BERT | (None, 512) | 28763649 | |
| Dropout | (None, 512) | 0 | |
| Dense | (None, 128) | 65664 | relu |
| Dense | (None, 1) | 129 | sigmoid |

Total params: 28,829,442
Trainable params: 28,829,441
Non-trainable params: 1

Hyperparameters
Optimizer: Adam
Learning rate: 0.0001
Loss: Binary Crossentropy
Epochs: 100
Batch size: 8
Dropout rate: 50%
Early stopping metric: Accuracy
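
A minimal sketch of this architecture with TensorFlow Hub; the preprocessing handle is the one given above, while the encoder handle is an assumption (the 512-dimensional pooled output suggests a small BERT variant):

```python
# Minimal sketch of the fine-tuned BERT model described above. The
# preprocessing handle comes from this appendix; the encoder handle is an
# assumption based on the 512-dim pooled output.
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text  # noqa: F401 -- registers ops the preprocessor needs

PREPROCESS_URL = "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3"
ENCODER_URL = "https://tfhub.dev/tensorflow/small_bert/bert_en_uncased_L-4_H-512_A-8/2"  # assumed

text_input = tf.keras.layers.Input(shape=(), dtype=tf.string, name="text")
encoder_inputs = hub.KerasLayer(PREPROCESS_URL, name="preprocessing")(text_input)
outputs = hub.KerasLayer(ENCODER_URL, trainable=True, name="BERT")(encoder_inputs)
x = tf.keras.layers.Dropout(0.5)(outputs["pooled_output"])
x = tf.keras.layers.Dense(128, activation="relu")(x)
prediction = tf.keras.layers.Dense(1, activation="sigmoid")(x)

model = tf.keras.Model(text_input, prediction)
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
```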

(back to top)