Digital-Ḍād

Arabic Handwriting Recognition 🖋️

Welcome to the Arabic Handwriting Recognition project! This repository consolidates all the code and data used in our endeavor to create an effective OCR (Optical Character Recognition) system for Arabic script.

This project is proudly made by:

Names
Ahmed Taha	Ayman Saber
Ahmed Nagah	Abanoub Aied
Kerollos Samir	Mohamed Abdelfattah
Mohamed Fathi	Nada Asran
Nada Mahmoud	Rawan Gamal
Reem Fouad

Introduction
Features
Installation
Usage
Data
Preprocessing
AI Training
Full System Diagram
Digital-Ḍād ض-الرقمية
Contributing
License
Acknowledgements

Introduction

Arabic Handwriting Recognition is a project aimed at developing a robust system to accurately recognize and digitize Arabic handwritten text. This repository groups together all the scripts, models, and datasets used throughout the development process.

Features

High Accuracy: The final model reaches a character error rate (CER) of 3% and an accuracy of 97% on the test set.
Customizable Models: Easy to retrain and fine-tune models on new datasets.
User-Friendly Interface: Simple interface for testing and utilizing the OCR system.
Extensive Dataset: Includes a comprehensive dataset of Arabic handwritten text.

Installation

To get started, clone the repository and install the required dependencies:

git clone https://github.com/AHR-OCR2024/Arabic-Handwriting-Recognition.git
cd Arabic-Handwriting-Recognition/Application
pip install -r requirements.txt
npm i

Download the pretrained models from here and put them in Application/Backend/Models.

Usage

To use the OCR system, follow these steps:

Prepare the data: Ensure your data is in the correct format.
Train the model: Use the provided training scripts to train the OCR model.
Evaluate: Test the model on a validation set to check its accuracy.
Run predictions: Use the trained model to recognize text from new handwritten samples.

To run the application, run each of these commands in a separate terminal:

python ./Backend/Backend.py
npm run dev

Data

The dataset used in this project is included in the repository. It contains a variety of Arabic handwritten samples to train and evaluate the model. To use the dataset:

Download the Data.rar file.
Extract the contents to the appropriate directory.

Preprocessing

Preprocessing is a critical stage in the development of our Arabic handwriting recognition system. The goal is to enhance the quality of the input data to ensure accurate recognition. Here are the main steps involved in our preprocessing pipeline:

Image Acquisition: We collect images of handwritten Arabic text from various sources, including scanned documents and photos taken by digital cameras.
Geometric Correction: We correct distortions and warping in the images. Techniques like Hough Line Transform and DocTr (Document Image Transformer) are used to straighten the text lines.
Noise Removal: We apply filters to remove noise and enhance the clarity of the text. This includes techniques like Gaussian blur and median filtering.

Unwarped and filtered image
Segmentation: We segment the images into paragraphs, lines, and individual characters. This involves methods like histogram projection and CRAFT (Character Region Awareness for Text Detection).

Segmented text using CRAFT
Normalization: We normalize the images to a fixed size (64x64) and rescale the pixel values to the range [0, 1] by dividing by 255.0.
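The normalization step can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `normalize_glyph` is hypothetical, and the nearest-neighbour resize done with plain NumPy indexing is an assumption (the real pipeline may use an image library with a different interpolation method). Only the 64x64 target size and the division by 255.0 come from the description above.

```python
import numpy as np

def normalize_glyph(image: np.ndarray, size: int = 64) -> np.ndarray:
    """Resize a grayscale uint8 image to size x size and scale pixels to [0, 1]."""
    if image.ndim != 2:
        raise ValueError("expected a 2-D grayscale image")
    height, width = image.shape
    # Nearest-neighbour resize (assumed): pick one source row/column per target pixel.
    rows = np.arange(size) * height // size
    cols = np.arange(size) * width // size
    resized = image[rows[:, None], cols[None, :]]
    # Rescale 8-bit intensities to floats in [0, 1].
    return resized.astype(np.float32) / 255.0

sample = np.full((120, 90), 255, dtype=np.uint8)  # synthetic all-white image
normalized = normalize_glyph(sample)
print(normalized.shape, normalized.dtype)
```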

Final Results
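For the segmentation step, line splitting by histogram projection can be illustrated with a short sketch. It assumes a binarized image where ink pixels are 1 and background pixels are 0. The function name `split_lines` and the `min_ink` threshold are hypothetical, and the project's own implementation (which also relies on CRAFT) may differ.

```python
from typing import List, Tuple

import numpy as np

def split_lines(binary: np.ndarray, min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (top, bottom) row ranges of text lines; bottom is exclusive."""
    # Horizontal projection profile: amount of ink in every row.
    profile = binary.sum(axis=1)
    has_ink = profile >= min_ink
    lines = []
    start = None
    for row, inked in enumerate(has_ink):
        if inked and start is None:
            start = row                     # a text line begins
        elif not inked and start is not None:
            lines.append((start, row))      # the line ended on the previous row
            start = None
    if start is not None:                   # line touching the bottom edge
        lines.append((start, len(has_ink)))
    return lines

page = np.zeros((10, 6), dtype=np.uint8)
page[1:3, 1:5] = 1   # first synthetic text line
page[6:9, 0:4] = 1   # second synthetic text line
print(split_lines(page))  # [(1, 3), (6, 9)]
```

The same idea applied to column sums (`axis=0`) inside each line gives a rough split into words or sub-words.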

AI Training

The AI training phase involves developing and training deep learning models to recognize and digitize handwritten Arabic text. Here are the key components of our AI training process:

Prototype Model Experimentation: We first experimented with three architectures on a small portion of the data (15,477 samples) to identify the most effective model for our Arabic handwriting recognition system. The table below compares the results:

Architecture	CER	Accuracy
EfficientNet-B1	7.3%	92.7%
VGG19	5.4%	94.6%
ResNet152	2.96%	97.04%

ResNet152 Performance Throughout the Epochs

Dataset Preparation: We use a combination of the Arabic Alphabet Character dataset and the KHATT dataset. The combined final dataset includes 108,619 samples.
Data Augmentation: To improve the robustness of our model, we apply data augmentation techniques such as rotation, translation, and scaling.
Model Architecture: We use the ResNet50V2 model, pre-trained on the Arabic Alphabet Character dataset. We then continue training on the KHATT dataset using the training techniques listed below.
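The rotation, translation, and scaling augmentations mentioned above can be combined into a single random affine transform. The sketch below is illustrative only: the helper names `random_affine` and `warp`, the parameter ranges (±10 degrees, ±4 pixels, 0.9 to 1.1 scale), and the nearest-neighbour sampling are all assumptions, not values taken from the project.

```python
import math
import random

import numpy as np

def random_affine(rng, max_deg=10.0, max_shift=4.0, scale_range=(0.9, 1.1)):
    """Sample a 2x3 affine matrix combining rotation, scaling, and translation."""
    angle = math.radians(rng.uniform(-max_deg, max_deg))
    scale = rng.uniform(*scale_range)
    tx = rng.uniform(-max_shift, max_shift)
    ty = rng.uniform(-max_shift, max_shift)
    cos_a, sin_a = math.cos(angle) * scale, math.sin(angle) * scale
    return np.array([[cos_a, -sin_a, tx],
                     [sin_a, cos_a, ty]], dtype=np.float64)

def warp(image, matrix):
    """Apply the affine map about the image centre (nearest-neighbour sampling)."""
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    # Invert the forward transform so every output pixel looks up its source pixel.
    inv = np.linalg.inv(np.vstack([matrix, [0.0, 0.0, 1.0]]))
    ys, xs = np.mgrid[0:h, 0:w]
    x_c, y_c = xs - cx, ys - cy
    src_x = np.rint(inv[0, 0] * x_c + inv[0, 1] * y_c + inv[0, 2] + cx).astype(int)
    src_y = np.rint(inv[1, 0] * x_c + inv[1, 1] * y_c + inv[1, 2] + cy).astype(int)
    # Pixels whose source falls outside the image stay at 0 (background).
    inside = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    out = np.zeros_like(image)
    out[inside] = image[src_y[inside], src_x[inside]]
    return out

rng = random.Random(0)                        # seeded for repeatable augmentation
glyph = np.zeros((64, 64), dtype=np.float32)
glyph[20:44, 28:36] = 1.0                     # synthetic vertical stroke
augmented = warp(glyph, random_affine(rng))
print(augmented.shape)
```

In a training framework, the same effect is usually obtained with the framework's built-in augmentation layers rather than a hand-written warp.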

ResNet50V2Alphabet
ResNet50V2 Performance on Alphabet Dataset

Training Techniques
- Optimizer: We use the Adam optimizer with specific parameters for efficient training.
- Learning Rate Scheduler: A cosine learning rate scheduler is employed to adjust the learning rate dynamically during training.
- Training Duration: The model is trained across 70 epochs to ensure convergence and optimal performance.
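As a rough illustration of the cosine learning rate schedule over the 70 epochs, the sketch below computes the learning rate for a given epoch. The function name `cosine_lr`, the base learning rate of 1e-3, and the minimum of 0.0 are assumptions for the example. The project's actual Adam parameters are not stated here.

```python
import math

def cosine_lr(epoch: int, total_epochs: int = 70,
              base_lr: float = 1e-3, min_lr: float = 0.0) -> float:
    """Cosine-annealed learning rate for a 0-based epoch index."""
    # Clamp so that out-of-range epochs do not leave the [min_lr, base_lr] band.
    progress = min(max(epoch, 0), total_epochs) / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

for epoch in (0, 35, 70):
    print(epoch, cosine_lr(epoch))
```

The rate starts at `base_lr`, passes through roughly half of it at the midpoint, and decays smoothly to `min_lr` at the final epoch.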
Evaluation Metrics: We use Character Error Rate (CER) and accuracy as our primary evaluation metrics. Our final model achieved a CER of 3% and an accuracy of 97% on the test set.
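For reference, CER is commonly computed as the character-level edit (Levenshtein) distance between the predicted and reference text, divided by the number of reference characters. The sketch below follows that common definition. The function names are hypothetical, and whether the project computes CER in exactly this way is an assumption. The accuracy values reported in this README equal 100% minus the CER.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance between two strings, computed row by row."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            current.append(min(previous[j] + 1,      # deletion
                               current[j - 1] + 1,   # insertion
                               substitution))        # match or substitution
        previous = current
    return previous[-1]

def character_error_rate(references, hypotheses) -> float:
    """Total edit distance divided by total reference length."""
    errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total = sum(len(r) for r in references)
    if total == 0:
        raise ValueError("references contain no characters")
    return errors / total

# One missing letter out of four reference characters.
print(character_error_rate(["كتاب"], ["كتب"]))  # 0.25
```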

ResNet50V2 (Pre-Trained on Alphabet) Performance Throughout the Epochs

Full System Diagram

Diagram
The Flow of Our System

Digital-Ḍād ض-الرقمية

Our application provides a variety of services and models:

MainPage
Main Page of Our Application

Handwriting OCR: Our model takes an image of a handwritten paragraph, preprocesses it, and then performs OCR on the resulting segmented words or sub-words.

OCRModel
Handwriting OCR Model

Exam Grading: We pass questions with their corresponding answers to the model. Using our OCR methodology alongside an LLM accessed through an API, we scan the answers written by a student and assign a grade.

ExamGrader
Exam Grader Model

Document Scanner: Scans an image of a handwritten document, performs OCR on it, and then assembles the resulting text using an LLM to create a fully digitized version of the document.

Contributing

Contributions are welcome! Please follow these steps to contribute:

Fork the repository.
Create a new branch (git checkout -b feature/your-feature).
Commit your changes (git commit -m 'Add some feature').
Push to the branch (git push origin feature/your-feature).
Open a pull request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgements

We would like to thank all contributors and the community for their support and feedback. Special thanks to the authors of the datasets and tools used in this project.

Feel free to reach out with any questions or feedback. Let's make Arabic handwriting recognition more accessible and accurate together!

🌟 Happy Coding! 🌟

🚧 This repo is still incomplete and is under construction 🚧
