georgeretsi / defHTR

Deformation-invariant line-level Handwritten Text Recognition (HTR) using a convolutional-only architecture.
MIT License
cnn-pytorch deformable-convolutional-networks deformation-invariant handwritten-text-recognition htr iam-dataset pytorch

defHTR

PyTorch code for deformation-invariant line-level Handwritten Text Recognition, as proposed in the paper (accepted to ICIP'21).

Motivation: Image deformations under simple geometric restrictions are crucial for Handwritten Text Recognition (HTR), since different writing styles can be viewed as simple geometric deformations of the same textual elements.
Contributions: 1) Exploration of existing strategies for ensuring deformation invariance, including spatial transformers and deformable convolutions, in the context of text recognition. 2) Introduction of a new deformation-based algorithm, inspired by adversarial learning, which aims to reduce character output uncertainty at evaluation time.

DNN Architecture: A convolutional-only HTR system is presented (see paper): a convolutional backbone transforms each image into a sequence of feature vectors, which is then fed into a cascade of 1-D convolutional layers. The architecture can be modified by changing the cnn_cfg and cnn_top variables in config.py. Specifically, the CNN backbone consists of multiple stacks of ResBlocks, and the default setting cnn_cfg = [(2, 32), 'M', (4, 64), 'M', (6, 128), 'M', (2, 256)] is interpreted as follows: the first stack consists of 2 ResBlocks with 32 output channels, the second of 4 ResBlocks with 64 output channels, and so on. The head, which consists of three 1-D convolutional layers, can be modified through the cnn_top variable, which controls the number of output channels in these layers.
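
To make the cnn_cfg convention concrete, here is a minimal pure-Python sketch that expands the default setting into a flat layer plan. It is not the repository's parsing code: the helper name describe_backbone, the single-channel input, and the reading of 'M' as a pooling step between stacks are all assumptions made for illustration.

```python
# Illustrative sketch only; not the repository's actual parsing code.
cnn_cfg = [(2, 32), 'M', (4, 64), 'M', (6, 128), 'M', (2, 256)]  # default from config.py


def describe_backbone(cfg, in_channels=1):
    """Expand a cnn_cfg-style list into a flat list of layer descriptions.

    Assumptions (not confirmed by the source): 'M' marks a pooling step
    between stacks, and the input is a single-channel (grayscale) image.
    """
    layers = []
    channels = in_channels
    for item in cfg:
        if item == 'M':
            layers.append(('pool',))
        else:
            n_blocks, out_channels = item
            for _ in range(n_blocks):
                # Each entry records (kind, input channels, output channels).
                layers.append(('resblock', channels, out_channels))
                channels = out_channels
    return layers


plan = describe_backbone(cnn_cfg)
n_resblocks = sum(1 for layer in plan if layer[0] == 'resblock')
n_pools = sum(1 for layer in plan if layer[0] == 'pool')
print(n_resblocks, n_pools, plan[-1])
```

Under these assumptions, only the first ResBlock of each stack changes the channel count; the remaining blocks in the stack keep it fixed. Adding or removing a tuple in cnn_cfg therefore changes the depth and width of one stack without touching the others.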

Selected Features:

File valid_deforms.py contains:

Installation:

python3 -m venv venv_defhtr         # Create a virtual environment.. (optional)
source venv_defhtr/bin/activate     # ..and activate the virtual environment (optional)
pip3 install --upgrade pip          # Upgrade pip
pip3 install -r requirements.txt    # Install required libraries
cat utils/iam_config.py             # Inspect this file, then edit it to point to your local IAM folder
python3 train_htr.py                # Have fun!!

Note: The local paths of the IAM dataset (https://fki.tic.heia-fr.ch/databases/iam-handwriting-database) are hardcoded in utils/iam_config.py, so edit them to match your setup before training.