bityigoss / mtl-text-recognition

multi-task learning for text recognition with joint CTC-attention

mtl-text-recognition

Multi-task learning for text recognition with joint CTC-attention.
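For orientation, joint CTC-attention training is usually written as a weighted sum of the two losses computed over a shared encoder. The formula below is that common formulation, not a statement of what this repository implements: the weight $\lambda$ and the symbol names are assumptions, and this README does not say how the two terms are actually weighted.

```latex
\mathcal{L}_{\mathrm{MTL}}
  = \lambda \, \mathcal{L}_{\mathrm{CTC}}
  + (1 - \lambda) \, \mathcal{L}_{\mathrm{att}},
\qquad 0 \le \lambda \le 1
```

With $\lambda = 1$ this reduces to plain CTC training, and with $\lambda = 0$ to attention-only training.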

Update features

Training and evaluation

  1. Train CRNN model
    CUDA_VISIBLE_DEVICES=0 python train.py \
    --train_data data/synch/lmdb_train \
    --valid_data data/synch/lmdb_val \
    --select_data / \
    --batch_ratio 1 \
    --sensitive \
    --num_iter 400000 \
    --output_channel 512 \
    --hidden_size 256 \
    --Transformation None \
    --FeatureExtraction ResNet \
    --SequenceModeling BiLSTM \
    --Prediction CTC \
    --experiment_name none_resnet_bilstm_ctc \
    --continue_model saved_models/pretrained_model.pth
  2. Train CTC-Attention model
    CUDA_VISIBLE_DEVICES=0 python train.py \
    --train_data data/synch/lmdb_train \
    --valid_data data/synch/lmdb_val \
    --select_data / \
    --batch_ratio 1 \
    --sensitive \
    --num_iter 400000 \
    --output_channel 512 \
    --hidden_size 256 \
    --Transformation None \
    --FeatureExtraction ResNet \
    --SequenceModeling BiLSTM \
    --Prediction CTC \
    --mtl \
    --without_prediction \
    --experiment_name none_resnet_bilstm_ctc \
    --continue_model saved_models/pretrained_model.pth
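
The two commands above differ only in the `--mtl` and `--without_prediction` flags. The sketch below makes that explicit by building both argument lists from one shared set of options. Every flag and value is copied from the commands above; the helper `build_train_command` and the constant names are hypothetical and are not part of this repository.

```python
import shlex
from typing import List

# Options shared by both commands (values copied from the README).
COMMON_ARGS = [
    "--train_data", "data/synch/lmdb_train",
    "--valid_data", "data/synch/lmdb_val",
    "--select_data", "/",
    "--batch_ratio", "1",
    "--sensitive",
    "--num_iter", "400000",
    "--output_channel", "512",
    "--hidden_size", "256",
    "--Transformation", "None",
    "--FeatureExtraction", "ResNet",
    "--SequenceModeling", "BiLSTM",
    "--Prediction", "CTC",
]

# Flags that appear only in the CTC-Attention command.
MTL_ARGS = ["--mtl", "--without_prediction"]

# Trailing options, identical in both commands as written in the README.
TAIL_ARGS = [
    "--experiment_name", "none_resnet_bilstm_ctc",
    "--continue_model", "saved_models/pretrained_model.pth",
]


def build_train_command(joint: bool) -> List[str]:
    """Return the argv for train.py; joint=True adds the multi-task flags."""
    extra = MTL_ARGS if joint else []
    return ["python", "train.py", *COMMON_ARGS, *extra, *TAIL_ARGS]


if __name__ == "__main__":
    crnn = build_train_command(joint=False)
    joint = build_train_command(joint=True)
    # Print the joint command as a single shell-quoted line.
    print(shlex.join(joint))
    # Show which arguments the joint command adds.
    print("only in joint command:", [a for a in joint if a not in crnn])
```

To launch a run from Python instead of the shell, the list can be passed to `subprocess.run(cmd, env={**os.environ, "CUDA_VISIBLE_DEVICES": "0"})`, which pins the GPU the same way as the commands above. Both commands use the same `--experiment_name` as written, so the two runs may need different names if their outputs should be kept apart.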

Acknowledgements

  1. This implementation is mainly based on the deep-text-recognition-benchmark repository.
  2. SynthText generation is mainly based on TextRecognitionDataGenerator.