This software implements an OCR system using CNN + RNN + CTCLoss, inspired by the CRNN network.
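For readers new to CTC, here is a minimal sketch of greedy CTC decoding, the usual way to turn per-frame predictions into text. It is an illustration and not necessarily the decoder used in this repository. The function name `ctc_greedy_decode` is hypothetical, and the sketch assumes that label 0 is the CTC blank and that label `i` maps to character `abc[i - 1]`.

```python
BLANK = 0  # assumption: index 0 is the CTC blank symbol


def ctc_greedy_decode(frame_labels, abc):
    """Collapse consecutive repeated labels, drop blanks, map labels to characters.

    frame_labels: per-frame argmax labels (ints) produced by the network.
    abc: string of alphabet symbols; label i corresponds to abc[i - 1] (assumption).
    """
    chars = []
    prev = None
    for label in frame_labels:
        # Emit a character only when the label changes and is not the blank.
        if label != prev and label != BLANK:
            chars.append(abc[label - 1])
        prev = label
    return "".join(chars)


if __name__ == "__main__":
    # The blank between the two 1s keeps the doubled letter.
    print(ctc_greedy_decode([0, 1, 1, 0, 1, 2, 2, 0], "ab"))
```

The blank label is what lets the model output the same character twice in a row: repeated labels are merged unless a blank separates them.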
python ./train.py --help
Train a simple OCR model using the TestDataset data generator. Training takes roughly 60-100 epochs.
python train.py --test-init True --test-epoch 10 --output-dir <path_to_folder_with_snapshots>
Test the trained model in visualization mode.
python test.py --snapshot <path_to_folder_with_snapshots>/crnn_resnet18_10_best --visualize True
Structure of dataset:
<root_dataset_dir>
---- data
-------- <img_filename_0>
...
-------- <img_filename_1>
---- desc.json
Structure of desc.json:
{
"abc": <symbols_in_alphabet>,
"train": [
{
"text": <text_on_image>,
"name": <img_filename>
},
...
{
"text": <text_on_image>,
"name": <img_filename>
}
],
"test": [
{
"text": <text_on_image>,
"name": <img_filename>
},
...
{
"text": <text_on_image>,
"name": <img_filename>
}
]
}
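The layout above can be produced with a short script. This is a minimal sketch, not part of the repository: the helper names `build_desc`, `missing_images`, and `write_desc` are hypothetical, and it assumes that `abc` is a plain string of symbols and that each sample is given as an (image filename, text) pair.

```python
import json
from pathlib import Path


def build_desc(abc, train_items, test_items):
    """Return a dict in the desc.json layout.

    abc: string of alphabet symbols (assumed to be a plain string).
    train_items, test_items: iterables of (image_filename, text) pairs.
    """
    def to_entries(items):
        return [{"text": text, "name": name} for name, text in items]

    return {
        "abc": abc,
        "train": to_entries(train_items),
        "test": to_entries(test_items),
    }


def missing_images(root, desc):
    """List image names referenced in desc that are not files in <root>/data."""
    data_dir = Path(root) / "data"
    names = [entry["name"] for split in ("train", "test") for entry in desc[split]]
    return [name for name in names if not (data_dir / name).is_file()]


def write_desc(root, desc):
    """Write desc as <root>/desc.json and return the path of the written file."""
    path = Path(root) / "desc.json"
    path.write_text(json.dumps(desc, ensure_ascii=False, indent=2), encoding="utf-8")
    return path
```

Calling `missing_images` before training is a cheap way to catch filenames in desc.json that do not exist in the data folder.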
Train a simple OCR model on a custom dataset.
python train.py --test-init True --test-epoch 10 --output-dir <path_to_folder_with_snapshots> --data-path <path_to_custom_dataset>
Test the trained model on the custom dataset in visualization mode.
python test.py --snapshot <path_to_folder_with_snapshots>/crnn_resnet18_10_best --visualize True --data-path <path_to_custom_dataset>