Generate text line images for training deep learning OCR models (e.g. CRNN).
Supports generating an lmdb dataset compatible with PaddleOCR; see Dataset.
Run the following command to generate images using the example data:
git clone https://github.com/oh-my-ocr/text_renderer
cd text_renderer
python3 setup.py develop
pip3 install -r docker/requirements.txt
python3 main.py \
--config example_data/example.py \
--dataset img \
--num_processes 2 \
--log_period 10
The data is generated in the example_data/output directory. A labels.json file contains all annotations in the following format:
{
    "labels": {
        "000000000": "test",
        "000000001": "text2"
    },
    "sizes": {
        "000000000": [120, 32],
        "000000001": [128, 32]
    },
    "num-samples": 2
}
You can also use --dataset lmdb to store the images in an LMDB file; see Dataset for a description of the keys it contains.
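If you pick the lmdb format, a quick way to see what was written is to list the keys with the lmdb Python package; this is only an illustrative sketch, and the output path is an assumption based on the example config:

import lmdb

# Assumed location of the generated LMDB (the save_dir used by the example config).
env = lmdb.open("example_data/output", readonly=True, lock=False)
with env.begin() as txn:
    # Print every stored key and the size of its value.
    for key, value in txn.cursor():
        print(key.decode("utf-8", errors="replace"), len(value), "bytes")
env.close()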
You can check the config file example_data/example.py to learn how to use text_renderer, or follow the Quick Start below to learn how to set up a configuration.
For the Quick Start you need a font file (.ttf, .otf and .ttc formats are supported), a corpus text file, and a background image.
You can download pre-prepared resource files for this Quick Start from here:
Save these resource files in the same directory:
workspace
├── bg
│   └── background.png
├── corpus
│   └── eng_text.txt
└── font
    └── simsun.ttf
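If you like, you can sanity-check the layout before writing the config; this is just a small helper sketch (the file names follow the tree above, so substitute your own resources if they differ):

from pathlib import Path

# Expected Quick Start layout, matching the tree shown above.
workspace = Path("workspace")
for rel in ("bg/background.png", "corpus/eng_text.txt", "font/simsun.ttf"):
    path = workspace / rel
    print(("ok      " if path.is_file() else "MISSING ") + str(path))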
Create a config.py file in the workspace directory. A configuration file must define a configs variable, which is a list of GeneratorCfg. The complete configuration file is as follows:
import os
from pathlib import Path

from text_renderer.effect import *
from text_renderer.corpus import *
from text_renderer.config import (
    RenderCfg,
    NormPerspectiveTransformCfg,
    GeneratorCfg,
    SimpleTextColorCfg,
)

CURRENT_DIR = Path(os.path.abspath(os.path.dirname(__file__)))


def story_data():
    return GeneratorCfg(
        num_image=10,
        save_dir=CURRENT_DIR / "output",
        render_cfg=RenderCfg(
            bg_dir=CURRENT_DIR / "bg",
            height=32,
            perspective_transform=NormPerspectiveTransformCfg(20, 20, 1.5),
            corpus=WordCorpus(
                WordCorpusCfg(
                    text_paths=[CURRENT_DIR / "corpus" / "eng_text.txt"],
                    font_dir=CURRENT_DIR / "font",
                    font_size=(20, 30),
                    num_word=(2, 3),
                ),
            ),
            corpus_effects=Effects(Line(0.9, thickness=(2, 5))),
            gray=False,
            text_color_cfg=SimpleTextColorCfg(),
        ),
    )


configs = [story_data()]
In the above configuration we have done the following things:
- Pointed the generator at the resources prepared earlier (bg_dir, text_paths and font_dir) and set the font_size range.
- Sampled 2 or 3 words per image from the corpus (num_word=(2, 3)).
- Applied a perspective transform and a line effect (Line(0.9, thickness=(2, 5))).
- Generated color images (gray=False) with a randomly chosen text color via SimpleTextColorCfg().
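Because configs is just a list, a single config file can also drive several generators. The following is a hypothetical variation, not taken from the project docs: it assumes GeneratorCfg and RenderCfg are plain dataclass-style configs whose fields can be reassigned, and that corpus_effects=None means no corpus effects.

# Hypothetical second generator: same corpus and fonts, no corpus effects,
# written to a separate output directory.
def story_data_plain():
    cfg = story_data()
    cfg.save_dir = CURRENT_DIR / "output_plain"   # assumed reassignable field
    cfg.render_cfg.corpus_effects = None          # assumed to disable effects
    return cfg


configs = [story_data(), story_data_plain()]

Each GeneratorCfg then renders its own num_image samples into its own save_dir.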
Run main.py; it only has 4 arguments:
- --config: path to the Python configuration file
- --dataset: output format, img or lmdb
- --num_processes: number of worker processes to use
- --log_period: how often progress is logged
Find all effect/layout config examples at the link.
bg_and_text_mask: Three images of the same width are merged together horizontally; this output can be used to train GAN models such as EraseNet.

# | Name | Example |
---|---|---|
0 | bg_and_text_mask | |
1 | char_spacing_compact | |
2 | char_spacing_large | |
3 | color_image | |
4 | curve | |
5 | dropout_horizontal | |
6 | dropout_rand | |
7 | dropout_vertical | |
8 | emboss | |
9 | extra_text_line_layout | |
10 | line_bottom | |
11 | line_bottom_left | |
12 | line_bottom_right | |
13 | line_horizontal_middle | |
14 | line_left | |
15 | line_right | |
16 | line_top | |
17 | line_top_left | |
18 | line_top_right | |
19 | line_vertical_middle | |
20 | padding | |
21 | perspective_transform | |
22 | same_line_layout_different_font_size | |
23 | vertical_text |
Set up Commitizen for commit messages.
Build image
docker build -f docker/Dockerfile -t text_renderer .
The config file is provided via the CONFIG environment variable. In the example.py file, data is generated in the example_data/output directory, so we map this directory to the host.
docker run --rm \
-v `pwd`/example_data/docker_output/:/app/example_data/output \
--env CONFIG=/app/example_data/example.py \
--env DATASET=img \
--env NUM_PROCESSES=2 \
--env LOG_PERIOD=10 \
text_renderer
Start font viewer
streamlit run tools/font_viewer.py -- web /path/to/fonts_dir
Build the docs:
cd docs
make html
open _build/html/index.html
If you use text_renderer in your research, please consider citing it with the following BibTeX entry.
@misc{text_renderer,
    author = {oh-my-ocr},
    title = {text_renderer},
    howpublished = {\url{https://github.com/oh-my-ocr/text_renderer}},
    year = {2021}
}