Open vaibhavjussspacetech opened 4 years ago
Hello,
Sorry for the late reply.
I just updated our README and added the structure of the data folder, as shown below.
data
├── gt.txt
└── test
    ├── word_1.png
    ├── word_2.png
    ├── word_3.png
    └── ...
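To make the role of gt.txt concrete, here is a minimal sketch of how the ground-truth file could be written for a folder laid out like the one above. It assumes each line of gt.txt is an image path (relative to the data folder) and its label, separated by a tab; please confirm that format against the README before relying on it. The helper name `write_gt_file` is hypothetical and not part of the repository.

```python
from pathlib import Path


def write_gt_file(data_dir, labels):
    """Write data_dir/gt.txt with one 'relative/path<TAB>label' line per image.

    `labels` maps an image path (relative to data_dir) to its text label.
    The tab-separated line format is an assumption; check the README.
    """
    data_dir = Path(data_dir)
    lines = []
    for rel_path, label in sorted(labels.items()):
        # Every labelled image should already exist inside the data folder.
        if not (data_dir / rel_path).is_file():
            raise FileNotFoundError(f"missing image: {rel_path}")
        # A tab or newline inside a label would corrupt the line format.
        if "\t" in label or "\n" in label:
            raise ValueError(f"label for {rel_path} contains a tab or newline")
        lines.append(f"{rel_path}\t{label}\n")
    gt_path = data_dir / "gt.txt"
    gt_path.write_text("".join(lines), encoding="utf-8")
    return gt_path
```

So the data folder is not empty: it holds the cropped word images, and gt.txt simply records what text each image shows.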
You can generate the datasets with your fonts by using text generation engines such as MJSynth and SynthText.
Hope it helps.
Best.
Hi, I am new to the field of text recognition. I went through the "When you need to train on your own dataset or Non-Latin language datasets." section, in which a new dataset is generated by calling "create_lmdb_dataset.py" with two inputs: the path to the data and the path to the ground truth. But I can't understand what this data folder contains. Is it a set of natural images containing text, or is it simply an "empty" folder? And on what basis can I generate the ground truth?
I need to create a dataset containing digits as well as characters, rendered in fonts such as "Arial", "MICR", etc.
I have the .ttf files for all of these fonts, and I would like to use them to generate a dataset that will then be used for transfer learning.
Please guide me. Thanks in advance.