How can I assign a token to all the characters that are not in opt.character?

clovaai / deep-text-recognition-benchmark

Text recognition (optical character recognition) with deep learning methods, ICCV 2019

Apache License 2.0

3.77k stars, 1.11k forks

Hello community. I am working on a project to recognize only numbers.

My recognizer is trained with the characters "0123456789". I also built a pipeline that chains a text detector with the deep-text-recognition-benchmark model to perform OCR.

When the text contains only numbers it works fine, but when the text detector captures non-numeric content, the deep-text-recognition-benchmark model assigns digits to the letters. This makes sense, since digits are the only characters it was trained to output.

For instance, for the captured text "H23", I would like to get something like "[UNK]23" or "_23" instead of "723".
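To make the desired output concrete, here is a minimal sketch of the kind of post-processing I have in mind. It assumes a per-character confidence can be obtained from the recognizer, which I have not verified. The function name `mask_low_confidence`, the threshold, and the example probabilities are all made up for illustration.

```python
from typing import Sequence


def mask_low_confidence(
    chars: str,
    probs: Sequence[float],
    threshold: float = 0.5,
    unk_token: str = "[UNK]",
) -> str:
    """Replace every predicted character whose confidence is below
    `threshold` with `unk_token`.

    `chars` is the decoded prediction and `probs` holds one
    (hypothetical) confidence value per predicted character.
    """
    if len(chars) != len(probs):
        raise ValueError("chars and probs must have the same length")
    return "".join(
        c if p >= threshold else unk_token for c, p in zip(chars, probs)
    )


# Hypothetical case: "H23" was read as "723", with low confidence on the "7".
print(mask_low_confidence("723", [0.31, 0.98, 0.97]))                 # [UNK]23
print(mask_low_confidence("723", [0.31, 0.98, 0.97], unk_token="_"))  # _23
```

I am not sure this is enough on its own, because a model trained only on digits might still be confident when it sees a letter.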

Could you give me a suggestion?

How can I assign a token to all the characters that are not in opt.character? #282