Closed (akmalkadi closed 3 years ago)
No, to both questions. You are training on line images, and the transition probabilities are estimated at the image level. It is neither useful nor necessary to try to modify the prior in your training data. You do, however, have to ensure that every character you later want to recognize appears in your training data with sufficient frequency.
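If it helps, here is a minimal sketch for checking character coverage in the ground-truth text lines. It assumes the transcriptions are already loaded as a list of strings. The names `char_frequencies` and `rare_characters` and the `min_count` threshold are hypothetical, not part of any library, and what counts as "sufficient" frequency depends on your data.

```python
from collections import Counter


def char_frequencies(lines):
    """Count how often each character occurs across all transcription lines."""
    counts = Counter()
    for line in lines:
        counts.update(line)  # a str is iterated character by character
    return counts


def rare_characters(lines, required_chars, min_count=50):
    """Return {char: count} for required characters seen fewer than min_count times.

    min_count is an arbitrary illustrative threshold, not a recommended value.
    """
    counts = char_frequencies(lines)
    return {
        ch: counts.get(ch, 0)
        for ch in required_chars
        if counts.get(ch, 0) < min_count
    }


if __name__ == "__main__":
    # Toy transcriptions standing in for the text side of the line/image pairs.
    lines = ["the cat", "the hat", "a quiz"]
    print(rare_characters(lines, required_chars="tz", min_count=2))
```

Characters reported here are the ones worth adding more lines for. How often a whole word repeats does not matter for this check.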
Greetings,
Many tokens are redundant in a dataset (pairs of text lines and images of those lines). Sometimes a word appears in more than 10k lines. Do I need only one appearance of each token? Will a token that appears more than 10k times get higher priority in recognition?