Closed NeilduToit13 closed 2 years ago
Don't know, man. I think in this case the only way to get an answer is to run the full training and then test the result.
Is it possible for my line images and single-line plain text to consist of just 1 word? Will this impact training at all? Or do I have to use a line with several words for the ground truth? Thank you!
Sure, lines can be a single word or even a single character (like a page number), but the less content there is in a line, the less context is available for the neural network to learn from. Some very short lines don't hurt the training AFAICT, but the overall training set should be representative of the data to be recognized later on.
The only thing really forbidden is ground truth text containing newlines, i.e. multi-line "line" images.
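For anyone who wants to check their ground truth against the single-line rule before training, here is a minimal sketch. It is not part of the training tooling; the function name `check_ground_truth` is made up for illustration, and the `.gt.txt` suffix for transcription files is an assumption, so adjust it to whatever naming your files use.

```python
from pathlib import Path


def check_ground_truth(folder, suffix=".gt.txt"):
    """Return (multi_line, empty): names of transcription files that
    contain line breaks in their text, or no text at all.

    `suffix` is an assumed naming convention, not taken from this thread.
    """
    multi_line, empty = [], []
    for path in sorted(Path(folder).glob(f"*{suffix}")):
        text = path.read_text(encoding="utf-8")
        # Trailing line breaks are tolerated in this sketch (assumption);
        # only line breaks inside the remaining text are flagged.
        body = text.rstrip("\r\n")
        if not body.strip():
            empty.append(path.name)
        elif "\n" in body or "\r" in body:
            multi_line.append(path.name)
    return multi_line, empty
```

Calling `check_ground_truth("data/MODEL_NAME-ground-truth")` returns two lists of file names. Single-word or single-character transcriptions pass without complaint; only multi-line and empty ones are reported.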
performing the full training followed by tests
That is always best, obviously, and feel free to share your findings here as well if you do.
@NeilduToit13, I think your question has been answered, so I'm closing this issue.
Regarding: "Place ground truth consisting of line images and transcriptions in the folder data/MODEL_NAME-ground-truth." and "Transcriptions must be single-line plain text"
Is it possible for my line images and single-line plain text to consist of just 1 word? Will this impact training at all? Or do I have to use a line with several words for the ground truth? Thank you!