jbarth-ubhd closed this issue 1 year ago
Can you please provide a valid example of such a case? IMHO an empty txt is an error that should be fixed, not accepted (with a workaround).
For example this scan:
is recognized as " r " (including a space before and after the r).
I first corrected this to "   " (three spaces, assuming the space before and after is mandatory).
So images that do not actually contain text should not be used for training, and there should be no spaces before or after the letters?
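If such lines are to be kept out of training, they could be screened for beforehand. The following is only a rough sketch, not part of generate_line_box.py: it assumes the ground truth is stored in files ending in `.gt.txt` next to the line images, and the helper names `find_blank_ground_truth` and `has_outer_whitespace` are made up for illustration.

```python
from pathlib import Path


def find_blank_ground_truth(directory, suffix=".gt.txt"):
    """Return ground-truth files whose text is empty or whitespace-only.

    The ".gt.txt" suffix is an assumption about the file layout.
    """
    blank = []
    for path in sorted(Path(directory).rglob(f"*{suffix}")):
        text = path.read_text(encoding="utf-8")
        # strip() removes spaces, tabs and newlines; nothing left means no text
        if not text.strip():
            blank.append(path)
    return blank


def has_outer_whitespace(text):
    """Report whether a transcription starts or ends with whitespace."""
    line = text.rstrip("\r\n")  # ignore the trailing line break itself
    return line != line.strip()
```

With this, `find_blank_ground_truth("data/ground-truth")` (the path is a placeholder) would list the files to review or exclude, and `has_outer_whitespace(" r ")` would flag the transcription from the example above.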
Sometimes the test images do not contain text but horizontal dividers etc., so "no text", but then generate_line_box.py fails.
I'd suggest this: