juliangilbey closed this issue 4 years ago
@juliangilbey Thanks for the suggestions!
However, I would not say that the code is “misleading”; the documentation clearly states:
Transcriptions must be single-line plain text ...
There are suggestions for multi-line training by @Shreeshrii which are also mentioned in the README.
The generate_line_box.py script DOES expect single-line GT though, so raising an error if there's more than one line in the file would not hurt IMHO.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Thanks, stale bot! The issue still remains, though: just because the documentation says not to use multi-line ground truth doesn't mean that people will notice that line of the documentation or remember it. Generating an error message would be much more helpful.
@juliangilbey @kba Agreed.
Hi!
Because the generated box files are useless in the case of a multiline image, the current code is very misleading and breaks training if fed multiline ground truths/images. (I only just realised why my training is not working...)
My suggestion is that the code is modified in two ways:
(1) If len(lines) > 1, then exit with a suitably informative message.
(2) Remove the for loop, and replace "line.strip" with "lines[0].strip" in the normalize line.
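The two changes above could look roughly like the sketch below. This is only an illustration, not the actual code of generate_line_box.py: the function name read_single_line_gt is made up, and I am assuming the transcription is read as UTF-8 and NFC-normalized.

```python
import io
import sys
import unicodedata


def read_single_line_gt(txt_path):
    """Return the NFC-normalized transcription, or exit if it spans several lines.

    Hypothetical helper for illustration; not part of generate_line_box.py.
    """
    with io.open(txt_path, "r", encoding="utf-8") as f:
        # Strip surrounding whitespace first so that a trailing newline
        # is not counted as a second line.
        lines = f.read().strip().split("\n")

    # (1) Refuse multi-line ground truth with an informative message.
    if len(lines) > 1:
        sys.exit(
            "ERROR: %s contains %d lines; ground truth must be a single line"
            % (txt_path, len(lines))
        )

    # (2) No loop needed: there is exactly one line left to normalize.
    return unicodedata.normalize("NFC", lines[0].strip())
```

Calling sys.exit with a string prints the message to stderr and ends the script with a non-zero exit status, so a Makefile-driven training run would stop at the offending file instead of silently producing a useless box file.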
Best wishes,
Julian