Improve recognition accuracy for long text lines

Text recognition preprocessing currently resizes every input line to a height of 64px and scales the width proportionally, capped at a maximum of 800px. The 800px cap was used during training to bound the memory usage of batches.
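
As a rough illustration of the resizing rule described above, here is a minimal sketch. The function name `resized_line_size` and its parameters are hypothetical and are not taken from the ocrs source; the rounding behaviour is also an assumption.

```rust
/// Compute the (width, height) a text line is resized to before recognition.
///
/// Hypothetical helper for illustration only; not the actual ocrs implementation.
fn resized_line_size(
    src_width: u32,
    src_height: u32,
    target_height: u32,
    max_width: u32,
) -> (u32, u32) {
    assert!(src_height > 0, "source line must have a non-zero height");

    // Scale the width by the same factor that maps the source height
    // to the target height (rounding to the nearest pixel is an assumption).
    let scaled = (src_width as f64 * target_height as f64 / src_height as f64).round() as u32;

    // Cap the width at `max_width`, but keep at least one pixel.
    // Lines whose scaled width exceeds the cap end up squashed horizontally.
    let width = scaled.min(max_width).max(1);

    (width, target_height)
}
```

With a target height of 64 and a cap of 800, a 300px x 30px line keeps its aspect ratio and becomes 640px x 64px, while a 1310px x 30px line hits the cap and becomes 800px x 64px.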

Saving the preprocessed text line images to lines/ with the new --text-line-images option shows that the max-width limit can squash text too much, causing characters or spaces in long lines to be missed, or letters to be misidentified.

Although the recognition model was trained with a max input width of 800px, it generalizes to longer sequence lengths, so wider images can be used at inference time. A quick test suggests that doing so fixes the recognition errors for the image below.

Input image:

This is a screenshot from a featured Wikipedia article:

Example preprocessed line:

line-0

Recognition output:

With default 800px limit:

The Benty Grange hanging bowlis a fragmentary Anglo-Saxon artitact trom the seventh century AD. Al
thaf remains are parts of twn escutcheons: bronze frames hat are usually circular and elaborately

The line in the source image is ~1310px x ~30px. Scaled proportionally to a height of 64px it would be roughly 2800px wide, so the 800px limit squashes it horizontally by a factor of about 3.5.
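
The amount of horizontal squashing for this example can be estimated with a small calculation. This is only a sketch: `horizontal_squash` is a hypothetical helper and not part of ocrs, and the dimensions are the approximate ones quoted above.

```rust
/// Factor by which a line is compressed horizontally when its proportionally
/// scaled width exceeds `max_width`. A value of 1.0 means no squashing.
///
/// Hypothetical helper for illustration only; not part of ocrs.
fn horizontal_squash(src_width: f64, src_height: f64, target_height: f64, max_width: f64) -> f64 {
    // Width the line would have if the aspect ratio were preserved.
    let proportional_width = src_width * target_height / src_height;

    // Ratio of the proportional width to the cap; never below 1.0,
    // since lines narrower than the cap are not stretched.
    (proportional_width / max_width).max(1.0)
}
```

For the ~1310px x ~30px line, `horizontal_squash(1310.0, 30.0, 64.0, 800.0)` is about 3.5, and `horizontal_squash(1310.0, 30.0, 64.0, 1600.0)` is about 1.75. A 1600px limit therefore still compresses this line, but only half as much.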

With 1600px limit (set here):

The Benty Grange hanging bowl is a fragmentary Anglo-Saxon artifact from the seventh century AD, All
that remains are parts of two escutcheons: bronze frames that are usually circular and elaborately

robertknight / ocrs

Improve recognition accuracy for long text lines #31