Language detection from images is relatively difficult. Adobe and ABBYY OCR require you already know the language of the document before you start OCR.
REQUEST
Please use your document generator to generate documents in different languages.
Ideally, you would even mix different languages.
Release a pretrained model that estimates the percentage of each language in a particular document image.
MOTIVATION
Language detection from images is relatively difficult. Adobe and ABBYY OCR require you already know the language of the document before you start OCR.
REQUEST