LAHTeR / htr-quality-classifier

Detect quality of (digitized) text.
GNU General Public License v3.0

Language Detection #9

Closed by carschno 1 year ago

carschno commented 1 year ago

Add a language detection mechanism. If a document is not in the expected language (e.g. Dutch), skip the machine-learning-based, language-specific quality classification.
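
A minimal sketch of what such a gate could look like, not the project's actual implementation. It assumes a crude stopword-overlap heuristic in place of a real language detector, and the names `STOPWORDS`, `detect_language`, and `classify_if_expected` are hypothetical, as is the `classifier` callable standing in for the quality classifier.

```python
import re
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

# Tiny illustrative stopword lists (assumption); a real detector would
# rely on a trained language identification model instead.
STOPWORDS = {
    "nl": {"de", "het", "een", "en", "van", "is", "dat", "niet", "op", "te", "met", "voor", "zijn", "ik"},
    "en": {"the", "a", "an", "and", "of", "is", "that", "not", "on", "to", "with", "for", "are", "in"},
    "de": {"der", "die", "das", "und", "ist", "nicht", "ein", "eine", "mit", "für", "von", "zu", "auf", "ich"},
}


def detect_language(text: str) -> Optional[str]:
    """Return the language code whose stopwords match most tokens, or None."""
    # Keep alphabetic tokens only (no digits, underscores, or punctuation).
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    if not tokens:
        return None
    scores = {
        lang: sum(token in words for token in tokens)
        for lang, words in STOPWORDS.items()
    }
    best = max(scores, key=lambda lang: scores[lang])
    # No stopword hit at all means the language is unknown.
    return best if scores[best] > 0 else None


def classify_if_expected(
    text: str,
    classifier: Callable[[str], T],
    expected: str = "nl",
) -> Optional[T]:
    """Run the classifier only if the text is detected as the expected language."""
    if detect_language(text) != expected:
        return None  # skip the language-specific classification
    return classifier(text)
```

With this sketch, `classify_if_expected(text, classifier)` returns `None` for a document that is not detected as Dutch, so the caller can tell a skipped document apart from a classified one.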

kintopp commented 1 year ago

Perhaps of interest: https://www.cenl.org/wp-content/uploads/2023/04/CENL_AI_Yves_Maurer.pdf