Open gasyoun opened 3 years ago
Since 2007 I have been submitting errors to https://www.sanskrit-lexicon.uni-koeln.de/ (the main source of Sanskrit dictionaries on the net). In 2014 I launched https://github.com/sanskrit-lexicon/ to make error submission public. We keep adding new dictionaries. Now I want to add a few Sanskrit-Russian dictionaries, but they use intermixed languages, and Google OCR fails at that even more than ABBYY FineReader 12 (published in 2013; all later versions have even weaker algorithms and dirtier output). In 2013 I wrote up why Hellwig's Devanagari OCR (1.0.0.9 beta) failed for batch recognition of Sanskrit (see http://samskrtam.ru/hellwigs-devanagari-ocr/).

Knauer's whole dictionary can be seen at http://samskrtam.ru/sanskrit-lexicon/knauer/ (the original book was scanned at 600 dpi, and the print is clear). Still, the output is worse than what I got 7-10 years ago with desktop software, where I was able to train and edit patterns myself.

Output:
Лha гуру-тва, ср. (-твам), 1) тяжесть, — вѣсъ, важность, — достоинство, уваженіе внушаемое лѣтами и нравственнымъ достоинствомъ, — достоинство наставника; — 2) тя гости, горе.

Issues:
1) Devanagari lost, replaced by "Лha"
2) italics lost
3) hyphen at end of line lost
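
For anyone triaging a larger batch, the Devanagari loss reported here could probably be flagged automatically. Below is a minimal sketch, assuming Python and only the standard library. The helper names `scripts_in` and `check_entry` are hypothetical, and taking the first word of each Unicode character name as its script is a rough approximation, not a proper script-property lookup. It reports whether a line contains any Devanagari at all and lists tokens that mix Cyrillic and Latin letters, such as "Лha".

```python
import unicodedata


def scripts_in(token):
    """Approximate the scripts used in a token.

    Assumption: the first word of a letter's Unicode name
    (e.g. 'CYRILLIC', 'LATIN', 'DEVANAGARI') stands in for its script.
    """
    found = set()
    for ch in token:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                found.add(name.split()[0])
    return found


def check_entry(line):
    """Flag two symptoms in one OCR line (hypothetical helper).

    - has_devanagari: whether any Devanagari letter survived
    - mixed_script_tokens: tokens mixing Cyrillic and Latin letters
    """
    tokens = line.split()
    has_devanagari = any("DEVANAGARI" in scripts_in(t) for t in tokens)
    mixed = [t for t in tokens if {"CYRILLIC", "LATIN"} <= scripts_in(t)]
    return {"has_devanagari": has_devanagari, "mixed_script_tokens": mixed}


# Start of the OCR output line quoted in this report.
sample = "Лha гуру-тва, ср. (-твам), 1) тяжесть"
report = check_entry(sample)
print(report)
```

This only catches the script problem. Lost italics and lost end-of-line hyphens leave no trace in plain text, so they would still need a comparison against the scan.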