dlareklami / tesseract-ocr

Automatically exported from code.google.com/p/tesseract-ocr

Enhancement - Support for Sanskrit Language - Devanagari Script #892

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
I would like to request an enhancement to Tesseract-OCR: support for the 
Sanskrit language, which is written (mostly) in the Devanagari script (also 
used by Hindi and Marathi).

Since Cube training information is not available, I have generated a 
proof-of-concept san.traineddata from samples in the sanskrit2003 font.
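
For anyone who wants to try such a file, here is a minimal sketch of how it might be invoked. It assumes san.traineddata has already been copied into Tesseract's tessdata directory and that the `tesseract` executable is on the PATH. The helper names and the file name page.png are hypothetical and used only for illustration; the only Tesseract feature relied on is the standard `tesseract imagename outputbase -l lang` invocation.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, output_base, lang="san"):
    """Return the argv list for: tesseract <image> <outputbase> -l <lang>."""
    return ["tesseract", image_path, output_base, "-l", lang]


def run_ocr(image_path, output_base, lang="san"):
    """Run Tesseract if it is installed and return the text file it writes."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    subprocess.run(build_tesseract_command(image_path, output_base, lang), check=True)
    # By default Tesseract writes its plain-text result to <outputbase>.txt.
    return output_base + ".txt"


if __name__ == "__main__":
    # page.png is a hypothetical scanned page of Sanskrit text.
    print(" ".join(build_tesseract_command("page.png", "page_out")))
```

Passing a different language code (for example `lang="hin"`) selects another traineddata file in the same way.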

Original issue reported on code.google.com by shreeshrii on 13 Apr 2013 at 5:04

GoogleCodeExporter commented 9 years ago
I am not able to attach files - Issue attachment storage quota exceeded.

Please look for the link to sanskrit2003.zip at 
https://sourceforge.net/p/vietocr/feature-requests/6/?page=5

Special thanks to Quan Nguyen for modifying JTessBoxEditor to make it easier 
to generate training data for Hindi/Sanskrit.

Original comment by shreeshrii on 13 Apr 2013 at 6:10