raffaeldantas / tesseract-ocr

Automatically exported from code.google.com/p/tesseract-ocr

xheights file #1384

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Add the name of a font that contains both Latin and Devanagari characters 
to both the Latin.xheights and Devanagari.xheights files in langdata.
2. Give that font a different xheight in Latin.xheights than in 
Devanagari.xheights.
3. Run tesstrain.sh to do the training.

What is the expected output? What do you see instead?
The generated LANG.xheights file (for example, san.xheights for Sanskrit) 
contains all entries from Latin.xheights followed by all entries from 
Devanagari.xheights. Fonts that cover both ranges therefore get duplicate 
entries, each with a different xheight.
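
The duplication described above can be checked with a short script. This is only a sketch: it assumes each xheights line is a font name followed by whitespace and an integer xheight, and the font names and values used here are made up for illustration. The helper names `parse_xheights` and `find_conflicts` are hypothetical and are not part of Tesseract or tesstrain.sh.

```python
from collections import defaultdict


def parse_xheights(text):
    """Parse xheights text into (font_name, xheight) pairs.

    Assumed format (not confirmed by the report): one entry per line,
    font name first, integer xheight as the last whitespace-separated token.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Font names may contain spaces, so split only on the last gap.
        name, value = line.rsplit(None, 1)
        entries.append((name, int(value)))
    return entries


def find_conflicts(*texts):
    """Return fonts listed more than once, mapped to all their xheights."""
    seen = defaultdict(list)
    for text in texts:
        for name, value in parse_xheights(text):
            seen[name].append(value)
    return {name: values for name, values in seen.items() if len(values) > 1}


# Illustrative data only; these are not real langdata values.
latin = "FreeSerif 50\nArial 52\n"
devanagari = "FreeSerif 60\nLohit Devanagari 58\n"

# Plain concatenation, as observed in the generated LANG.xheights file.
merged = latin + devanagari

conflicts = find_conflicts(merged)
for font, values in conflicts.items():
    print(f"{font}: listed {len(values)} times with xheights {values}")
```

Running the same check against a real generated file, such as san.xheights, would list every font that appears in both script files.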

What version of the product are you using? On what operating system?
Latest version from git, MSYS2 on Windows 8.

Please provide any additional information below.

Original issue reported on code.google.com by shreeshrii on 18 Nov 2014 at 8:55