klukonin / lector

Automatically exported from code.google.com/p/lector
GNU General Public License v2.0

Automatically setting the location of the languages #16

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Features:
1. Use the default language if it exists.
2. Offer to download it if it does not exist.
3. There must be a menu to choose whether to download.

Original issue reported on code.google.com by chopinX04@gmail.com on 21 Nov 2008 at 12:55

GoogleCodeExporter commented 8 years ago
Moreover, the available tesseract languages should be autodetected. On startup,
Lector will check for the required files and show all installed languages in the
left panel switch.

These files are stored in the /usr/share/tesseract/tessdata/ directory, eight per
language (???.DangAmbigs, ???.inttemp, ???.pffmtable, ???.user-words,
???.freq-dawg, ???.normproto, ???.unicharset, ???.word-dawg), where ??? is the
language code from [1]. Also, there were requests for detection of digits 0-9 only.
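
The startup check described above could look roughly like the sketch below. It is only an illustration, not Lector's actual code: the function name `installed_languages` and the constant `REQUIRED_SUFFIXES` are made up here, and it assumes a language counts as installed only when all eight files listed above are present.

```python
import os

# Suffixes of the eight per-language files named in this comment.
REQUIRED_SUFFIXES = (
    "DangAmbigs", "inttemp", "pffmtable", "user-words",
    "freq-dawg", "normproto", "unicharset", "word-dawg",
)


def installed_languages(tessdata_dir="/usr/share/tesseract/tessdata/"):
    """Return the sorted language codes that have all required files.

    Hypothetical helper: a code is reported only if every one of the
    eight <code>.<suffix> files exists in tessdata_dir.
    """
    try:
        names = set(os.listdir(tessdata_dir))
    except OSError:
        # Directory missing or unreadable: treat as "no languages installed".
        return []
    # Candidate codes are whatever precedes the first dot in each file name.
    codes = {name.split(".", 1)[0] for name in names if "." in name}
    return sorted(
        code for code in codes
        if all(f"{code}.{suffix}" in names for suffix in REQUIRED_SUFFIXES)
    )
```

The returned list could then be used to fill the language switch in the left panel; an empty list would be the case where a download is proposed.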

I include a file extracted from [1], containing languages in the format
  cze     Czech      Čeština
  deu     German     Deutsch
and two additional files containing the code along with only the original or
the English name.
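
A file in the three-column format shown above could be read with something like the following sketch. The names `Language` and `parse_language_table` are hypothetical, and the code assumes the columns are separated by a tab or by at least two spaces, so that names containing a single space stay in one field; the attached file may be laid out differently.

```python
import re
from typing import Dict, NamedTuple


class Language(NamedTuple):
    code: str      # ISO 639-2 code, e.g. "cze"
    english: str   # English name, e.g. "Czech"
    native: str    # Original name, e.g. "Čeština"


def parse_language_table(text: str) -> Dict[str, Language]:
    """Map each language code to its names (hypothetical helper).

    Assumes one language per line with columns separated by a tab
    or by two or more whitespace characters.
    """
    table = {}
    for line in text.splitlines():
        fields = re.split(r"\s{2,}|\t", line.strip())
        if len(fields) != 3:
            # Skip blank or malformed lines instead of guessing.
            continue
        code, english, native = fields
        table[code] = Language(code, english, native)
    return table
```

Combined with the autodetected codes, such a table would let the language menu show a readable name instead of the bare three-letter code.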
____
[1]: http://en.wikipedia.org/wiki/List_of_ISO_639-2_codes

Original comment by filip.do...@gmail.com on 10 Feb 2009 at 9:25

Attachments: