A sample replace.py that supports UTF-8 is also attached.
Original comment by withbles...@gmail.com
on 3 Oct 2010 at 3:09
Hello Zdenko,
What does it take to add a new language script to the pytesseract trainer?
I don't see it as a UTF-8 issue, as UTF-8 is fully supported in both places
(the trainer as well as tesseract-ocr). Why are the Kannada fonts not being
displayed properly? If you could provide some pointers, I would be able to fix it.
Thanks,
Senthil
Original comment by orsenthil@gmail.com
on 5 Oct 2010 at 2:51
I am not sure where exactly the problem is (GTK? Python? Windows?
pytesseracttrainer?). According to one post on the tesseract forum
(http://groups.google.com/group/tesseract-ocr/msg/2846c4309d864c68?hl=en)
pytesseracttrainer works with Japanese, so I expect that the problem is not in
pytesseracttrainer ;-)
But I have no way to test Kannada or Windows IME... So if somebody can
test it (identify the problem) and improve the code, I would be glad.
My intention is to create a font selector + "configuration system"; maybe it
will help to some extent. For now you can change the font manually in the script
(line 53): BASE_FONT = 'Serif'. As far as I tested, choosing anything other than
'Sans', 'Serif' or 'monospace' (e.g. 'Arial') caused an error message.
Original comment by zde...@gmail.com
on 5 Oct 2010 at 6:42
Regarding Windows IME: kindly point me to the website from which I can download
Windows IME.
Regarding BASE_FONT: I tested with the Kannada fonts Kedage, Mallige and Tunga.
The script is displayed clearly, but I am unable to type. I also tested with the
BRH Kannada font: it does not display Kannada script, and I cannot type when
Kannada is selected, whereas when English is selected I can type easily using
BarahaIME. Does this show that BarahaIME does not fully support the py program?
Interestingly, I tested on Ubuntu 10.04, where it works fine with the itrans
keyboard. I cannot understand why WinXP gives trouble.
-sriranga(78yrsold)
Original comment by withbles...@gmail.com
on 5 Oct 2010 at 7:38
I even tested with the Google transliteration IME
(http://www.google.com/transliterate/);
it does not work with the py program on WinXP. This is brought to your kind notice.
Original comment by withbles...@gmail.com
on 30 Oct 2010 at 12:42
You should use an appropriate font: change line 53 (BASE_FONT = 'Serif') in
pyTesseractTrainer-1.02.py to a font that supports your script.
E.g. when I changed it to BASE_FONT = 'unifont 12' I got the attached result,
which looks reasonable to me (of course unifont must be installed on your system).
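To illustrate the idea of matching BASE_FONT to the script being edited, here is a minimal sketch. It is not part of pyTesseractTrainer and assumes Python 3. The names CANDIDATE_FONTS, dominant_script and pick_base_font are hypothetical, and the font names are only the ones mentioned in this thread. It guesses the script from Unicode character names and does not check whether a font is actually installed.

```python
import unicodedata

# Hypothetical mapping from script name to font choices (an assumption for
# illustration); the font names are the ones mentioned in this thread.
CANDIDATE_FONTS = {
    "KANNADA": ["Kedage", "Mallige", "Tunga", "unifont 12"],
}
DEFAULT_FONT = "Serif"  # the value the thread reports on line 53


def dominant_script(text):
    """Return the most frequent script among the letters of text, or None."""
    counts = {}
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, punctuation and combining marks
        name = unicodedata.name(ch, "")
        # Unicode names start with the script, e.g. "KANNADA LETTER KA".
        script = name.split(" ", 1)[0] if name else "UNKNOWN"
        counts[script] = counts.get(script, 0) + 1
    if not counts:
        return None
    return max(counts, key=counts.get)


def pick_base_font(sample_text):
    """Pick the first candidate font for the text's script, else the default."""
    fonts = CANDIDATE_FONTS.get(dominant_script(sample_text))
    return fonts[0] if fonts else DEFAULT_FONT
```

The returned string could then be assigned to BASE_FONT by hand. Whether the font renders correctly still depends on it being installed and visible to GTK.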
Original comment by zde...@gmail.com
on 4 Jan 2011 at 10:25
Original comment by zde...@gmail.com
on 4 Jan 2011 at 10:35
ZDE@
Extremely thankful to you for the solution. I changed 'Serif' to the cheluvi font,
as you suggested. It displayed the Kannada fonts, but I am unable
to type in the box. cheluvi-n-ttf is also attached for research purposes and for
the benefit of other users.
Original comment by withbles...@gmail.com
on 4 Jan 2011 at 12:15
Issue 8 has been merged into this issue.
Original comment by zde...@gmail.com
on 5 Jan 2011 at 7:51
Can it be used on Ubuntu 11.01?
Original comment by mamata2...@gmail.com
on 24 Apr 2013 at 5:23
Original issue reported on code.google.com by
withbles...@gmail.com
on 7 Sep 2010 at 3:32