Closed GoogleCodeExporter closed 9 years ago
[deleted comment]
Please forward the tif file along with <mylang>.traineddata for testing on WinXP.
Is tesseract.exe a debug or a release build?
-Withblessings@gmail.com
Original comment by withbles...@gmail.com
on 16 Jun 2010 at 7:35
Original comment by rasm...@gmail.com
on 17 Jun 2010 at 7:06
Attachments:
[deleted comment]
Sorry for the late reply. tesseract.exe is a debug build. I had excluded the
unicharambigs and mal.config files from the traineddata.
Original comment by rasm...@gmail.com
on 17 Jun 2010 at 7:15
Downloaded your tif and traineddata (all Malayalam lang.). Tested using the
release version on WinXP.
I observed that your tif is 72 dpi. I increased it to 300 dpi using
IrfanView.
Ran it as "tesseract mal.meera.01.tif test -l mal". The output "test.txt" appears
to be in order. I don't know Malayalam, only Kannada. test.txt is
uploaded.
It appears tesseract (debug version) has some problem when I run it following
your method: Windows displayed an "exe encountered a problem" message (no error
was displayed in the CMD window).
Please test it with the release version and also the debug version. If the
release version is OK, then the debug version has the problem.
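Since the crash shows up only as a Windows dialog and nothing is printed in CMD, one way to compare the debug and release builds is to run the same command from a script and look at the process return code. The following is a minimal sketch, not part of the original report. It assumes Python 3 is available and that tesseract is on PATH; the helper names build_ocr_command and run_ocr are made up for illustration.

```python
import shutil
import subprocess


def build_ocr_command(image, output_base, lang):
    """Return the argument list for: tesseract <image> <output_base> -l <lang>."""
    return ["tesseract", image, output_base, "-l", lang]


def run_ocr(image, output_base, lang):
    """Run tesseract and return its exit code, or None if it is not on PATH."""
    if shutil.which("tesseract") is None:
        return None
    completed = subprocess.run(build_ocr_command(image, output_base, lang))
    return completed.returncode
```

Calling run_ocr("mal.meera.01.tif", "test", "mal") with each build in turn should give 0 for a clean run. A non-zero value would suggest that the build exited abnormally even though no error text was shown.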
Original comment by withbles...@gmail.com
on 17 Jun 2010 at 8:27
Attachments:
The assertion that's failing here basically checks that the unicharset has been
loaded - that's not happening here. I'll look into it, but I suspect that this
is because of some change in the unicharset format.
Original comment by joregan
on 17 Jun 2010 at 11:31
Can you test it on a recent svn revision? It works for me on r521 with the files
you provided.
Original comment by zde...@gmail.com
on 17 Nov 2010 at 8:27
@zde,
Downloaded svn r525 on Ubuntu and then transferred it to a WinXP folder, since I
do not know how to check out svn on WinXP.
1) As requested, I checked with the debug version of tesseract.exe: tested with
phototest.tif, mal.meera.01.tif and also kan1.tif, and all outputs were found
to be in order, clear and OK. I faced no problem.
2) Also tested with the release version of tesseract.exe: tested with
phototest.tif and kan1.tif, and all output files were clear and OK, no problem.
But for mal.tif it failed to generate output, with a Windows "encountered a
problem" message.
3) mal.meera.01.tif was tested with the release version but failed with the same
Windows message; see the attached screenshot.
Since mal.tif was 72 dpi, I increased it to 200 and 300 dpi using IrfanView and
saved it as an uncompressed tif file, but it still produces the Windows message.
I cannot understand why this happens for mal.tif only, whereas tif files of other
languages, viz. phototest.tif and kan.tif, work fine without any error message.
With regards,
-sriranga(78yrsold)
Original comment by withbles...@gmail.com
on 18 Nov 2010 at 2:11
Attachments:
Can you please try re-training with the 3.01 release and then also with the
current svn revision (there is (at least) one more step: shapeclustering; see the example at
http://code.google.com/p/tesseract-ocr/issues/detail?id=430#c7)?
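For orientation, the position of the extra shapeclustering step in the training sequence can be sketched as below. This is a hedged outline based on the commonly documented 3.02 training steps, not a transcript of what was actually run in this issue. The base name mal.meera.01 is taken from the tif in this thread, the function name training_commands is hypothetical, and the exact arguments should be checked against the training instructions for the revision in use. The sketch only builds the command lists and executes nothing.

```python
def training_commands(lang, base):
    """List the training commands, in order, as argument lists (nothing is run)."""
    tr_file = base + ".tr"
    return [
        # Produce <base>.tr from the tif/box pair.
        ["tesseract", base + ".tif", base, "nobatch", "box.train"],
        # Build the character set from the box file.
        ["unicharset_extractor", base + ".box"],
        # The additional step in newer revisions: runs before mftraining.
        ["shapeclustering", "-F", "font_properties", "-U", "unicharset", tr_file],
        ["mftraining", "-F", "font_properties", "-U", "unicharset",
         "-O", lang + ".unicharset", tr_file],
        ["cntraining", tr_file],
        # The generated files (shapetable, inttemp, pffmtable, normproto) need
        # the "<lang>." prefix before this last step.
        ["combine_tessdata", lang + "."],
    ]


for command in training_commands("mal", "mal.meera.01"):
    print(" ".join(command))
```

With the 3.01 release the shapeclustering line would simply be left out.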
Original comment by zde...@gmail.com
on 22 Feb 2012 at 9:14
As you requested, I checked with version 3.01; the attached files are
self-explanatory.
Feedback regarding 3.02 (r-679) will follow in the next email.
The font_properties file gives a lot of trouble, hence the delay in getting back
to you.
The box/tif file was generated in the jboxeditor tool based on the test.txt file
attached under issue 321.
Original comment by withbles...@gmail.com
on 25 Feb 2012 at 2:50
Attachments:
Zdenko,
As you requested, I also checked under the new version 3.02 (up to r-679); the
attached files are self-explanatory. Is any more information required?
For your information, I don't know Malayalam script. Comparing the image
file with the output file, the output appears to be correct except for one
image [26/മ] (see the attached tesseract.log file; a similar log was generated
for version 3.01 also).
Even editing the box file in owler had no effect. Kindly guide me on how to edit
it in future.
With warmest Regards,
-sriranga(79yrs)
Original comment by withbles...@gmail.com
on 25 Feb 2012 at 2:56
Attachments:
@withblessings: the report is about an error at dawg.cpp. As far as I can see,
you did not bother with the dawg... So this is not about box editing, it is about
dictionary creation. I would prefer that somebody who knows Malayalam script test it.
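As a rough illustration of the dictionary-creation step referred to here, the dawg files are normally built from plain word lists with wordlist2dawg against the language's unicharset. The sketch below only assembles the two commands. The word-list file names are placeholders, the function name dawg_commands is hypothetical, and the argument order is an assumption to verify against the training instructions for the version in use.

```python
def dawg_commands(lang, words_file, frequent_words_file):
    """Return the wordlist2dawg command lines as argument lists (nothing is run)."""
    unicharset = lang + ".unicharset"
    return [
        # Frequent words go into <lang>.freq-dawg.
        ["wordlist2dawg", frequent_words_file, lang + ".freq-dawg", unicharset],
        # The full word list goes into <lang>.word-dawg.
        ["wordlist2dawg", words_file, lang + ".word-dawg", unicharset],
    ]


for command in dawg_commands("mal", "words_list", "frequent_words_list"):
    print(" ".join(command))
```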
Original comment by zde...@gmail.com
on 25 Feb 2012 at 7:23
You might check this; for me, in Bengali, it worked.
http://www.sk-spell.sk.cx/tesseract-ocr-en-can-i-use-my-data-for-204
Original comment by sagnik1...@gmail.com
on 23 May 2012 at 2:08
Closing based on the data in Comment 12.
Original comment by zde...@gmail.com
on 24 Jul 2012 at 6:13
Original issue reported on code.google.com by
rasm...@gmail.com
on 16 Jun 2010 at 6:00
Attachments: