patcharats / tesseract-ocr

Automatically exported from code.google.com/p/tesseract-ocr

Error: 31 classes in inttemp while unicharset contains 32 unichars. #55

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Using phototest.tif installed by tess2.0, regenerated the 8 data files,
prefixed "eng.xxx".
2. Replaced the original 8 data files installed by tess2.0 with the
regenerated ones.
3. Executed "tesseract phototest.tif output -l eng".
4. Instead of output.txt, tesseract.log was generated with "Error: 31 classes in
inttemp while unicharset contains 32 unichars."
5. output.txt was not generated.

What is the expected output? What do you see instead?
The output with the regenerated 8 data files (eng.xxx) should be identical to the
output with the original 8 data files. Instead, the log error above was generated. Why?

What version of the product are you using? On what operating system?
Tesseract 2.0, MS Windows

Please provide any additional information below.
It appears there is some problem with the source code, which I have
re-investigated.

Original issue reported on code.google.com by withbles...@gmail.com on 8 Aug 2007 at 4:46

GoogleCodeExporter commented 9 years ago
Forgot to attach the files; now attached.

Original comment by withbles...@gmail.com on 8 Aug 2007 at 5:21

Attachments:

GoogleCodeExporter commented 9 years ago

Original comment by withbles...@gmail.com on 8 Aug 2007 at 5:26

Attachments:

GoogleCodeExporter commented 9 years ago
A solution for how to rectify this is requested.

Original comment by withbles...@gmail.com on 10 Aug 2007 at 8:33

GoogleCodeExporter commented 9 years ago
If you want further help with this, please post the box file corresponding to your
tiff file, as that is where the problem lies.

Original comment by theraysm...@gmail.com on 17 Aug 2007 at 4:16
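Since the box file is named as the likely source of the mismatch, a quick way to inspect it is to count the distinct symbols it contains and compare that with the count stored in the unicharset file. The sketch below is a rough diagnostic, not Tesseract's own check. It assumes the box file has one symbol per line followed by four coordinates, and that the first line of the unicharset file holds the number of unichars; the function names `box_symbols` and `unicharset_count` are hypothetical. It does not parse inttemp, and the two numbers need not match exactly, so treat the output only as a hint about where to look.

```python
import sys
from pathlib import Path


def box_symbols(box_text):
    """Return the set of distinct symbols found in box-file text.

    Assumption: each valid line is "<symbol> <left> <bottom> <right> <top>".
    Lines with fewer than five fields are skipped.
    """
    symbols = set()
    for line in box_text.splitlines():
        fields = line.split()
        if len(fields) >= 5:
            symbols.add(fields[0])
    return symbols


def unicharset_count(unicharset_text):
    """Return the count on the first line of unicharset text.

    Assumption: the first line of the file is the number of unichars.
    """
    first_line = unicharset_text.splitlines()[0]
    return int(first_line.strip())


if __name__ == "__main__":
    # Usage (paths are examples): python check_box.py phototest.box new.unicharset
    box_path, unicharset_path = sys.argv[1], sys.argv[2]
    symbols = box_symbols(Path(box_path).read_text(encoding="utf-8"))
    declared = unicharset_count(Path(unicharset_path).read_text(encoding="utf-8"))
    print(f"distinct symbols in box file: {len(symbols)}")
    print(f"unichars declared in unicharset: {declared}")
```

If the two counts differ by more than expected, listing `sorted(symbols)` may show a mislabelled or duplicated character in the box file.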

GoogleCodeExporter commented 9 years ago
Yes, I want further help. Attached phototest.box.

Original comment by withbles...@gmail.com on 22 Aug 2007 at 6:03

Attachments:

GoogleCodeExporter commented 9 years ago
The above issue relates to the phototest.tif file.

Today I checked with phototest.bmp (made by copying phototest.tif into Paintbrush and
saving it as "24bit bmp"). As a test, I ran "tesseract phototest.bmp output" (using the
default tess eng 8 data files) and the output was correct, as normal.

After training and generating the 8 data files as usual, I prefixed all the generated
data files as "new.xxx" and then ran "tesseract phototest.bmp output -l new". Instead
of output text, a log error was generated: "Error: 31 classes in inttemp while
unicharset contains 32 unichars.", which is the same as in the post above. Relevant
files are also attached.

Original comment by withbles...@gmail.com on 29 Aug 2007 at 9:49
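When renaming the generated files with a new prefix such as "new", it can help to confirm that all eight are actually present before running "tesseract ... -l new". The following is a minimal sketch under assumptions: the list of file suffixes is the set commonly documented for Tesseract 2.0x language data and is not taken from this thread, and the helper name `missing_data_files` and the example tessdata path are hypothetical.

```python
from pathlib import Path

# Assumed suffixes of the eight Tesseract 2.0x language data files.
DATA_SUFFIXES = [
    "inttemp",
    "normproto",
    "pffmtable",
    "unicharset",
    "freq-dawg",
    "word-dawg",
    "user-words",
    "DangAmbigs",
]


def missing_data_files(tessdata_dir, lang):
    """Return the names of expected "<lang>.<suffix>" files not found in tessdata_dir."""
    directory = Path(tessdata_dir)
    return [
        f"{lang}.{suffix}"
        for suffix in DATA_SUFFIXES
        if not (directory / f"{lang}.{suffix}").is_file()
    ]


if __name__ == "__main__":
    # Example path only; adjust to the local installation.
    for name in missing_data_files("tessdata", "new"):
        print(f"missing: {name}")
```

An empty result only shows that the files exist; it says nothing about whether inttemp and unicharset agree with each other.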

Attachments:

GoogleCodeExporter commented 9 years ago
Several problems related to this were fixed in 2.01.

Original comment by theraysm...@gmail.com on 30 Aug 2007 at 7:57