AmitGorvadiya / tesseract-ocr

Automatically exported from code.google.com/p/tesseract-ocr

Unable to locate box file errors - Kannada #216

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Tesseract Open Source OCR Engine
Bad utf-8 char starting with 0xfffffff7 at line 1, col 189, 
Box file format error on line 189 ignored
APPLY_BOXES: boxfile 38/7/??? ((289,711),(309,746)): FAILURE! box overlaps
no blobs or blobs in multiple rows
APPLY_BOXES: boxfile 38/9/??? ((359,711),(392,753)): FAILURE! box overlaps
no blobs or blobs in multiple rows
APPLY_BOXES: boxfile 38/14/??? ((553,711),(573,746)): FAILURE! box overlaps
no blobs or blobs in multiple rows
APPLY_BOXES: boxfile 38/17/??? ((657,711),(677,746)): FAILURE! box overlaps
no blobs or blobs in multiple rows
APPLY_BOXES: boxfile 38/19/??? ((726,711),(746,746)): FAILURE! box overlaps
no blobs or blobs in multiple rows
APPLY_BOXES: Unlabelled word blk:1 row:2 allrows:2
APPLY_BOXES: Unlabelled word blk:1 row:3 allrows:3
APPLY_BOXES: Unlabelled word blk:1 row:4 allrows:4
APPLY_BOXES: Unlabelled word blk:1 row:7 allrows:7
APPLY_BOXES: Unlabelled word blk:1 row:8 allrows:8
APPLY_BOXES: Unlabelled word blk:1 row:9 allrows:9
APPLY_BOXES: Unlabelled word blk:1 row:10 allrows:10
APPLY_BOXES: Unlabelled word blk:1 row:12 allrows:12
APPLY_BOXES: Unlabelled word blk:1 row:14 allrows:14
APPLY_BOXES: Unlabelled word blk:1 row:15 allrows:15
APPLY_BOXES: Unlabelled word blk:1 row:16 allrows:16
APPLY_BOXES: Unlabelled word blk:1 row:17 allrows:17
APPLY_BOXES: Unlabelled word blk:1 row:18 allrows:18
APPLY_BOXES: Unlabelled word blk:1 row:19 allrows:19
APPLY_BOXES: Unlabelled word blk:1 row:20 allrows:20
APPLY_BOXES: Unlabelled word blk:1 row:21 allrows:21
APPLY_BOXES: Unlabelled word blk:1 row:22 allrows:22
APPLY_BOXES: Unlabelled word blk:1 row:23 allrows:23
APPLY_BOXES: Unlabelled word blk:1 row:24 allrows:24
APPLY_BOXES: Unlabelled word blk:1 row:25 allrows:25
APPLY_BOXES: Unlabelled word blk:1 row:29 allrows:29
APPLY_BOXES: Unlabelled word blk:1 row:30 allrows:30
APPLY_BOXES: Unlabelled word blk:1 row:31 allrows:31
APPLY_BOXES: Unlabelled word blk:1 row:33 allrows:33
APPLY_BOXES: Unlabelled word blk:1 row:35 allrows:35
APPLY_BOXES: Unlabelled word blk:1 row:36 allrows:36
APPLY_BOXES: Unlabelled word blk:1 row:37 allrows:37
APPLY_BOXES: Unlabelled word blk:1 row:38 allrows:38
APPLY_BOXES: Unlabelled word blk:1 row:39 allrows:39
APPLY_BOXES: Unlabelled word blk:1 row:41 allrows:41
APPLY_BOXES: Unlabelled word blk:1 row:42 allrows:42
APPLY_BOXES: REBALANCE REQD "???" - target of 89 from 84 labelled samples
APPLY_BOXES:
   Boxes read from boxfile:     744
   Initially labelled blobs:    739 in 44 rows
   Box failures detected:            5
   Duped blobs for rebalance:     5
   "0" has fewest samples:     1
                Total unlabelled words:       31
                Final labelled words:        744
Generating training data
TRAINING ... Font name = UnknownFont.
Generated training data for 744 blobs


What is the expected output? What do you see instead?
A .tr file generated without errors.

What version of the product are you using? On what operating system?
Tesseract 2.04 on Windows XP with SP3

Please provide any additional information below.
Unable to locate the failures in the box/txt file, even using "Find". How can this be rectified?
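
The "Bad utf-8 char starting with 0xfffffff7" message presumably refers to a raw byte 0xF7 (printed sign-extended), which an editor's "Find" may not be able to match. Below is a minimal Python sketch that scans a box file for bytes that do not decode as UTF-8 and reports where they are. It assumes the box file is meant to be UTF-8; `find_bad_utf8` is a hypothetical helper name, not part of Tesseract.

```python
def find_bad_utf8(data: bytes):
    """Return (line_number, byte_column, offending_byte) for every line
    of `data` that fails to decode as UTF-8. Numbers are 1-based and the
    column counts bytes, not characters."""
    problems = []
    for lineno, raw in enumerate(data.split(b"\n"), start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError as err:
            # err.start is the 0-based byte offset of the first bad byte
            problems.append((lineno, err.start + 1, raw[err.start]))
    return problems

# Usage (the file name is an assumption, substitute your own box file):
#   with open("kan.box", "rb") as f:
#       for lineno, col, byte in find_bad_utf8(f.read()):
#           print(f"line {lineno}, byte column {col}: 0x{byte:02x}")
```

Only the first bad byte of each line is reported, so a line may need a second pass after it has been corrected.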

Original issue reported on code.google.com by withbles...@gmail.com on 6 Jul 2009 at 8:01


GoogleCodeExporter commented 9 years ago
In the present case the failures are difficult to locate even with the help of the "Find" tool, whereas in other cases the failures indicated in the log file are easily located with the "Find" tool and then deleted from the relevant box file. That is why this peculiar problem was posted above.
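
When the label is unreadable ("???"), the coordinates in the log can be used instead of the character to find the offending box file lines. A rough Python sketch follows, assuming that the four numbers in each "APPLY_BOXES: ... FAILURE!" message are (left,bottom),(right,top) and that each box file line has the form "symbol left bottom right top". Both are assumptions about the 2.04 output and should be checked against your own files. The function names are hypothetical.

```python
import re

# Matches e.g. "APPLY_BOXES: boxfile 38/7/??? ((289,711),(309,746)): FAILURE!"
FAILURE_RE = re.compile(
    r"APPLY_BOXES: boxfile \d+/\d+/\S+ "
    r"\(\((\d+),(\d+)\),\((\d+),(\d+)\)\): FAILURE!"
)

def failed_coords(log_text):
    """Collect the coordinate 4-tuples of all failed boxes in the log."""
    return {tuple(map(int, m.groups())) for m in FAILURE_RE.finditer(log_text)}

def locate_failures(log_text, box_bytes):
    """Return (line_number, coords) for box file lines whose coordinates
    appear in a FAILURE message. The box file is read as bytes so that a
    badly encoded symbol does not stop the search."""
    wanted = failed_coords(log_text)
    hits = []
    for lineno, raw in enumerate(box_bytes.splitlines(), start=1):
        fields = raw.split()
        if len(fields) < 5:
            continue  # not a "symbol l b r t" line
        try:
            coords = tuple(int(f) for f in fields[1:5])
        except ValueError:
            continue  # coordinates are not plain integers
        if coords in wanted:
            hits.append((lineno, coords))
    return hits
```

The reported line numbers can then be opened directly in an editor ("go to line") rather than searched for by text.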

Original comment by withbles...@gmail.com on 6 Jul 2009 at 8:22

GoogleCodeExporter commented 9 years ago
Closed; perhaps due to a UTF-8 encoding problem.

Original comment by withbles...@gmail.com on 26 Apr 2010 at 11:11

GoogleCodeExporter commented 9 years ago
Closed; perhaps due to a UTF-8 encoding problem, because of the "???".

Original comment by withbles...@gmail.com on 26 Apr 2010 at 11:12

GoogleCodeExporter commented 9 years ago
Will be fixed in 3.01.

Original comment by theraysm...@gmail.com on 19 May 2010 at 11:08

GoogleCodeExporter commented 9 years ago
Issue 217 has been merged into this issue.

Original comment by theraysm...@gmail.com on 19 May 2010 at 11:09

GoogleCodeExporter commented 9 years ago
Issue 276 has been merged into this issue.

Original comment by theraysm...@gmail.com on 20 May 2010 at 3:57

GoogleCodeExporter commented 9 years ago
Issue 315 has been merged into this issue.

Original comment by joregan on 29 May 2010 at 2:23

GoogleCodeExporter commented 9 years ago
This is an old issue pertaining to 2.04, which I no longer use now that the latest version, 3.02, has been released. The issue may be closed, since keeping it open serves no purpose.

Original comment by withbles...@gmail.com on 19 Feb 2012 at 7:48

GoogleCodeExporter commented 9 years ago

Original comment by zde...@gmail.com on 19 Feb 2012 at 12:21