akorentlab / tesseract-ocr

Automatically exported from code.google.com/p/tesseract-ocr

shapeclustering error on both Windows XP and Ubuntu 12.10 #795

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
When I run the "shapeclustering" step, it loads the tr files (I am training
Chinese with 8 tr files, one per font, about 400 MB in total) and then starts
printing the "0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16..." progress counter.
On Windows XP, an error message appears when the counter reaches 1356. On
Ubuntu 12.10, the error message appears when the counter reaches 3560.

What is the expected output? What do you see instead?
It should generate the "shapetable" file, but after "shapeclustering" has
run for 1 or 2 hours, an error is shown in the CMD window:

On Windows XP:

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

On Ubuntu 12.10:

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc

What version of the product are you using? On what operating system?
I use Tesseract 3.02.02 on both Windows XP and Ubuntu 12.10.

Please provide any additional information below.
If I run "shapeclustering" with only 2 tr files, it works. I have tried
increasing the virtual memory on both Windows and Ubuntu, but the error still
occurs. Does this mean my tr files are too large for shapeclustering?

Original issue reported on code.google.com by chengyu0...@gmail.com on 16 Nov 2012 at 6:08
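
Since the reports in this thread differ mainly in how many tr files are passed in and how large they are, a quick way to compare setups is to total the file sizes before starting a long run. The following is only a minimal sketch: the function names `tr_file_sizes` and `total_megabytes` are hypothetical helpers, it assumes all tr files sit in one directory, and it does not call Tesseract at all.

```python
import os
import sys


def tr_file_sizes(directory):
    """Return {filename: size_in_bytes} for every .tr file in `directory`."""
    sizes = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Only regular files ending in .tr are counted; everything else is skipped.
        if name.endswith(".tr") and os.path.isfile(path):
            sizes[name] = os.path.getsize(path)
    return sizes


def total_megabytes(sizes):
    """Sum the sizes from tr_file_sizes() and convert bytes to megabytes."""
    return sum(sizes.values()) / (1024 * 1024)


if __name__ == "__main__":
    # Hypothetical usage: pass the training directory as the first argument.
    directory = sys.argv[1] if len(sys.argv) > 1 else "."
    sizes = tr_file_sizes(directory)
    for name, size in sizes.items():
        print(f"{name}: {size / (1024 * 1024):.1f} MB")
    print(f"total: {total_megabytes(sizes):.1f} MB in {len(sizes)} file(s)")
```

The on-disk total is only a rough indicator. How much memory shapeclustering needs for a given input is not stated anywhere in this thread, so the numbers are useful for comparing runs, not for predicting whether a run will succeed.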

GoogleCodeExporter commented 9 years ago
I have the same scenario, with these differences:
Win7 64-bit: I have 4 tr files with a total size of more than 1 GB, and the error appears when the counter reaches 394.
I am trying to train Tesseract on some Farsi (Arabic) fonts.

shapeclustering had been running for about 14 hours before this crash.

Original comment by abidiash...@gmail.com on 21 Dec 2012 at 8:03

GoogleCodeExporter commented 9 years ago
Same error, training with 4 Japanese files, 270 MB in total. No errors during
previous stages. The error appears after 40 minutes at index 2654.
Running 32-bit Windows 7 with 4 GB RAM. I noticed that memory was running
low before the error appeared.

Original comment by niko...@bai.co.jp on 25 Apr 2013 at 4:31