paalberti / tesseract-dan-fraktur

Tesseract OCR training data for Danish written in Fraktur script and a few other languages

Sometimes crashes on Windows #1

Open janbrus opened 10 years ago

janbrus commented 10 years ago

Hello

I have used your dan-frak.traineddata on a Windows 7 machine, and the results have been much better than with the original dan-frak.traineddata. So this is very useful for Statistics Norway's historical statistics project. http://www.ssb.no/a/histstat/publikasjoner/

But sometimes (for about 3% of the TIFF files), it crashes Tesseract on my machine.

If you are interested, you can try these pages: http://www.ssb.no/a/histstat/div/fraktur/side_004.tif http://www.ssb.no/a/histstat/div/fraktur/amt_side_91.tif

I have not tried this on a Linux machine.

Kind regards, Jan Bruusgaard

paalberti commented 10 years ago

First of all, I am sorry for being so terribly slow to reply. And I appreciate hearing that you have found a use for my work.

I have tried to reproduce your crashes, but both files run fine for me (on a Linux box; I don't currently have easy access to a Windows machine). Which version of Tesseract are you using? The developers have fixed a number of bugs since version 3.02, so if you aren't already running the latest version from svn, I suggest you try that. That might be a bit tricky, as you will probably have to compile the source code yourself.
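If it helps to narrow things down, here is a rough Python sketch for collecting the pages that fail, so that they can be listed in a report together with the Tesseract version. It is only an illustration under some assumptions: `tesseract` is on the PATH, the language is called `dan-frak` (matching the traineddata file name), a crash shows up as a non-zero exit code, and the helper names and the `ocr_out` directory are made up for this example.

```python
import subprocess
from pathlib import Path


def build_command(image, out_dir, lang="dan-frak", exe="tesseract"):
    """Return the tesseract command line for one page image."""
    image = Path(image)
    out_base = Path(out_dir) / image.stem  # tesseract adds the .txt suffix itself
    return [exe, str(image), str(out_base), "-l", lang]


def run_page(image, out_dir, lang="dan-frak", exe="tesseract"):
    """Run tesseract on one image and return its exit code."""
    result = subprocess.run(
        build_command(image, out_dir, lang, exe),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return result.returncode


def find_failures(images, run=run_page, out_dir="ocr_out"):
    """Return (image, exit_code) for every image whose run did not exit with 0.

    Assumption: a crash is visible as a non-zero exit code.
    """
    failures = []
    for image in images:
        code = run(image, out_dir)
        if code != 0:
            failures.append((str(image), code))
    return failures


def main(scan_dir):
    """Print the tesseract version, then every .tif in scan_dir that fails."""
    Path("ocr_out").mkdir(exist_ok=True)
    subprocess.run(["tesseract", "-v"])  # version info to quote in a bug report
    for image, code in find_failures(sorted(Path(scan_dir).glob("*.tif"))):
        print(f"{image}\texit code {code}")
```

Calling `main("scans")` on a directory of page images (the directory name is hypothetical) would print the version followed by one line per failing file. The `run` parameter exists only so that the loop can be tried out without Tesseract installed.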

I wish I had a more useful answer for you, but tracking down crashes is a bit over my head.