kcobra / tesseract-ocr

Automatically exported from code.google.com/p/tesseract-ocr

New grc training, including build files #1145

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
I've spent a while now refining various things with the Ancient Greek training. 
The result is significantly better, not least due to the text2image tool, 
specifically its --exposure setting. The attached result is compressed with xz, 
which reduced it from 8.4MiB to 2.2MiB (xz is amazing).
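
For illustration only, here is a minimal sketch of how text2image might be driven at several exposure levels. It only builds the command lines and does not run them. The training text name, the font name, the fonts directory, the output naming and the exposure values are all assumptions chosen for the example, not the settings used for this training.

```python
import shlex


def text2image_commands(text_file, font, exposures, fonts_dir="/usr/share/fonts"):
    """Build one text2image argument list per exposure level.

    Nothing is executed here; the caller decides whether to run the commands.
    All file, font and directory names passed in are hypothetical examples.
    """
    commands = []
    for exposure in exposures:
        # Assumed naming scheme: <lang>.<font>.exp<N>
        outputbase = f"grc.{font.replace(' ', '_')}.exp{exposure}"
        commands.append([
            "text2image",
            f"--text={text_file}",
            f"--outputbase={outputbase}",
            f"--font={font}",
            f"--fonts_dir={fonts_dir}",
            f"--exposure={exposure}",
        ])
    return commands


if __name__ == "__main__":
    # Hypothetical inputs: render the same text lighter, normal and darker.
    for argv in text2image_commands("grc.training_text", "GFS Didot", [-1, 0, 1]):
        print(shlex.join(argv))
```

Each printed line can be pasted into a shell, or the argument lists can be passed to `subprocess.run` once the paths point at real files.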

Also attached is the complete build recipe (as a self-contained makefile) and 
source files (grc-src.tar.xz). I reckon the files should live in 
training/langdata/grc/, though I know Ray has some plans for how the training 
data should be organised in the future. This is how I imagine things being 
organised, anyway. Some of the files in grc-src.tar.xz are themselves 
generated, using the tools in the git repo at 
http://ancientgreekocr.org/grctraining.git, but the grc-src.tar.xz files are 
sufficiently modifiable and self-contained that I think it makes sense to host 
them with the other training data.

Original issue reported on code.google.com by nick.wh...@durham.ac.uk on 26 Apr 2014 at 9:44

Attachments:

GoogleCodeExporter commented 9 years ago
Also, the new website for it should be mentioned on the download page: 
http://ancientgreekocr.org

Original comment by nick.wh...@durham.ac.uk on 5 May 2014 at 7:34