HarshUpadhyay / TesseractTrainer

A small framework taking over the manual training process described in the Tesseract3 Wiki: https://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3

Shapeclustering and mftraining #11

Open rodcaverly opened 10 years ago

rodcaverly commented 10 years ago

I am running Tesseract 3.02 and trying to train it with a new font. Because I was having problems with that, I downloaded the standard TIF / BOX pairs and attempted to train Tesseract with those instead. I converted the .G4. versions of the TIFs to uncompressed. Every subset of fonts I have tried produces this error:

C:\Tesseract-OCR\tessdata>..\shapeclustering -F font_properties -U unicharset -O unicharset eng.arial.tr eng.arialbd.tr eng.arialbi.tr eng.ariali.tr
Reading eng.arial.tr ...
Reading eng.arialbd.tr ...
Reading eng.arialbi.tr ...
Reading eng.ariali.tr ...
Font id = -1/0, class id = 1/108 on sample 0
font_id >= 0 && font_id < font_idmap.SparseSize():Error:Assert failed:in file ....\classify\trainingsampleset.cpp, line 622

Any assistance please.

nagexiucai commented 7 years ago

How is it going now?
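
A font id of -1 in the log above suggests that the font name recorded in the training samples was not found in font_properties, either because the entry is missing or because the file was empty or not picked up. That reading is an assumption, not something confirmed in this thread. As a rough sketch, the Python check below compares the font part of each .tr file name against the entries in a font_properties file. It assumes the 3.0x layout of one font per line, `<fontname> <italic> <bold> <fixed> <serif> <fraktur>`, with each flag 0 or 1, and assumes .tr files are named `<lang>.<fontname>[.expN].tr`. The function names are hypothetical helpers, not part of Tesseract or TesseractTrainer.

```python
from pathlib import Path


def font_name_from_tr(tr_filename):
    """Return the font part of a name such as 'eng.arial.tr' or 'eng.arial.exp0.tr'.

    Assumes the naming scheme <lang>.<fontname>[.expN].tr.
    """
    parts = Path(tr_filename).name.split(".")
    if len(parts) < 3 or parts[-1] != "tr":
        raise ValueError(f"unexpected .tr file name: {tr_filename}")
    return parts[1]


def parse_font_properties(text):
    """Parse font_properties text into {fontname: (italic, bold, fixed, serif, fraktur)}.

    Assumes one font per line: a name followed by five 0/1 flags.
    """
    fonts = {}
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 6 or any(flag not in ("0", "1") for flag in fields[1:]):
            raise ValueError(f"line {line_no}: expected a name and five 0/1 flags")
        fonts[fields[0]] = tuple(int(flag) for flag in fields[1:])
    return fonts


def missing_fonts(tr_files, font_properties_text):
    """Return the .tr files whose font name has no entry in font_properties."""
    known = parse_font_properties(font_properties_text)
    return [name for name in tr_files if font_name_from_tr(name) not in known]


if __name__ == "__main__":
    # Illustrative font_properties content; the flag values are made up.
    sample = "arial 0 0 0 0 0\narialbd 0 1 0 0 0\n"
    tr_files = ["eng.arial.tr", "eng.arialbd.tr", "eng.arialbi.tr", "eng.ariali.tr"]
    for name in missing_fonts(tr_files, sample):
        print(f"no font_properties entry for {name}")
```

If every .tr file is reported as missing, it may be worth checking that font_properties is in the directory the command is run from and is not empty. The font name stored inside a .tr file comes from the image name used when that file was generated, so a renamed .tr file can still carry the old name.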