Breta01 / handwriting-ocr

OCR software for recognition of handwritten text
MIT License
753 stars 240 forks

WordClassifier-Seq2Seq, WordClassifier-Seq2SeqX #60

Open souaissa opened 6 years ago

souaissa commented 6 years ago

Hi,

What is the role of WordClassifier-Seq2Seq and WordClassifier-Seq2SeqX, and how do I train them?

Breta01 commented 6 years ago

They are different approaches to word classification, both using a seq2seq model. Seq2SeqX is my own extended version of this model. Right now, though, the CTC model seems to perform better. To train these models you will need data in the same format as in the data/words2 folder.

souaissa commented 6 years ago

Thanks Breta,

How do I create the IAM words data together with the .txt files?

How can I improve the segmentation of images into words?

Breta01 commented 6 years ago

The script in the scripts folder should pre-process the words from the IAM dataset. For training the CTC model you don't need the .txt files. These files are used only by the Seq2SeqX and Gap Classifiers.