Belval / TextRecognitionDataGenerator

A synthetic data generator for text recognition
MIT License
3.22k stars, 965 forks

Format of labels .txt is not as expected by easy OCR #262

Open pydev2018 opened 2 years ago

pydev2018 commented 2 years ago

I have successfully generated both the images and the labels, but the output is not in the format the EasyOCR trainer expects, which is CSV:

filename,words
2.jpg,Tel
4.jpg,Lindsey_Coge Vaqueros nuns
3.jpg,Jinglei @ Extensive
13.jpg,~Bombed Discourse
1.jpg,whishes Jews Lunenburg $
6.jpg,Enough Wiener
16.jpg,129 Reykjavik Ch

Is there any way to generate the labels in the format shown above?

Belval commented 2 years ago

You can use --name_format 2, but that writes a single labels file with the filename and text separated by a space, not a comma.

You could just make a code change here: https://github.com/Belval/TextRecognitionDataGenerator/blob/master/trdg/run.py#L473
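
If you would rather not patch run.py, a post-processing step is another option. The following is a minimal sketch, not part of TRDG: it assumes --name_format 2 produced a labels.txt in which each line is the image filename, one space, then the label text. The function name labels_to_csv and the file paths are hypothetical. It writes the text unquoted, matching the sample in the question, so it assumes the consumer splits each row on the first comma only; if your loader is a standard CSV parser and your labels contain commas or quotes, use Python's csv.writer instead.

```python
from pathlib import Path


def labels_to_csv(labels_txt, labels_csv):
    """Convert '<filename> <text>' lines into 'filename,words' rows.

    Assumes filenames contain no spaces (e.g. '12.jpg'); everything after
    the first space is treated as the label text. Returns the row count.
    """
    rows = []
    for line in Path(labels_txt).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split on the first space only: the label itself may contain spaces.
        filename, _, text = line.partition(" ")
        rows.append(f"{filename},{text}")
    # Header row matches the column names shown in the question.
    Path(labels_csv).write_text(
        "\n".join(["filename,words", *rows]) + "\n", encoding="utf-8"
    )
    return len(rows)
```

For example, labels_to_csv("out/labels.txt", "out/labels.csv") would turn a line such as "2.jpg Tel" into "2.jpg,Tel", where out/ stands for whatever output directory you passed to the generator.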