Good day,

I've been reading the dup_boxes_synth_text.py script from your data conversion scripts repository. If I'm not mistaken, this is the script to use to convert the SynthText dataset for training.

Lines 59 through 62 reference three files: imnames.np.npy, wordBB.np.npy, and gt_txt.npz. How should I generate these files? Do I have to modify the gen.py script from the SynthText GitHub repository to generate them, or are they created from the gt.mat file that comes with the pre-generated SynthText dataset (800,000 images) linked in the SynthText GitHub repository?

If so, could you tell me the format of the data within these files, or point to or provide a script that does this?

Your help is greatly appreciated.