ghost opened this issue 4 years ago
Hi,
Because, within one batch, the images are resized according to imgW = max(imgH * self.min_ratio, imgW).
So if your images are sorted by text length, images that end up with the same width will be in the same batch.
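To make the behaviour concrete, here is a minimal sketch of that width computation (a hypothetical reimplementation of the keep_ratio logic in crnn.pytorch's collate function, not the repo's exact code). The batch width is driven by the largest w/h ratio in the batch, which is why only that one image keeps its original aspect ratio:

```python
import math

def batch_width(images_hw, imgH=32, imgW=100, min_ratio=1):
    """Sketch (assumed, not the repo's exact code) of how a shared
    batch width can be derived under keep_ratio=True.

    images_hw: list of (height, width) pairs for one batch.
    """
    ratios = [w / float(h) for (h, w) in images_hw]
    max_ratio = max(ratios)
    # width needed so the widest-ratio image keeps its aspect ratio
    new_imgW = int(math.floor(max_ratio * imgH))
    # the line quoted above: imgW = max(imgH * self.min_ratio, imgW)
    new_imgW = max(imgH * min_ratio, new_imgW)
    return new_imgW
```

For example, a batch of a 32x64 and a 32x128 image gets width 128: the 32x128 image keeps its ratio, while the 32x64 image is stretched to 32x128.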
Great, you are so smart!!! But if we set shuffle=True in the train_loader, then what you mentioned will not work, right? Thank you so much.
So we can set shuffle=False
Setting shuffle=False in the data loader will cause batches to contain the same set of samples in every epoch, right? I think it might cause overfitting. Ideally we should shuffle the data every epoch. But if you shuffle, sorting by length makes no sense either :)
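One common compromise (my own sketch, not something in the repo): shuffle globally each epoch, then sort by length only within large chunks. Batches drawn from a chunk still have similar lengths, but batch composition changes every epoch:

```python
import random

def length_grouped_indices(lengths, batch_size, epoch, group_factor=50):
    """Sketch of per-epoch shuffling that still groups similar lengths.

    All names here are hypothetical. Each epoch gets a fresh global
    shuffle; indices are then sorted by length inside chunks of
    batch_size * group_factor, so consecutive batches contain samples
    of similar length while batch membership varies across epochs.
    """
    rng = random.Random(epoch)  # fresh randomness every epoch
    order = list(range(len(lengths)))
    rng.shuffle(order)
    group = batch_size * group_factor
    out = []
    for i in range(0, len(order), group):
        chunk = sorted(order[i:i + group], key=lambda j: lengths[j])
        out.extend(chunk)
    return out
```

In PyTorch this would typically be wrapped in a custom Sampler and passed to the DataLoader instead of shuffle=True.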
Based on the above code, even if we set keep_ratio = True, we cannot keep the original aspect ratio of every image unchanged. Actually, only the aspect ratio of the image with the maximum w/h ratio in the batch is preserved.
So, in general, how do we keep the ratio unchanged for all the images? By zero padding?
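Zero padding would indeed do it. A minimal sketch (my own, not the repo's code, using a simple nearest-neighbour resize to stay dependency-free): resize each image to the target height at its own aspect ratio, then right-pad with zeros to the widest image in the batch:

```python
import numpy as np

def pad_batch(images, imgH=32):
    """Sketch: keep every image's own aspect ratio via zero padding.

    images: list of 2-D float arrays (h, w). Each is resized to height
    imgH at its own w/h ratio, then zero-padded on the right to the
    widest resized image in the batch.
    """
    resized = []
    for img in images:
        h, w = img.shape
        new_w = max(1, int(round(w * imgH / float(h))))
        # nearest-neighbour resize via index arithmetic (no deps)
        rows = np.arange(imgH) * h // imgH
        cols = np.arange(new_w) * w // new_w
        resized.append(img[rows][:, cols])
    maxW = max(r.shape[1] for r in resized)
    batch = np.zeros((len(images), imgH, maxW), dtype=np.float32)
    for i, r in enumerate(resized):
        batch[i, :, :r.shape[1]] = r  # zeros fill the padded region
    return batch
```

Note that with CTC-style training the padded region is just blank background, so the model usually learns to emit blanks there; some implementations pad with the image's edge pixels or the dataset mean instead of zeros.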
I'm very curious about this statement from https://github.com/meijieru/crnn.pytorch, see below: