Holmeyoung / crnn-pytorch

Pytorch implementation of CRNN (CNN + RNN + CTCLoss) for all language OCR.

Issue with keep_ratio=True #52

Open ghost opened 4 years ago

ghost commented 4 years ago

import numpy as np
import torch

# resizeNormalize is defined alongside this class in the repo:
# it resizes an image to a fixed (imgW, imgH), converts it to a tensor
# and normalizes it

class alignCollate(object):

    def __init__(self, imgH=32, imgW=100, keep_ratio=False, min_ratio=1):
        self.imgH = imgH
        self.imgW = imgW
        self.keep_ratio = keep_ratio
        self.min_ratio = min_ratio

    def __call__(self, batch):
        images, labels = zip(*batch)

        imgH = self.imgH
        imgW = self.imgW
        if self.keep_ratio:
            # take the largest w/h ratio in the batch and derive a single width from it
            ratios = []
            for image in images:
                w, h = image.size
                ratios.append(w / float(h))
            ratios.sort()
            max_ratio = ratios[-1]
            imgW = int(np.floor(max_ratio * imgH))
            imgW = max(imgH * self.min_ratio, imgW)  # assure imgW >= imgH * min_ratio

        # every image in the batch is stretched to the same (imgW, imgH)
        transform = resizeNormalize((imgW, imgH))
        images = [transform(image) for image in images]
        images = torch.cat([t.unsqueeze(0) for t in images], 0)

        return images, labels

Based on the above code, even if we set keep_ratio=True, the original aspect ratio of each image is not preserved. In fact, only the image with the maximum w/h ratio in the batch keeps its aspect ratio; every other image is stretched to that same width.

So, generally, how can we keep the ratio unchanged for all images? By zero padding?
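
For illustration, here is a minimal sketch of such a padding collate (the name padCollate is made up and not part of this repo; it assumes PIL images and mimics resizeNormalize's to-tensor and normalize steps):

import numpy as np
import torch
from torchvision import transforms

class padCollate(object):
    """Hypothetical collate_fn: keep every image's own ratio, zero-pad to the batch max width."""

    def __init__(self, imgH=32):
        self.imgH = imgH
        self.toTensor = transforms.ToTensor()

    def __call__(self, batch):
        images, labels = zip(*batch)

        tensors = []
        for image in images:
            w, h = image.size
            # resize to a fixed height; width follows this image's own ratio
            new_w = max(1, int(np.floor(w * self.imgH / float(h))))
            t = self.toTensor(image.resize((new_w, self.imgH)))
            t.sub_(0.5).div_(0.5)  # same normalization as resizeNormalize
            tensors.append(t)

        # pad every tensor on the right up to the widest one in the batch;
        # assumes all images share the same channel count
        max_w = max(t.size(2) for t in tensors)
        padded = torch.zeros(len(tensors), tensors[0].size(0), self.imgH, max_w)
        for i, t in enumerate(tensors):
            padded[i, :, :, :t.size(2)] = t

        return padded, labels

Whether padding actually helps will depend on the model; CTC can learn to emit blanks over the padded region, but it is only a sketch, not something the repo provides.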

I'm also very curious about this statement in https://github.com/meijieru/crnn.pytorch (see below):

Construct dataset following origin guide. If you want to train with variable length images (keep the origin ratio for example), please modify the tool/create_dataset.py and sort the image according to the text length.

Holmeyoung commented 4 years ago

Hi, because within one batch all the images are resized to the same width, computed per batch via imgW = max(imgH * self.min_ratio, imgW).

So if your images are sorted by text length, images with similar widths will end up in the same batch, and the per-batch resize distorts them very little.
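
As a sketch of that sorting step (the samples list is a made-up example; in practice the sort would be applied to the input list before tool/create_dataset.py writes it into lmdb):

# hypothetical (image_path, label) pairs that feed dataset creation
samples = [("a.jpg", "hello"), ("b.jpg", "hi"), ("c.jpg", "greetings")]

# sort by text length so that, with shuffle=False, consecutive batches
# contain images of similar width and the per-batch resize distorts little
samples.sort(key=lambda s: len(s[1]))
# -> [("b.jpg", "hi"), ("a.jpg", "hello"), ("c.jpg", "greetings")]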

ghost commented 4 years ago

Great, you are so smart!!! But if we set shuffle=True in train_loader, then what you mentioned will not work, right? Thank you so much.

Holmeyoung commented 4 years ago

So we can set shuffle=False.
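
For example (a sketch; dataset stands for a dataset instance already sorted by text length):

from torch.utils.data import DataLoader

train_loader = DataLoader(
    dataset,                 # assumed to be length-sorted already
    batch_size=64,
    shuffle=False,           # keep the sorted order so batch widths stay similar
    collate_fn=alignCollate(imgH=32, imgW=100, keep_ratio=True),
)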

mineshmathew commented 3 years ago

Setting shuffle=False in the data loader will cause every epoch to see the same set of batches, right? I think it might cause overfitting. Ideally we should shuffle the data every epoch. But if you shuffle, sorting by length makes no sense either :)
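
One common compromise, sketched below rather than taken from this thread: keep the length-sorted batches fixed, but shuffle the order of the batches every epoch with a custom batch sampler.

import random
from torch.utils.data import Sampler

class ShuffledBatchSampler(Sampler):
    """Hypothetical sampler: length-sorted batches, batch order reshuffled per epoch."""

    def __init__(self, num_samples, batch_size):
        # indices are assumed to already be in length-sorted dataset order
        self.batches = [
            list(range(i, min(i + batch_size, num_samples)))
            for i in range(0, num_samples, batch_size)
        ]

    def __iter__(self):
        random.shuffle(self.batches)  # new batch order each epoch
        return iter(self.batches)

    def __len__(self):
        return len(self.batches)

It would be passed as DataLoader(dataset, batch_sampler=ShuffledBatchSampler(len(dataset), 64), collate_fn=alignCollate(keep_ratio=True)): samples inside each batch still have similar lengths, but the model no longer sees the batches in the same order every epoch.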