jbohnslav / opencv_transforms

OpenCV implementation of Torchvision's image augmentations
MIT License

Gray image error #6

Closed HansonXia closed 4 years ago

HansonXia commented 5 years ago

If the input image is grayscale, I think you should check len(image.shape) == 2 instead of image.shape[2] == 1, since a two-dimensional (H, W) array has no third entry in its shape.
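To make the difference between the two checks concrete, here is a minimal NumPy sketch. The helper name is_grayscale is hypothetical and not part of opencv_transforms, and the (H, W, C) layout is assumed because that is how OpenCV returns images.

```python
import numpy as np


def is_grayscale(image):
    """Hypothetical helper: True for (H, W) or (H, W, 1) arrays."""
    if image.ndim == 2:
        return True
    return image.ndim == 3 and image.shape[2] == 1


gray_2d = np.zeros((224, 224), dtype=np.uint8)     # no channel axis
gray_3d = np.zeros((224, 224, 1), dtype=np.uint8)  # singleton channel axis
rgb = np.zeros((224, 224, 3), dtype=np.uint8)

# Indexing shape[2] on a 2-D array fails, because the shape tuple
# only has two entries.
try:
    gray_2d.shape[2] == 1
    raised = False
except IndexError:
    raised = True

print(raised, is_grayscale(gray_2d), is_grayscale(gray_3d), is_grayscale(rgb))
```

A check of this kind accepts both grayscale layouts, whereas image.shape[2] == 1 alone only works when the channel axis is already present.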

jbohnslav commented 4 years ago

Hi there,

Thanks for your comment. Grayscale images usually have only two dimensions, e.g. (H, W). However, since this package is meant to be used with PyTorch CNNs, we still need a channel dimension. For example, you can use a torch.nn.Conv2d layer on an image of shape (1, 1, 224, 224) (Batch x Channels x Height x Width). If you try to use a Conv2d on a grayscale image of shape (1, 224, 224) (Batch x Height x Width), however, you get this error:

RuntimeError: Expected 4-dimensional input for 4-dimensional weight ..., but got 3-dimensional input of size [1, 224, 224] instead.

So, I'm going to keep mandating that grayscale images have a bogus channel dimension.
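For anyone loading grayscale images as (H, W) arrays, one way to satisfy this requirement is to add the singleton channel axis before calling the transforms. The following is only a sketch: ensure_channel_dim is a hypothetical helper, not a function in this package, and the (H, W, C) input layout is an assumption based on OpenCV's convention.

```python
import numpy as np


def ensure_channel_dim(image):
    """Hypothetical helper: turn an (H, W) array into (H, W, 1)."""
    if image.ndim == 2:
        return image[:, :, np.newaxis]
    return image


gray = np.zeros((224, 224), dtype=np.uint8)

# (H, W) -> (H, W, 1): the layout with an explicit channel axis.
hwc = ensure_channel_dim(gray)

# (H, W, 1) -> (1, 1, H, W): Batch x Channels x Height x Width,
# the layout described above for Conv2d.
nchw = np.transpose(hwc, (2, 0, 1))[np.newaxis]

print(hwc.shape, nchw.shape)
```

Images that already have a channel axis pass through unchanged, so the helper can be applied to every input.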