Closed: HansonXia closed this issue 4 years ago
Hi there,
Thanks for your comment. When dealing with grayscale images, of course we usually only have two dimensions, e.g. (H,W). However, since this package is meant to be used with PyTorch CNNs, we still need a channel dimension-- for example, you can use a torch.nn.Conv2d
layer on a image of shape (1, 1, 224, 224) (Batch x Channels x Height x Width). However, if you try to use a Conv2d on a grayscale image of shape (1, 224, 224) (Batch x Height x Width), you get this error:
RuntimeError: Expected 4-dimensional input for 4-dimensional weight ..., but got 3-dimensional input of size [1, 224, 224] instead
So, I'm going to keep mandating that grayscale images have a bogus channel dimension.
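A minimal sketch of the shapes involved (assuming PyTorch is installed; the layer sizes here are arbitrary, not from this package):

```python
import torch

# Grayscale image loaded without a channel dimension: (H, W).
image = torch.randn(224, 224)

# Conv2d's batched path expects (Batch, Channels, Height, Width),
# so we add a bogus batch and channel dimension.
batched = image.unsqueeze(0).unsqueeze(0)  # -> (1, 1, 224, 224)

conv = torch.nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3)
out = conv(batched)
print(tuple(out.shape))  # (1, 8, 222, 222)
```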
If the input image is grayscale, I think you should use len(image.shape) == 2 instead of image.shape[2] == 1, since indexing shape[2] on a 2-D array raises an IndexError.
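A quick sketch of why checking the number of dimensions is safer, using NumPy arrays as stand-ins (the helper name is made up for illustration):

```python
import numpy as np

gray = np.zeros((224, 224))      # grayscale: no channel axis
rgb = np.zeros((224, 224, 3))    # color: channel axis last

# Hypothetical helper: gray.shape[2] would raise IndexError on the
# 2-D array, so test the number of dimensions instead.
def is_grayscale(img):
    return len(img.shape) == 2

print(is_grayscale(gray), is_grayscale(rgb))  # True False
```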