Closed AashishV closed 7 years ago
What software are you using to view 4 channels? Is this CMYK? You should be able to interpret the images as 3-channel (RGB).
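A quick way to tell what the four channels are is to ask PIL for the image mode and band names. The sketch below uses a synthetic image as a stand-in (an assumption, since the dataset file is not available here); on a real file you would call `Image.open(...)` instead of `Image.new(...)`.

```python
from PIL import Image

# Synthetic stand-in for a dataset image: 400 px wide, 100 px high, RGBA.
# Replace with Image.open(<your file>) to inspect a real image.
sample = Image.new("RGBA", (400, 100), (255, 128, 0, 255))

# mode is "RGBA" for red/green/blue/alpha and "CMYK" for cyan/magenta/yellow/black.
print(sample.mode)
# getbands() names each channel in order.
print(sample.getbands())
```

If the mode is `RGBA`, the fourth channel is alpha (opacity) rather than a fourth colour.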
from PIL import Image
import numpy as np
img_filename = '../../../Dataset/nlvr/train/images/1/train-1196-0-0.png'
img = Image.open(img_filename)
print(np.array(img).shape)
This prints (100, 400, 4), i.e. the image has 4 channels.
So, I am using the below snippet:
from PIL import Image
import numpy as np
img_filename = '../../../Dataset/nlvr/train/images/1/train-1196-0-0.png'
img = Image.open(img_filename).convert('RGB')
print(np.array(img).shape)
This prints (100, 400, 3), but I still have to divide the pixel values by 255 so that they lie between 0 and 1.
I'm not very familiar with PIL, and in our code we use scipy imread (which calls PIL and gives us four channels -- we throw away the last channel). The last channel value is 255 for us, so I would guess this is some kind of alpha value. I'd suggest ignoring this channel because the values are all the same for each example. I'd also suggest dividing by 255. But you can investigate with PIL to see if it can do all of this by default for you. Does this answer your question?
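The two suggestions above (drop the last channel, divide by 255) can be sketched as follows. This is a minimal example, not the code we actually run: `rgba_to_unit_rgb` is a hypothetical helper name, and the input is a synthetic fully opaque image standing in for a dataset file. It only discards the fourth channel after checking that it really is 255 everywhere.

```python
import numpy as np
from PIL import Image


def rgba_to_unit_rgb(img):
    """Hypothetical helper: drop a constant alpha channel and scale to [0, 1]."""
    arr = np.array(img)  # shape (height, width, channels), dtype uint8
    if arr.ndim == 3 and arr.shape[2] == 4:
        alpha = arr[..., 3]
        # Only discard alpha when it carries no information (fully opaque).
        if not np.all(alpha == 255):
            raise ValueError("alpha channel is not constant 255")
        arr = arr[..., :3]
    # Convert to float and rescale from [0, 255] to [0, 1].
    return arr.astype(np.float32) / 255.0


# Synthetic stand-in for a dataset image: 400 px wide, 100 px high, fully opaque.
demo = Image.new("RGBA", (400, 100), (255, 128, 0, 255))
rgb = rgba_to_unit_rgb(demo)
print(rgb.shape, rgb.dtype)
```

Calling `Image.open(path).convert('RGB')` and then dividing by 255 gives the same shape; the explicit check here just guards against silently dropping a meaningful alpha channel.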
Yes, this does answer my question. Thank you for the quick reply.
The given images have 4 colour channels, and the last channel looks like this:
Is there any particular reason for this?