[Open] SoraDevin opened this issue 5 years ago
Have you solved this problem yet? I have the same question. @SoraDevin
Unfortunately not. There are a bunch of other augmentations I'm able to work on using 3-channels, and I am just converting the grayscale images to 3-channel when I load them in the function load_image, so I haven't been trying to debug this further. I still have no idea what to change. I will update this if I do find a solution, but hearing from someone else who's already done grayscale images would be nice.
@SoraDevin I want to ask how you converted the image into 3 channels in the function load_image. I made a mistake in the conversion.
@yuannver, there are a few ways; you can simply copy the single channel into two extra channel dimensions. I am just using skimage's gray2rgb function like so:
```python
def load_image(self, image_id):
    """Load the specified image and return a [H,W,3] Numpy array.
    Taken from utils.py; any refinements we need can be done here.
    """
    # Load image
    image = skimage.io.imread(self.image_info[image_id]['path'])
    # If grayscale, convert to RGB for consistency.
    if image.ndim != 3:
        image = skimage.color.gray2rgb(image)
    # If it has an alpha channel, remove it for consistency.
    if image.shape[-1] == 4:
        image = image[..., :3]
    return image

def image_reference(self, image_id):
    """Return the path of the image."""
    info = self.image_info[image_id]
    return info["path"]
```
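The "copy the image to extra channel dimensions" alternative mentioned above can be sketched with plain NumPy; for this purpose it is equivalent to what gray2rgb does (the array values below are made up for illustration):

```python
import numpy as np

# Fake 4x5 grayscale image standing in for a loaded file.
gray = np.arange(20, dtype=np.uint8).reshape(4, 5)

# Replicate the single channel three times along a new last axis,
# which is effectively what skimage.color.gray2rgb does.
rgb = np.stack([gray, gray, gray], axis=-1)
print(rgb.shape)  # (4, 5, 3)
```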
1) Change the input dimension by setting the channel count to 1 instead of 3:

```python
self.IMAGE_CHANNEL_COUNT = 1
self.IMAGE_SHAPE = np.array([self.IMAGE_MAX_DIM, self.IMAGE_MAX_DIM, self.IMAGE_CHANNEL_COUNT])
```

2) Change the mean pixel array to length 1, as there is just one channel now:

```python
MEAN_PIXEL = np.array([123.7])  # instead of np.array([123.7, 116.8, 103.9])
```

3) Also, change the padding array in the utils.resize_image() function:

```python
padding = [(top_pad, bottom_pad)]
# instead of [(top_pad, bottom_pad), (left_pad, right_pad), (0, 0)]
```

4) Change the load_image() function in the utils.py file to return single-channel input:

```python
image = skimage.io.imread(self.image_info[image_id]['path'])
# If RGB, convert to grayscale for consistency.
if image.ndim == 3:
    image = skimage.color.rgb2gray(image)
# Extend the array to shape (H, W, 1).
image = image[..., np.newaxis]
```
This worked for me when training on a grayscale image dataset. However, I trained the model from scratch, so I am not sure how to ignore the first layer when training with pretrained weights.
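One thing to watch with step 4 (my own observation, not something stated in this thread): skimage.color.rgb2gray returns a float image scaled to [0, 1], so a MEAN_PIXEL value like 123.7 no longer matches the data range. Rescaling back to 0-255 is one option, sketched here with a made-up uniform image:

```python
import numpy as np
import skimage.color

# Fake 2x2 RGB image (assumption: uint8 input like a typical PNG/JPEG).
rgb = np.full((2, 2, 3), 128, dtype=np.uint8)

gray = skimage.color.rgb2gray(rgb)  # float64, values in [0, 1]
print(gray.dtype, gray.max() <= 1.0)

# Rescale to 0-255 so a mean-pixel value like 123.7 stays meaningful,
# then add the trailing channel axis.
gray255 = (gray * 255.0)[..., np.newaxis]
print(gray255.shape)  # (2, 2, 1)
```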
The padding change is something I didn't do, and it looks like it might help! The wiki (and my earlier post) also show how to add the first layer back, since it needs to be trained. I also found that training all layers anyway improved my performance.
@SoraDevin @deepikakanade I'm getting this after making the changes mentioned:

```
batch_images[b] = mold_image(image.astype(np.float32), config)
ValueError: could not broadcast input array from shape (1024,720,1) into shape (1024,722,1)
```

Any ideas how to get past this? I'm guessing it's something to do with the padding changes.
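A likely cause (my reading of the error, not confirmed in the thread): with an (H, W, 1) image, np.pad still needs a pad spec for every axis. Dropping the (left_pad, right_pad) entry, as step 3 above suggests, means the width axis is never padded, which matches the 720-vs-722 mismatch in the traceback. The original three-entry padding works unchanged for a single channel:

```python
import numpy as np

# Fake single-channel image; the shapes mirror the ones in the error
# message (height already at 1024, width 720 needing 2 pixels of pad).
image = np.zeros((1024, 720, 1), dtype=np.float32)

top_pad, bottom_pad = 0, 0
left_pad, right_pad = 1, 1  # pad width 720 -> 722

# The original Mask R-CNN padding spec also works for (H, W, 1):
padding = [(top_pad, bottom_pad), (left_pad, right_pad), (0, 0)]
padded = np.pad(image, padding, mode='constant', constant_values=0)
print(padded.shape)  # (1024, 722, 1)
```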
I'm trying to change the network to accept grayscale images and have followed the wiki steps to exclude conv1 when loading weights and to include it, in addition to heads, when training:

```python
model.load_weights(weights_path, by_name=True, exclude=["conv1", ...])
```

I am running into a shape mismatch when trying to train. Were there some other modifications I needed to perform to fix this? I've followed all the steps listed on the wiki, and I also modified model.py to include conv1 in "heads". My images are 12-bit, so I will try changing them and see if that helps, but this shape mismatch with the input layer has me stumped. Did anyone else run into this issue when using grayscale images?
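Since the images are 12-bit, one possibility worth ruling out first (an assumption on my part, not something confirmed in this thread) is the value range rather than the layer shapes: the MEAN_PIXEL values assume 0-255 data. A plain NumPy sketch of rescaling 12-bit data into that range:

```python
import numpy as np

# Fake 12-bit image: values in [0, 4095] stored in a uint16 container,
# a common layout for 12-bit captures (made-up sample values).
img12 = np.array([[0, 2048], [4095, 1024]], dtype=np.uint16)

# Rescale 0-4095 -> 0-255 and add the trailing channel axis
# expected by the single-channel pipeline above.
img8 = (img12.astype(np.float32) / 4095.0 * 255.0).astype(np.uint8)
img8 = img8[..., np.newaxis]
print(img8.shape)  # (2, 2, 1)
print(img8.min(), img8.max())  # 0 255
```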