Align mask size to image size when using different training sizes

poempel123 commented 3 years ago

Hi Gilbert,

thank you for the great tutorial. After some work I got the segmentation running on Colab with my own dataset. I labeled a handful of pictures with labelme.

Unfortunately, the images all have different heights and widths. When I import them with your code, the masks from the .json files are sized to 600x800 and no longer align with the images (you can see it in the attached picture).

I tried to use the code from the balloon sample, which is also a dataset with different image sizes, but I can't get it to run.

Can you give me some pointers on how to change your code so that it reads the size of each image and applies it to the generated masks?

Many thanks in advance!

[Attached image: Screenshot 2021-06-06 173356]

TannerGilbert commented 3 years ago

I think the easiest solution might be to resize the original image manually before making predictions and displaying the results. However, if this is not desirable for your application, you could also resize the masks after calling load_mask.
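The second option, resizing the masks after `load_mask`, could look roughly like the sketch below. It is not code from the tutorial: `resize_masks_nearest` is a hypothetical helper, and it uses plain NumPy nearest-neighbor indexing so that the mask values stay binary.

```python
import numpy as np


def resize_masks_nearest(masks, height, width):
    """Resize a [H, W, N] mask stack to [height, width, N].

    Hypothetical helper: nearest-neighbor sampling keeps the values
    binary (no interpolated in-between values).
    """
    src_h, src_w = masks.shape[:2]
    # For every target row/column, pick the source row/column it maps to.
    rows = np.arange(height) * src_h // height
    cols = np.arange(width) * src_w // width
    return masks[rows[:, None], cols[None, :], :]


# Example: shrink a 600x800 stack with two instances to 300x400.
masks = np.zeros((600, 800, 2), dtype=np.uint8)
masks[:300, :, 0] = 1
resized = resize_masks_nearest(masks, 300, 400)
print(resized.shape)
```

Note that this only helps if the polygons were drawn in the coordinate system of the 600x800 canvas. If the labelme points are in the pixel coordinates of the original image, the canvas itself probably needs to be allocated at the image size instead.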

I'll also try to train a model with different image sizes if I find the time.

Kind regards, Gilbert

poempel123 commented 3 years ago

Thank you so much for the fast reply, Gilbert! :-) I would be very grateful if you would try it! My Python skills are very basic and I am trying to implement Mask RCNN in my master's thesis...

I'll definitely try your suggestion!

So far, I've tried to return height and width from load_image and forward them to extract_masks. My plan was to replace the fixed values (600, 800) with the variables height and width.

The problem is that the dependencies between the functions are quite confusing.

My solution approach: changes in utils.py load_image:

```python
def load_image(self, image_id):
    """Load the specified image and return a [H,W,3] Numpy array
    together with its height and width.
    """
    # Load image
    image = skimage.io.imread(self.image_info[image_id]['path'])
    # If grayscale. Convert to RGB for consistency.
    if image.ndim != 3:
        image = skimage.color.gray2rgb(image)
    # If has an alpha channel, remove it for consistency
    if image.shape[-1] == 4:
        image = image[..., :3]
    # Read the size after the conversions; unpacking three values from
    # image.shape before them would fail for grayscale images
    height, width = image.shape[:2]  # added this
    return image, height, width  # added this
```

changes in the notebook:

```python
# Load and display random samples
image_ids = np.random.choice(dataset_train.image_ids, 4)
for image_id in image_ids:
    # added height, width as return of load_image
    image, height, width = dataset_train.load_image(image_id)
    # added height, width as parameter of load_mask
    mask, class_ids = dataset_train.load_mask(image_id, height, width)
    visualize.display_top_masks(image, mask, class_ids, dataset_train.class_names)
```

changes in load_mask:

```python
def load_mask(self, image_id, height, width):  # added height, width as parameter
    # get details of image
    info = self.image_info[image_id]
    # define annotation file location
    path = info['annotation']
    # load masks from the annotation file
    # added height, width as parameters to forward them to extract_masks
    # EDIT: this call originally contained the typo `heigth`
    masks, classes = self.extract_masks(path, height, width)
    return masks, np.asarray(classes, dtype='int32')
```

changes in extract_masks:

```python
# added height, width as parameter
# EDIT: the parameter was originally misspelled `heigth`
def extract_masks(self, filename, height, width):
    json_file = os.path.join(filename)
    with open(json_file) as f:
        img_anns = json.load(f)

    # exchanged 600, 800 by height and width
    masks = np.zeros([height, width, len(img_anns['shapes'])], dtype='uint8')
    classes = []
    for i, anno in enumerate(img_anns['shapes']):
        # exchanged 600, 800 by height and width
        mask = np.zeros([height, width], dtype=np.uint8)
        cv2.fillPoly(mask, np.array([anno['points']], dtype=np.int32), 1)
        masks[:, :, i] = mask
        classes.append(self.class_names.index(anno['label']))
    return masks, classes
```

I hope you can follow my thoughts :-D

Unfortunately, my code is not working. The notebook cell `# Load and display random samples` fails. Even though height and width hold the correct values (hovering over height shows the right value), I'm getting this error:

```
---> 41     masks, classes = self.extract_masks(path, heigth, width)
     42     return masks, np.asarray(classes, dtype='int32')
     43

NameError: name 'heigth' is not defined
```

Maybe this will help you to get it running. Thanks again, Gilbert!

Kind regards Henning

EDIT: The code is finally working!! :-) It was just a small typo. Feel free to add the code to your tutorial! :-)

EDIT 2: Unfortunately, I celebrated too early. Now the training part is not working, because load_image_gt() in model.py calls load_mask without the new arguments: `TypeError: load_mask() missing 2 required positional arguments: 'height' and 'width'`
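One possible way around this TypeError is to leave the signatures of `load_image` and `load_mask` untouched and read the image size from the annotation file itself. The sketch below assumes that the labelme .json files contain the keys `imageHeight` and `imageWidth` (labelme normally writes them, but this should be checked against the files at hand). `allocate_masks` is a hypothetical helper, not part of the tutorial or of Mask RCNN.

```python
import numpy as np


def allocate_masks(img_anns):
    """Allocate an all-zero [H, W, N] mask stack for one labelme annotation.

    Assumption: `img_anns` is the dict loaded from a labelme .json file
    and contains the `imageHeight` and `imageWidth` keys.
    """
    height = img_anns["imageHeight"]
    width = img_anns["imageWidth"]
    return np.zeros([height, width, len(img_anns["shapes"])], dtype=np.uint8)


# Example with a hand-written annotation dict standing in for json.load(f).
example_anns = {
    "imageHeight": 480,
    "imageWidth": 640,
    "shapes": [
        {"label": "a", "points": [[10, 10], [50, 10], [50, 50]]},
        {"label": "b", "points": [[100, 100], [200, 100], [200, 200]]},
    ],
}
masks = allocate_masks(example_anns)
print(masks.shape)
```

Inside `extract_masks`, the fixed `np.zeros([600, 800, ...])` allocation would then be replaced by this call, and each per-shape mask could be allocated as `np.zeros(masks.shape[:2], dtype=np.uint8)` before `cv2.fillPoly`. With this approach `load_mask(self, image_id)` keeps its single argument and `load_image` keeps returning only the image, so the calls in model.py should not need any changes.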

github-actions[bot] commented 3 years ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days

TannerGilbert / MaskRCNN-Object-Detection-and-Segmentation