Closed poempel123 closed 2 years ago
I think the easiest solution might be to resize the original image manually before making predictions and displaying the results. However, if this is not desirable for your application, you could also resize the masks after calling load_mask.
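A mask-resizing step could look like the following numpy-only sketch (nearest-neighbour index maps so the masks stay binary; `resize_masks` is a hypothetical helper name, and `skimage.transform.resize` with `order=0` would do the same job):

```python
import numpy as np

def resize_masks(masks, height, width):
    """Resize a [H, W, N] mask stack to (height, width) using
    nearest-neighbour index maps, so mask values stay 0/1."""
    rows = np.arange(height) * masks.shape[0] // height
    cols = np.arange(width) * masks.shape[1] // width
    return masks[rows[:, None], cols, :]
```

Called right after load_mask, this would bring the masks to the same size as the displayed image.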
I'll also try to train a model with different image sizes if I find the time.
Kind regards, Gilbert
Thank you so much for the fast reply, Gilbert! :-) I would be very grateful if you would try it! My Python skills are very basic and I'm trying to implement Mask R-CNN for my master's thesis...
I'll definitely try your suggestion!
So far, I've tried to return height and width from load_image and forward them to extract_masks. My plan was to replace the fixed values (600, 800) with the variables height and width.
The problem is that the dependencies between the functions are quite confusing.
My solution approach:
changes in utils.py
load_image:
def load_image(self, image_id):
    """Load the specified image and return a [H, W, 3] Numpy array
    along with the image's height and width."""
    # Load image
    image = skimage.io.imread(self.image_info[image_id]['path'])
    height, width = image.shape[:2]
    # added this (shape[:2] also works for grayscale images)
    # If grayscale. Convert to RGB for consistency.
    if image.ndim != 3:
        image = skimage.color.gray2rgb(image)
    # If has an alpha channel, remove it for consistency
    if image.shape[-1] == 4:
        image = image[..., :3]
    return image, height, width
    # added this
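One caveat: skimage.io.imread returns a 2-D array for grayscale images, so unpacking `height, width, num_channels = image.shape` raises a ValueError there, while `image.shape[:2]` works for every layout. A small check (with dummy numpy arrays standing in for loaded images):

```python
import numpy as np

gray = np.zeros((600, 800), dtype=np.uint8)     # grayscale image: 2-D
rgba = np.zeros((600, 800, 4), dtype=np.uint8)  # RGBA image: 3-D

# Unpacking three values fails on the 2-D grayscale array:
try:
    h, w, c = gray.shape
    unpack_ok = True
except ValueError:
    unpack_ok = False

# shape[:2] gives height and width for both layouts:
gray_size = gray.shape[:2]
rgba_size = rgba.shape[:2]
```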
changes in the notebook:
# Load and display random samples
image_ids = np.random.choice(dataset_train.image_ids, 4)
for image_id in image_ids:
    image, height, width = dataset_train.load_image(image_id)
    # added height, width as return values of load_image
    mask, class_ids = dataset_train.load_mask(image_id, height, width)
    # added height, width as parameters of load_mask
    visualize.display_top_masks(image, mask, class_ids, dataset_train.class_names)
changes in load_mask
def load_mask(self, image_id, height, width):
    # added height, width as parameters
    # get details of image
    info = self.image_info[image_id]
    # define box file location
    path = info['annotation']
    # load XML
    masks, classes = self.extract_masks(path, height, width)
    # added height, width as arguments to forward them to extract_masks
    # EDIT: fixed typo (heigth -> height)
    return masks, np.asarray(classes, dtype='int32')
changes in extract_masks
def extract_masks(self, filename, height, width):
    # added height, width as parameters  EDIT: fixed typo (heigth -> height)
    json_file = os.path.join(filename)
    with open(json_file) as f:
        img_anns = json.load(f)
    masks = np.zeros([height, width, len(img_anns['shapes'])], dtype='uint8')
    # exchanged 600, 800 for height and width  EDIT: fixed typo (witdh -> width)
    classes = []
    for i, anno in enumerate(img_anns['shapes']):
        mask = np.zeros([height, width], dtype=np.uint8)
        # exchanged 600, 800 for height and width
        cv2.fillPoly(mask, np.array([anno['points']], dtype=np.int32), 1)
        masks[:, :, i] = mask
        classes.append(self.class_names.index(anno['label']))
    return masks, classes
I hope you can follow my thoughts :-D
Unfortunately, my code is not working. There is some trouble in the notebook cell # Load and display random samples. Even though the correct values are stored in height and width (hovering over height shows the right value), I'm getting this error:
---> 41 masks, classes = self.extract_masks(path, heigth, width)
     42 return masks, np.asarray(classes, dtype='int32')
     43

NameError: name 'heigth' is not defined
Maybe this will help you to get it running. Thanks again, Gilbert!
Kind regards Henning
EDIT: Code is finally working!! :-) It was just a small typo mistake. Feel free to add the code to your tutorial! :-)
EDIT 2: Unfortunately, I was glad too early. Now the training part is not working because model.py's load_image_gt() still calls load_mask with the old signature:
TypeError: load_mask() missing 2 required positional arguments: 'height' and 'width'
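For the record, that TypeError arises because model.py's data generator calls dataset.load_mask(image_id) with a single argument, so any extra required parameters break training. One way around it is to keep the one-argument signature and store each image's size in image_info when add_image is called. A minimal, self-contained sketch of the pattern (ToyDataset is a hypothetical stand-in, not the real mrcnn.utils.Dataset, and the mask-building step is reduced to a dummy empty mask):

```python
import numpy as np

class ToyDataset:
    """Minimal stand-in for mrcnn.utils.Dataset, just enough to show
    the pattern: record height/width at add_image time, read them back
    in load_mask so its signature stays load_mask(self, image_id)."""
    def __init__(self):
        self.image_info = []

    def add_image(self, annotation_path, height, width):
        # store the per-image size alongside the annotation path
        self.image_info.append(
            {'annotation': annotation_path, 'height': height, 'width': width})

    def load_mask(self, image_id):
        # one-argument signature, as load_image_gt() expects
        info = self.image_info[image_id]
        # build masks at this image's own size (dummy: one empty mask)
        masks = np.zeros([info['height'], info['width'], 1], dtype='uint8')
        classes = [1]
        return masks, np.asarray(classes, dtype='int32')
```

With this layout, extract_masks can receive info['height'] and info['width'] from inside load_mask, and model.py never needs to change.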
Hi Gilbert,
thank you for the great tutorial. After some work I got the segmentation running on Colab with my own dataset. I labeled a handful of pictures with labelme.
Unfortunately, the images all have different heights and widths. When importing them with your code, the masks from the .json files are sized to 600x800 and no longer align with the images (you can see it in the attached picture).
I tried to use the code from the balloon sample, which is also a dataset with different image sizes, but I can't get it to run.
Can you provide me with some information on how to change your code so it gets the sizes of the images and transfers them to the generated masks?
Many thanks in advance!