bethgelab / siamese-mask-rcnn

Siamese Mask R-CNN model for one-shot instance segmentation

One-Shot Detection - input_target shape Error #33

Open JHevia23 opened 2 years ago

JHevia23 commented 2 years ago

Hi! Amazing work and very nice codebase overall. I enjoyed checking the architecture.

I tried testing the model in the "small" configuration with a single query image and a reference, loading both with cv2:

Basically:

import cv2

# Load the reference (class) image and the scene image as RGB arrays
class_img = cv2.imread(f"./data/ligilog/LigiLog-100/classes/images/{sample_class_id}.jpg")
class_img = cv2.cvtColor(class_img, cv2.COLOR_BGR2RGB)
image_img = cv2.imread(f"./data/ligilog/LigiLog-100/src/images/{sample_image_id}.jpg")
image_img = cv2.cvtColor(image_img, cv2.COLOR_BGR2RGB)

model.detect([class_img], [image_img], verbose=3, random_detections=False)[0]

and I'm finding the following issue:

ValueError                                Traceback (most recent call last)
<ipython-input-53-29daa0ddecdf> in <module>
----> 1 model.detect([class_img], [image_img], verbose=3, random_detections=False)[0]
      2 # model.detect([np.reshape(class_img, tuple([1] + list(class_img.shape)))], [image_img], verbose=2, random_detections=False)[0]

~/osod/siamese-mask-rcnn/lib/model.py in detect(self, targets, images, verbose, random_detections, eps)
    769         # CHANGE: Use siamese detection model
    770         detections, _, _, mrcnn_mask, _, _, _ =\
--> 771             self.keras_model.predict([molded_images, image_metas, molded_targets, anchors], verbose=2)
    772         if random_detections:
    773             # Randomly shift the detected boxes

~/anaconda3/envs/aws_neuron_tensorflow_p36/lib/python3.6/site-packages/keras/engine/training.py in predict(self, x, batch_size, verbose, steps)
   1162                              'argument.')
   1163         # Validate user data.
-> 1164         x, _, _ = self._standardize_user_data(x)
   1165         if self.stateful:
   1166             if x[0].shape[0] > batch_size and x[0].shape[0] % batch_size != 0:

~/anaconda3/envs/aws_neuron_tensorflow_p36/lib/python3.6/site-packages/keras/engine/training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, check_array_lengths, batch_size)
    755             feed_input_shapes,
    756             check_batch_axis=False,  # Don't enforce the batch size.
--> 757             exception_prefix='input')
    758 
    759         if y is not None:

~/anaconda3/envs/aws_neuron_tensorflow_p36/lib/python3.6/site-packages/keras/engine/training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix, check_last_layer_shape)
    129                         ': expected ' + names[i] + ' to have ' +
    130                         str(len(shape)) + ' dimensions, but got array '
--> 131                         'with shape ' + str(data_shape))
    132                 if not check_batch_axis:
    133                     data_shape = data_shape[1:]

ValueError: Error when checking input: expected input_target to have 5 dimensions, but got array with shape (1, 57, 266, 3)

I saw that the input_target shape is a function of config.NUM_TARGETS and config.TARGET_SHAPE, but experimenting with those two values didn't resolve it.

Could you point me to the configuration change needed to fix this?

Thanks!

michaelisc commented 2 years ago

Sorry for the late reply: I guess you need to add another dimension to the target (but not the scene) input so that it becomes (1, 1, 57, 266, 3). For few-shot experiments, you can use that axis to pass e.g. 5 or 10 targets.
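A minimal sketch of that fix, using dummy NumPy arrays with the shapes from the traceback (the variable names and the k-shot stacking are illustrative, not from the repo's API):

```python
import numpy as np

# Stand-in for the cv2-loaded reference image from the traceback: (H, W, 3)
class_img = np.zeros((57, 266, 3), dtype=np.uint8)

# Add a leading "number of targets" axis so each entry passed to
# model.detect is (NUM_TARGETS, H, W, 3); the batch axis added during
# molding then yields the expected 5-D input_target, e.g. (1, 1, 57, 266, 3).
target = np.expand_dims(class_img, axis=0)
print(target.shape)  # (1, 57, 266, 3)

# For a hypothetical 5-shot setup, stack 5 references along that axis instead:
five_shot = np.stack([class_img] * 5, axis=0)
print(five_shot.shape)  # (5, 57, 266, 3)
```

The detect call would then be `model.detect([target], [image_img], ...)` rather than `model.detect([class_img], [image_img], ...)`.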

JHevia23 commented 2 years ago

Cool! I tried that and then found I also needed to resize the reference image to 96x96. Do you know if performance varies significantly with reference resizing?

michaelisc commented 2 years ago

I didn't find it to have a huge impact. Scene image size mattered more than reference size.