matterport / Mask_RCNN

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

Custom training using different sizes of images in dataset #2847

Open MatchaCookies opened 2 years ago

MatchaCookies commented 2 years ago

Hi, I am training Mask R-CNN on a custom dataset whose images have different sizes and resolutions. I annotated the images with the VGG Image Annotator (VIA). When I start training, IndexErrors like these appear:

Traceback (most recent call last):
  File "/content/drive/MyDrive/Train_Crack_June19/mrcnn/model.py", line 1870, in data_generator
    use_mini_mask=config.USE_MINI_MASK)
  File "/content/drive/MyDrive/Train_Crack_June19/mrcnn/model.py", line 1385, in load_image_gt
    mask, class_ids = dataset.load_mask(image_id)
  File "coco.py", line 316, in load_mask
    mask[rr, cc, i] = 1
IndexError: index 402 is out of bounds for axis 0 with size 402
ERROR:root:Error processing image {'id': '00401.jpg', 'source': 'object', 'path': 'Dataset/val/00401.jpg', 'width': 224, 'height': 224, 'polygons': [{'name': 'polygon', 'all_points_x': [5, 29, 43, 57, 63, 71, 71, 63, 61, 8, 6], 'all_points_y': [246, 250, 250, 249, 243, 243, 246, 253, 255, 255, 253]}, {'name': 'polygon', 'all_points_x': [97, 90, 105, 106, 116, 113, 106, 104, 108, 94, 86, 77, 72, 62, 57, 51, 38, 31, 12, 24, 41, 54, 73, 67, 88, 99, 84, 71, 83], 'all_points_y': [3, 18, 39, 45, 57, 78, 94, 127, 134, 142, 155, 169, 173, 195, 211, 218, 222, 223, 223, 212, 206, 174, 156, 148, 128, 63, 45, 18, 0]}], 'num_ids': [1, 1]}
Traceback (most recent call last):
  File "/content/drive/MyDrive/Train_Crack_June19/mrcnn/model.py", line 1870, in data_generator
    use_mini_mask=config.USE_MINI_MASK)
  File "/content/drive/MyDrive/Train_Crack_June19/mrcnn/model.py", line 1385, in load_image_gt
    mask, class_ids = dataset.load_mask(image_id)
  File "coco.py", line 316, in load_mask
    mask[rr, cc, i] = 1
IndexError: index 243 is out of bounds for axis 0 with size 224

and so on for other images.

This is my code `

    import json
    import os

    import skimage.io

    def load_coco(self, dataset_dir, subset):
        """Load a subset of the dataset.
        dataset_dir: The root directory of the dataset.
        subset: What to load (train or val)
        """
        # Options of the original COCO loader that are disabled here:
        # year, class_ids, class_map, return_coco, auto_download.

        self.add_class("object", 1, "crack")
        assert subset in ["train", "val"]
        dataset_dir = os.path.join(dataset_dir, subset)

        # Pick the VIA annotation file for the requested subset.
        if subset == "train":
            json_path = os.path.join("/content/drive/MyDrive/Train_Crack_June19/Dataset/train", "train_json.json")
        elif subset == "val":
            json_path = os.path.join("/content/drive/MyDrive/Train_Crack_June19/Dataset/val", "val_json.json")
        with open(json_path) as f:
            annotations = json.load(f)

        # Keep only the images that have at least one annotated region.
        annotations = list(annotations.values())
        annotations = [a for a in annotations if a['regions']]

        for a in annotations:
            # Get the x, y coordinates of the points of the polygons that make up
            # the outline of each object instance. These are stored in the
            # shape_attributes (see the VIA JSON format).
            polygons = [r['shape_attributes'] for r in a['regions']]
            objects = [s['region_attributes']['crack'] for s in a['regions']]
            print("objects:", objects)
            name_dict = {"crack": 1}

            # Map each region's class name to its numeric class ID.
            num_ids = [name_dict[obj] for obj in objects]

            # load_mask() needs the image size to convert polygons to masks.
            # Unfortunately, VIA doesn't include it in JSON, so we must read
            # the image. This is only manageable since the dataset is tiny.
            print("numids", num_ids)
            image_path = os.path.join(dataset_dir, a['filename'])
            image = skimage.io.imread(image_path)
            height, width = image.shape[:2]

            self.add_image(
                "object",  # for a single class just add the name here
                image_id=a['filename'],  # use file name as a unique image id
                path=image_path,
                width=width, height=height,
                polygons=polygons,
                num_ids=num_ids
            )`

`

    import numpy as np
    import skimage.draw

    def load_mask(self, image_id):
        """Generate instance masks for an image.
        Returns:
        masks: A uint8 array of shape [height, width, instance count] with
            one mask per instance.
        class_ids: a 1D array of class IDs of the instance masks.
        """
        # If the image does not come from the "object" source, delegate to
        # the parent class.
        info = self.image_info[image_id]
        if info["source"] != "object":
            return super(self.__class__, self).load_mask(image_id)

        # Convert polygons to a bitmap mask of shape
        # [height, width, instance_count]
        num_ids = info['num_ids']
        mask = np.zeros([info["height"], info["width"], len(info["polygons"])],
                        dtype=np.uint8)
        for i, p in enumerate(info["polygons"]):
            # Get indexes of pixels inside the polygon and set them to 1.
            # This is the line that raises the IndexError when a polygon
            # reaches outside the image.
            rr, cc = skimage.draw.polygon(p['all_points_y'], p['all_points_x'])
            mask[rr, cc, i] = 1

        # Return the mask and an array with the class ID of each instance.
        num_ids = np.array(num_ids, dtype=np.int32)
        return mask, num_ids`

Does training with images of different sizes cause this?

I saw this answer https://github.com/matterport/Mask_RCNN/issues/636#issuecomment-447751080 but all it did was modify the `mask[rr, cc, i] = 1` line. I reviewed my annotations, and they don't have any overlapping vertices.
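A note on the log above: the record for 00401.jpg is 224 x 224, yet the first polygon has y values up to 255, so at least some polygons reach outside their image. One possible explanation (an assumption, not something the log confirms) is that the annotations were drawn on a copy of the image with a different size. A minimal sketch for listing such polygons, assuming records shaped like the `image_info` entries printed in the error; `find_out_of_bounds_polygons` is a hypothetical helper, not part of the repository:

```python
def find_out_of_bounds_polygons(image_infos):
    """Return (image id, polygon index, max x, max y) for every polygon
    that has at least one vertex outside its image."""
    problems = []
    for info in image_infos:
        width, height = info["width"], info["height"]
        for i, p in enumerate(info["polygons"]):
            xs, ys = p["all_points_x"], p["all_points_y"]
            # Valid pixel indexes run from 0 to width - 1 and 0 to height - 1.
            if min(xs) < 0 or min(ys) < 0 or max(xs) >= width or max(ys) >= height:
                problems.append((info["id"], i, max(xs), max(ys)))
    return problems


# The record from the error log (only its first polygon is reproduced here).
sample = {
    "id": "00401.jpg", "width": 224, "height": 224,
    "polygons": [{
        "all_points_x": [5, 29, 43, 57, 63, 71, 71, 63, 61, 8, 6],
        "all_points_y": [246, 250, 250, 249, 243, 243, 246, 253, 255, 255, 253],
    }],
}
print(find_out_of_bounds_polygons([sample]))
```

In the dataset class this could be run over `self.image_info` once all images are added, before training starts.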

If I modify the `mask[rr, cc, i] = 1` line, will this affect my training accuracy? Hoping for answers. Thanks.
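For reference, one way to change that assignment is to drop the pixel indexes that fall outside the mask before writing. The sketch below uses only NumPy; `fill_polygon_clipped` is a hypothetical helper and the sample indexes are made up for illustration. It returns the fraction of pixels kept, because a low fraction suggests the polygon and the image do not share a coordinate system. In that case clipping only hides the error and the model would be trained on a misplaced mask, which can be expected to hurt accuracy.

```python
import numpy as np


def fill_polygon_clipped(mask, rr, cc, i):
    """Set mask[rr, cc, i] = 1 for in-bounds pixels only.

    Returns the fraction of polygon pixels that were inside the mask.
    """
    rr = np.asarray(rr)
    cc = np.asarray(cc)
    inside = (rr >= 0) & (rr < mask.shape[0]) & (cc >= 0) & (cc < mask.shape[1])
    mask[rr[inside], cc[inside], i] = 1
    return float(inside.mean()) if rr.size else 1.0


# Illustrative data: four pixel rows, two of them past the last row (223).
mask = np.zeros([224, 224, 1], dtype=np.uint8)
rr = np.array([222, 223, 224, 243])
cc = np.array([5, 5, 5, 5])
kept = fill_polygon_clipped(mask, rr, cc, 0)
print(kept, int(mask.sum()))
```

`skimage.draw.polygon` also takes an optional `shape` argument that clips the returned coordinates, so `skimage.draw.polygon(p['all_points_y'], p['all_points_x'], shape=mask.shape[:2])` should have the same effect inside `load_mask`.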

nataliameira commented 2 years ago

@MatchaCookies How big are your images? Images cannot be too large.

LiuxinYLX commented 2 months ago

> @MatchaCookies How big are your images? Images cannot be too large.

May I ask what the size limit for the images is, and how I can find it? I assumed that it should be smaller than 1616 x ____ .

Thank you very much in advance!

nataliameira commented 2 months ago

@LiuxinYLX Hello, I used Mask R-CNN in 2021. At the time, I used an image size of 1024x1024. I hope this information helps you!
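On the size question in general: as far as I recall, the repository's default configuration uses `IMAGE_RESIZE_MODE = "square"` with `IMAGE_MIN_DIM = 800` and `IMAGE_MAX_DIM = 1024`, so every image is rescaled and zero-padded to 1024 x 1024 after its mask has been loaded. Under that assumption, mixed input sizes would not by themselves cause the IndexError above, which is raised earlier, in `load_mask`. A rough sketch of the scale computation as I understand it; `square_resize_scale` is a hypothetical name, and the logic should be checked against `mrcnn/utils.py` and `mrcnn/config.py`:

```python
def square_resize_scale(height, width, min_dim=800, max_dim=1024):
    """Scale factor applied to an image before it is padded to a
    max_dim x max_dim square (assumed "square" resize mode)."""
    # Scale up so that the shorter side reaches min_dim; never scale down here.
    scale = max(1.0, min_dim / min(height, width))
    # If the longer side would then exceed max_dim, shrink to fit instead.
    longest = max(height, width)
    if round(longest * scale) > max_dim:
        scale = max_dim / longest
    return scale


# Illustrative sizes only; (height, width) pairs.
for h, w in [(224, 224), (402, 600), (1200, 1616)]:
    s = square_resize_scale(h, w)
    print((h, w), "->", (round(h * s), round(w * s)))
```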

LiuxinYLX commented 2 months ago

> @LiuxinYLX Hello, I used mask r-cnn in 2021. At the time, I used the 1024x1024 image size. I hope this information helps you!

OK, I get it. Thank you very much for responding so quickly and kindly <3