matterport / Mask_RCNN

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

No Idea How to Train Using a Custom COCO Format Dataset? #2608

Open Hana-Ali opened 3 years ago

Hana-Ali commented 3 years ago

First off, I'm really, really new to practical data science. I've taken a course in ML before, but this is my first time doing any DL, and I don't remember half of what I did in that course. In any case, I'm trying to train a model on the PKLot dataset, obtained from https://public.roboflow.com/object-detection/pklot. The problem is, I have no idea how to train the model on a custom dataset. I wrote this code, modified slightly from https://towardsdatascience.com/mask-rcnn-implementation-on-a-custom-dataset-fd9a878123d4, and uploaded it as a text file, trial1.txt:

trial1.txt

I also tried (emphasis on tried) to modify the actual coco.py file, thinking that would make it easier. It's uploaded as trial2.txt:

trial2.txt

Any help is appreciated, thank you! I don't even know what I'm looking at half the time, haha. Note that, obviously, neither of them worked :/
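For context, a COCO-format dataset is a single JSON file with three top-level arrays: `images`, `annotations`, and `categories`. A minimal sketch of that layout (all file names, IDs, and the category name here are made up, not taken from PKLot):

```python
import json

# Minimal sketch of the COCO annotation layout (values are made up).
coco = {
    "images": [
        {"id": 1, "file_name": "lot_001.jpg", "width": 640, "height": 480}
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,       # which image this annotation belongs to
            "category_id": 1,    # which class it is
            "bbox": [10, 20, 50, 80],  # [x, y, width, height]
            "segmentation": [[10, 20, 60, 20, 60, 100, 10, 100]],  # polygon(s)
            "area": 4000,
            "iscrowd": 0,
        }
    ],
    "categories": [{"id": 1, "name": "spot"}],
}

text = json.dumps(coco)  # this is exactly what train.json / val.json contain
```

Roboflow's COCO export follows this same structure, so `pycocotools.coco.COCO` can load it directly.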

nataliameira commented 3 years ago

Hi @Hana-Ali, I'm new too.

I managed to train Mask R-CNN. Now I'm adjusting the hyperparameters.

I only know a little. If you want to chat, send me an email: nataliaf.meira@gmail.com

Best regards.

Tubhalooter commented 2 years ago

I'm also not sure how to get my custom dataset into the program. Can anyone help?

nataliameira commented 2 years ago

Hello @Tubhalooter ,

I organized my directory like this:

(screenshot of the directory structure)

Inside each folder (train and val) you must add the images and annotation file.

I hope it helps! Good luck!
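Since the screenshot doesn't survive here: the layout being described is presumably something like the following (folder and file names are assumed, not from the screenshot):

```
dataset/
├── train/
│   ├── img_0001.jpg
│   ├── ...
│   └── annotations.json
└── val/
    ├── img_0101.jpg
    ├── ...
    └── annotations.json
```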

Tubhalooter commented 2 years ago

> Hello @Tubhalooter ,
>
> I organized my directory like this:
>
> (screenshot of the directory structure)
>
> Inside each folder (train and val) you must add the images and annotation file.
>
> I hope it helps! Good luck!

I have this folder setup, but I also have a COCO-format JSON file for each of the training and validation sets.

I'm trying to modify the data loader for my data but haven't managed to. Would it be possible to share how you loaded yours? This is what I've got so far:

```
    def load_Path(self, dataset_dir, subset):
        """Load a subset of the path dataset.
        dataset_dir: Root directory of the dataset.
        subset: Subset to load: train or val
        """
        # Add classes. We have only one class to add.
        self.add_class("path", 1, "path")

        # Train or validation dataset?
        assert subset in ["train", "val"]
        dataset_dir = os.path.join(dataset_dir, subset)

        # Load annotations
        # VGG Image Annotator (up to version 1.6) saves each image in the form:
        # { 'filename': '28503151_5b5b7ec140_b.jpg',
        #   'regions': {
        #       '0': {
        #           'region_attributes': {},
        #           'shape_attributes': {
        #               'all_points_x': [...],
        #               'all_points_y': [...],
        #               'name': 'polygon'}},
        #       ... more regions ...
        #   },
        #   'size': 100202
        # }
        # We mostly care about the x and y coordinates of each region
        # Note: In VIA 2.0, regions was changed from a dict to a list.
        if subset=="train":
          annotations = json.load(open("/content/drive/MyDrive/trainval/train.json"))# TODO:change these to variables holding dir paths
          annotations = list(annotations.values())  # don't need the dict keys
        elif subset=="val":
          annotations = json.load(open("/content/drive/MyDrive/trainval/val.json")) # TODO:change these to variables holding dir paths
          annotations = list(annotations.values())  # don't need the dict keys
        else:
          print("invalid subset")

        # The VIA tool saves images in the JSON even if they don't have any
        # annotations. Skip unannotated images.
        annotations = [a for a in annotations if a['regions']]

        # Add images
        for a in annotations:
            # Get the x, y coordinates of points of the polygons that make up
            # the outline of each object instance. These are stored in the
            # shape_attributes (see JSON format above).
            # The if condition is needed to support VIA versions 1.x and 2.x.
            if type(a['regions']) is dict:
                polygons = [r['shape_attributes'] for r in a['regions'].values()]
            else:
                polygons = [r['shape_attributes'] for r in a['regions']] 

            # load_mask() needs the image size to convert polygons to masks.
            # Unfortunately, VIA doesn't include it in the JSON, so we must read
            # the image. This is only manageable because the dataset is tiny.
            image_path = os.path.join(dataset_dir, a['filename'])
            image = skimage.io.imread(image_path)
            height, width = image.shape[:2]

            self.add_image(
                "path",
                image_id=a['filename'],  # use file name as a unique image id
                path=image_path,
                width=width, height=height,
                polygons=polygons)

    def load_mask(self, image_id):
        """Generate instance masks for an image.
       Returns:
        masks: A bool array of shape [height, width, instance count] with
            one mask per instance.
        class_ids: a 1D array of class IDs of the instance masks.
        """
        # If not a balloon dataset image, delegate to parent class.
        image_info = self.image_info[image_id]
        if image_info["source"] != "path":
            return super(self.__class__, self).load_mask(image_id)

        # Convert polygons to a bitmap mask of shape
        # [height, width, instance_count]
        info = self.image_info[image_id]
        mask = np.zeros([info["height"], info["width"], len(info["polygons"])],
                        dtype=np.uint8)
        for i, p in enumerate(info["polygons"]):
            # Get indexes of pixels inside the polygon and set them to 1
            rr, cc = skimage.draw.polygon(p['all_points_y'], p['all_points_x'])
            mask[rr, cc, i] = 1

        # Return mask, and array of class IDs of each instance. Since we have
        # one class ID only, we return an array of 1s.
        # (np.bool is removed in NumPy 1.24+; the builtin bool works everywhere.)
        return mask.astype(bool), np.ones([mask.shape[-1]], dtype=np.int32)

    def image_reference(self, image_id):
        """Return the path of the image."""
        info = self.image_info[image_id]
        if info["source"] == "path":
            return info["path"]
        else:
            return super(self.__class__, self).image_reference(image_id)
```
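The VIA 1.x-vs-2.x branch in `load_Path` can be checked in isolation; a small sketch of the same normalization (the two sample annotations below are made up):

```python
# VIA 1.x stores "regions" as a dict keyed by stringified index;
# VIA 2.0+ stores it as a plain list. Normalize both to a list of
# shape_attributes dicts, exactly as the if/else in load_Path does.
def region_shapes(annotation):
    regions = annotation["regions"]
    if isinstance(regions, dict):
        regions = regions.values()
    return [r["shape_attributes"] for r in regions]

via_1x = {"regions": {"0": {"shape_attributes": {"name": "polygon",
                                                 "all_points_x": [0, 4, 0],
                                                 "all_points_y": [0, 0, 4]}}}}
via_2x = {"regions": [{"shape_attributes": {"name": "polygon",
                                            "all_points_x": [0, 4, 0],
                                            "all_points_y": [0, 0, 4]}}]}
```

Both versions yield the same list of polygon dicts, so the rest of the loader doesn't need to care which VIA version produced the file.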

```
import json
# Training dataset.
dataset_train = PathDataset()
dataset_train.load_Path("/content/drive/MyDrive/trainval", "train")
dataset_train.prepare()

# Validation dataset
dataset_val = PathDataset()
dataset_val.load_Path("/content/drive/MyDrive/trainval", "val")
dataset_val.prepare()

# *** This training schedule is an example. Update to your needs ***
# Since we're using a very small dataset, and starting from
# COCO trained weights, we don't need to train too long. Also,
# no need to train all layers, just the heads should do it.
print("Training network heads")
pathmodel.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=10,
            layers='heads')
```

Tubhalooter commented 2 years ago

@nataliameira

OK, so I've now changed my code to fit the COCO format, and I think it's working except for one part, where it chooses to load all images or a subset of images based on class ID. I keep getting the error 'int' object is not iterable, and I've tried using a tuple, a list, and even setting it to False (this is for the class_ids parameter).
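That error comes from the `for id in class_ids` loop in `load_paths`: passing `class_ids=1` skips the `if not class_ids` branch (1 is truthy), so the code then tries to iterate over a bare int. A small guard at the top of the method would fix it; a sketch (`normalize_class_ids` is a made-up helper name):

```python
def normalize_class_ids(class_ids):
    """Accept None/False ("all classes"), a single int, or a sequence of ints."""
    if not class_ids:
        return None          # caller then falls back to sorted(coco.getCatIds())
    if isinstance(class_ids, int):
        return [class_ids]   # wrap a bare int so `for id in class_ids` works
    return list(class_ids)   # tuples etc. become plain lists
```

With this, `class_ids=1`, `class_ids=(1,)`, and `class_ids=[1]` all behave the same, and `False`/`None` both mean "load every class".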

here is my code

class PathDataset(utils.Dataset):
    def load_paths(self, dataset_dir, subset,class_ids=None,
                  class_map=None, return_coco=False,):
        """Load a subset of the COCO dataset.
        dataset_dir: The root directory of the COCO dataset.
        subset: What to load (train, val, minival, valminusminival)
        class_ids: If provided, only loads images that have the given classes.
        class_map: TODO: Not implemented yet. Supports mapping classes from
            different datasets to the same class ID.
        return_coco: If True, returns the COCO object.
        """

        pathdata = COCO((os.path.join(dataset_dir,"{}.json".format(subset))))
        if subset == "minival" or subset == "valminusminival":
            subset = "val"
        image_dir = os.path.join(dataset_dir,subset)

        # Load all classes or a subset?
        if not class_ids:
            # All classes
            class_ids = sorted(pathdata.getCatIds())
        # All images or a subset?
        if class_ids:
            image_ids = []
            for id in class_ids:
                image_ids.extend(list(pathdata.getImgIds(catIds=[id])))
            # Remove duplicates
            image_ids = list(set(image_ids))
        else:
            # All images
            image_ids = list(pathdata.imgs.keys())

        # Add classes
        for i in class_ids:
            self.add_class("path", i, pathdata.loadCats(i)[0]["name"])

        # Add images
        for i in image_ids:
            self.add_image(
                "path", image_id=i,
                path=os.path.join(image_dir, pathdata.imgs[i]['file_name']),
                width=pathdata.imgs[i]["width"],
                height=pathdata.imgs[i]["height"],
                annotations=pathdata.loadAnns(pathdata.getAnnIds(
                    imgIds=[i], catIds=class_ids, iscrowd=None)))
        if return_coco:
            return pathdata

    def load_mask(self, image_id):
        """Load instance masks for the given image.
        Different datasets use different ways to store masks. This
        function converts the different mask format to one format
        in the form of a bitmap [height, width, instances].
        Returns:
        masks: A bool array of shape [height, width, instance count] with
            one mask per instance.
        class_ids: a 1D array of class IDs of the instance masks.
        """
        # If not a path-dataset image, delegate to the parent class.
        # (add_class() above registers the source as "path", not "coco";
        # checking for "coco" here would always delegate and return empty masks.)
        image_info = self.image_info[image_id]
        if image_info["source"] != "path":
            return super(PathDataset, self).load_mask(image_id)

        instance_masks = []
        class_ids = []
        annotations = self.image_info[image_id]["annotations"]
        # Build mask of shape [height, width, instance_count] and list
        # of class IDs that correspond to each channel of the mask.
        for annotation in annotations:
            class_id = self.map_source_class_id(
                "path.{}".format(annotation['category_id']))
            if class_id:
                # NOTE: annToMask()/annToRLE() are defined on CocoDataset in
                # samples/coco/coco.py, not on utils.Dataset, so they need to
                # be copied into this class as well.
                m = self.annToMask(annotation, image_info["height"],
                                   image_info["width"])
                # Some objects are so small that they're less than 1 pixel area
                # and end up rounded out. Skip those objects.
                if m.max() < 1:
                    continue
                # Is it a crowd? If so, use a negative class ID.
                if annotation['iscrowd']:
                    # Use negative class ID for crowds
                    class_id *= -1
                    # For crowd masks, annToMask() sometimes returns a mask
                    # smaller than the given dimensions. If so, resize it.
                    if m.shape[0] != image_info["height"] or m.shape[1] != image_info["width"]:
                        m = np.ones([image_info["height"], image_info["width"]], dtype=bool)
                instance_masks.append(m)
                class_ids.append(class_id)

        # Pack instance masks into an array
        if class_ids:
            # np.bool is removed in NumPy 1.24+; the builtin bool works everywhere.
            mask = np.stack(instance_masks, axis=2).astype(bool)
            class_ids = np.array(class_ids, dtype=np.int32)
            return mask, class_ids
        else:
            # Call super class to return an empty mask
            return super(PathDataset, self).load_mask(image_id)

    def image_reference(self, image_id):
        """Return a link to the image on the COCO website."""
        info = self.image_info[image_id]
        if info["source"] == "path":  # matches the source name used in add_class()
            return "http://cocodataset.org/#explore?id={}".format(info["id"])
        else:
            return super(PathDataset, self).image_reference(image_id)
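One thing worth knowing about `map_source_class_id`: `utils.Dataset.prepare()` builds a dict keyed by strings like `"path.1"`, mapping each source class ID to the internal contiguous class ID. Roughly like this (a pure-Python sketch of the idea, not the library code; the class name "spot" is made up):

```python
# Rough sketch of how prepare() builds the source-to-internal class map.
class_info = [{"source": "", "id": 0, "name": "BG"},        # background is always class 0
              {"source": "path", "id": 1, "name": "spot"}]  # added via add_class("path", 1, ...)

class_from_source_map = {
    "{}.{}".format(info["source"], info["id"]): internal_id
    for internal_id, info in enumerate(class_info)
}

def map_source_class_id(source_class_id):
    # e.g. "path.1" -> internal class ID 1
    return class_from_source_map[source_class_id]
```

This is why `load_mask` formats the lookup key as `"path.{}".format(annotation['category_id'])`: the prefix has to match the source name passed to `add_class`.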

And to prepare the datasets and train the model:

# Training dataset.
dataset_train = PathDataset()
dataset_train.load_paths("/content/drive/MyDrive/trainval", "train",
                         class_ids=None, class_map=None, return_coco=False)  # None, not False: None means "all classes"
dataset_train.prepare()

# Validation dataset
dataset_val = PathDataset()
dataset_val.load_paths("/content/drive/MyDrive/trainval", "val",
                       class_ids=[1], class_map=None, return_coco=False)  # class_ids must be a list; a bare int is not iterable
dataset_val.prepare()

# *** This training schedule is an example. Update to your needs ***
# Since we're using a very small dataset, and starting from
# COCO trained weights, we don't need to train too long. Also,
# no need to train all layers, just the heads should do it.
print("Training network heads")
pathmodel.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE, 
            epochs=10,
            layers='heads')