matterport / Mask_RCNN

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

Only one class is predicted out of the seven + BG classes. Help required! #1970

Open wink94 opened 4 years ago

wink94 commented 4 years ago

I have a dataset of hand-drawn flowcharts. After training on this dataset, only one class is ever predicted. My initial training dataset consists of 40 images. This is my configuration class:

class FlowchartConfig(Config):

    NAME = "Flowchart_symbol"

    IMAGES_PER_GPU = 2

    NUM_CLASSES = 1 + 7  # Background + 7 flowchart symbol classes

    STEPS_PER_EPOCH = 100

    DETECTION_MIN_CONFIDENCE = 0.9

The split is 40 training images and 5 validation images.

truongtd6285 commented 4 years ago

Did you modify the balloon.py file? There you will find lines like these:

def load_balloon(self, dataset_dir, subset):
        """Load a subset of the Apple dataset.
        dataset_dir: Root directory of the dataset.
        subset: Subset to load: train or val
        """
        # Add classes
        self.add_class("balloon", 1, "balloon1")         
        self.add_class("balloon", 2, "balloon2")
        self.add_class("balloon", 3, "balloon3")
        self.add_class("balloon", 4, "balloon4")
        self.add_class("balloon", 5, "balloon5")         
        self.add_class("balloon", 6, "balloon6")
        self.add_class("balloon", 7, "balloon7")
wink94 commented 4 years ago

@truongtd6285 Thanks for the reply. Sorry for not mentioning it in my previous comment; I have already added them:

class FlowchartDataset(utils.Dataset):

    def load_flowchart(self, dataset_dir, subset):

        self.add_class("Flowchart_symbol", 1, "arrow")
        self.add_class("Flowchart_symbol", 2, "data")
        self.add_class("Flowchart_symbol", 3, "process")
        self.add_class("Flowchart_symbol", 4, "decision")
        self.add_class("Flowchart_symbol", 5, "connection")
        self.add_class("Flowchart_symbol", 6, "text")
        self.add_class("Flowchart_symbol", 7, "terminator")
truongtd6285 commented 4 years ago

Did you check whether your training data was loaded with all 7 labels, or with just one? load_flowchart() is called again inside the train() function, so you can check there to see which training data was actually loaded. And when you run inference, did you set NUM_CLASSES = 1 + 7, as in training?
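One way to run the check suggested above is to count the class ids that ended up in the dataset after loading. The following is a minimal sketch, assuming each entry of the dataset's image_info list is a dict that carries the class_ids passed to add_image(); the example_image_info list is a hypothetical stand-in for something like dataset_train.image_info, not real data from this thread.

```python
from collections import Counter


def count_class_ids(image_info):
    """Count how often each class id appears across all loaded images."""
    counts = Counter()
    for info in image_info:
        # "class_ids" is assumed to be the per-image list or array stored
        # through add_image(..., class_ids=...).
        counts.update(int(c) for c in info.get("class_ids", []))
    return counts


# Hypothetical stand-in for dataset_train.image_info.
example_image_info = [
    {"id": "a.jpg", "class_ids": [1, 3, 3, 6]},
    {"id": "b.jpg", "class_ids": [2, 4, 7]},
]

counts = count_class_ids(example_image_info)
# Class ids 1..7 that never occur in the loaded data.
missing = [c for c in range(1, 8) if counts[c] == 0]
print("instances per class id:", dict(counts))
print("class ids with no instances:", missing)
```

If the counts show a single class id, or if most ids are missing, the problem is likely in how labels are read from the annotation file rather than in the model configuration.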

wink94 commented 4 years ago

@truongtd6285 These are my training configurations. I have highlighted NUM_CLASSES = 8.

Configurations:
BACKBONE                       resnet101
BACKBONE_STRIDES               [4, 8, 16, 32, 64]
BATCH_SIZE                     2
BBOX_STD_DEV                   [0.1 0.1 0.2 0.2]
COMPUTE_BACKBONE_SHAPE         None
DETECTION_MAX_INSTANCES        100
DETECTION_MIN_CONFIDENCE       0.9
DETECTION_NMS_THRESHOLD        0.3
FPN_CLASSIF_FC_LAYERS_SIZE     1024
GPU_COUNT                      1
GRADIENT_CLIP_NORM             5.0
IMAGES_PER_GPU                 2
IMAGE_CHANNEL_COUNT            3
IMAGE_MAX_DIM                  1024
IMAGE_META_SIZE                20
IMAGE_MIN_DIM                  800
IMAGE_MIN_SCALE                0
IMAGE_RESIZE_MODE              square
IMAGE_SHAPE                    [1024 1024    3]
LEARNING_MOMENTUM              0.9
LEARNING_RATE                  0.001
LOSS_WEIGHTS                   {'rpn_class_loss': 1.0, 'rpn_bbox_loss': 1.0, 'mrcnn_class_loss': 1.0, 'mrcnn_bbox_loss': 1.0, 'mrcnn_mask_loss': 1.0}
MASK_POOL_SIZE                 14
MASK_SHAPE                     [28, 28]
MAX_GT_INSTANCES               100
MEAN_PIXEL                     [123.7 116.8 103.9]
MINI_MASK_SHAPE                (56, 56)
NAME                           Flowchart_symbol

**NUM_CLASSES                    8**

POOL_SIZE                      7
POST_NMS_ROIS_INFERENCE        1000
POST_NMS_ROIS_TRAINING         2000
PRE_NMS_LIMIT                  6000
ROI_POSITIVE_RATIO             0.33
RPN_ANCHOR_RATIOS              [0.5, 1, 2]
RPN_ANCHOR_SCALES              (32, 64, 128, 256, 512)
RPN_ANCHOR_STRIDE              1
RPN_BBOX_STD_DEV               [0.1 0.1 0.2 0.2]
RPN_NMS_THRESHOLD              0.7
RPN_TRAIN_ANCHORS_PER_IMAGE    256
STEPS_PER_EPOCH                100
TOP_DOWN_PYRAMID_SIZE          256
TRAIN_BN                       False
TRAIN_ROIS_PER_IMAGE           200
USE_MINI_MASK                  True
USE_RPN_ROIS                   True
VALIDATION_STEPS               50
WEIGHT_DECAY                   0.0001 

For inference:

config = InferenceConfig()
config.display()

NUM_CLASSES = 8 appears in the displayed inference configuration as well.

I still can't find the issue.

truongtd6285 commented 4 years ago

I guess something was wrong with your annotation files. How did you create those files? Do they have labels of all 7 classes?

wink94 commented 4 years ago

I created them with the VGG annotator (VIA). One thing I did differently was to use rectangles for the annotations, but I changed the code accordingly. They contain labels for all 7 classes.

BTW, my dataset consists of only 40 training images.

 for i, p in enumerate(info["polygons"]):

            if p['name'] == 'rect':
                p['all_points_y'], p['all_points_x'] = [p['y'], p['y'] + p['height'], p['y'], p['y'] + p['height']], [p['x'], p['x'] + p['width'], p['x'] + p['width'], p['x']]

            rr, cc = skimage.draw.polygon(p['all_points_y'], p['all_points_x'])
            mask[rr, cc, i] = 1
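Separately from the class-label question, one thing that may be worth checking in a rectangle-to-polygon conversion like the one above is the corner order. A polygon fill follows the vertices in sequence, so corners listed as top-left, bottom-right, top-right, bottom-left trace a self-intersecting "bowtie" rather than the rectangle outline. The sketch below uses two hypothetical helpers (rect_to_polygon and shoelace_area are illustrative names, not part of Mask_RCNN or skimage) to compare both orderings.

```python
def rect_to_polygon(x, y, width, height):
    """Return (all_points_y, all_points_x) tracing the rectangle outline in order."""
    # Order: top-left, top-right, bottom-right, bottom-left.
    all_points_x = [x, x + width, x + width, x]
    all_points_y = [y, y, y + height, y + height]
    return all_points_y, all_points_x


def shoelace_area(ys, xs):
    """Absolute value of the signed area enclosed by the vertices (shoelace formula)."""
    n = len(xs)
    twice_area = sum(
        xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i] for i in range(n)
    )
    return abs(twice_area) / 2


# Outline order: the area matches width * height.
ys, xs = rect_to_polygon(x=10, y=20, width=4, height=3)
print("outline order area:", shoelace_area(ys, xs))

# Diagonal order: the two lobes of the bowtie cancel in the signed area,
# which signals a self-intersecting vertex sequence.
bow_ys = [20, 23, 20, 23]
bow_xs = [10, 14, 14, 10]
print("diagonal order area:", shoelace_area(bow_ys, bow_xs))
```

A mismatch between the computed area and width * height is a cheap sanity check before the vertices are handed to skimage.draw.polygon.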
truongtd6285 commented 4 years ago

That piece of code belongs to the load_mask() function, not the load_flowchart() function. Did you change something in load_flowchart(), such as:

self.add_image(
                    "Flowchart_symbol",
                    image_id=a['filename'],  # use file name as a unique image id
                    path=image_path,
                    width=width, height=height,
                    polygons=polygons,
                    class_ids=classes_ids)

where classes_ids on the right-hand side is an array of the same length as the polygons array, and each of its elements is one of {1, 2, 3, 4, 5, 6, 7}. In balloon.py, the class ids are a numpy.ones() array.

wink94 commented 4 years ago

@truongtd6285 I annotated all the classes within each image, and I did the same for all the images. It would be helpful if you could elaborate.

This is what I did there:

if type(a['regions']) is dict:
                polygons = [r['shape_attributes'] for r in a['regions'].values()]
            else:
                polygons = [r['shape_attributes'] for r in a['regions']] 

            class_names_str_temp  = [r['region_attributes'] for r in a['regions'] if a['regions']]
            # print(class_names_str_temp)
            class_name_nums = []
            class_names_str=[i for i in  class_names_str_temp if bool(i) != False]

            for i in  class_names_str:
                if i['Flowchart_symbols'] == 'arrow':
                    class_name_nums.append(1)
                if i['Flowchart_symbols'] == 'data':
                    class_name_nums.append(2)
                if i['Flowchart_symbols'] == 'process':
                    class_name_nums.append(3)
                if i['Flowchart_symbols'] == 'decision':
                    class_name_nums.append(4)
                if i['Flowchart_symbols'] == 'connection':
                    class_name_nums.append(5)
                if i['Flowchart_symbols'] == 'text':
                    class_name_nums.append(6)
                if i['Flowchart_symbols'] == 'terminator':
                    class_name_nums.append(7)

            # load_mask() needs the image size to convert polygons to masks.
            # Unfortunately, VIA doesn't include it in JSON, so we must read
            # the image. This is only manageable since the dataset is tiny.
            image_path = os.path.join(dataset_dir, a['filename'])
            image = skimage.io.imread(image_path)
            height, width = image.shape[:2]

            self.add_image(
                "Flowchart_symbols",
                image_id=a['filename'],  # use file name as a unique image id
                path=image_path,
                width=width, height=height,
                polygons=polygons,
                class_ids = np.array(class_name_nums))
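In code like the above, the polygons and the class labels are collected in separate passes, and regions with empty region_attributes are dropped from the labels only. If that ever happens, the two lists end up with different lengths and the labels no longer line up with the masks. A minimal sketch of a single-pass alternative follows; regions_to_polygons_and_ids and CLASS_NAME_TO_ID are hypothetical names, the attribute key "Flowchart_symbols" is taken from the snippet above, and the sample regions are made up for illustration.

```python
CLASS_NAME_TO_ID = {
    "arrow": 1, "data": 2, "process": 3, "decision": 4,
    "connection": 5, "text": 6, "terminator": 7,
}


def regions_to_polygons_and_ids(regions, attribute_key="Flowchart_symbols"):
    """Return (polygons, class_ids) with exactly one class id per kept polygon."""
    # "regions" may be a dict or a list; handle both, as the snippet above does.
    region_list = list(regions.values()) if isinstance(regions, dict) else list(regions)
    polygons, class_ids = [], []
    for r in region_list:
        name = r.get("region_attributes", {}).get(attribute_key)
        if name not in CLASS_NAME_TO_ID:
            # Skip unlabeled or unknown regions entirely so both lists stay aligned.
            continue
        polygons.append(r["shape_attributes"])
        class_ids.append(CLASS_NAME_TO_ID[name])
    return polygons, class_ids


# Made-up regions: the second one has no label and is skipped in both lists.
sample_regions = [
    {"shape_attributes": {"name": "rect", "x": 1, "y": 2, "width": 3, "height": 4},
     "region_attributes": {"Flowchart_symbols": "arrow"}},
    {"shape_attributes": {"name": "rect", "x": 5, "y": 5, "width": 2, "height": 2},
     "region_attributes": {}},
    {"shape_attributes": {"name": "rect", "x": 9, "y": 1, "width": 6, "height": 2},
     "region_attributes": {"Flowchart_symbols": "text"}},
]

polygons, class_ids = regions_to_polygons_and_ids(sample_regions)
assert len(polygons) == len(class_ids)
print(class_ids)
```

It may also be worth checking that the source name is spelled the same everywhere: the classes earlier in the thread are registered under "Flowchart_symbol", while this add_image() call uses "Flowchart_symbols".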
truongtd6285 commented 4 years ago

Have you tried printing the value of class_name_nums, and what is its size? For the line class_names_str_temp = [r['region_attributes'] for r in a['regions'] if a['regions']], have you checked which values are stored in class_names_str_temp? As far as I know, the format of region_attributes should be: 'region_attributes': {name:'a'}
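To see what is actually stored in region_attributes, the annotation file can be summarized directly. This is a minimal sketch, assuming the VIA export is a JSON object keyed per image, where each value has a "regions" entry; summarize_region_attributes is a hypothetical helper, and the inline JSON is made-up sample data rather than the poster's file.

```python
import json
from collections import Counter


def summarize_region_attributes(annotations):
    """Count each distinct region_attributes value across all images."""
    counts = Counter()
    for a in annotations.values():
        regions = a.get("regions", [])
        # "regions" may be a dict or a list depending on how the file was exported.
        region_list = regions.values() if isinstance(regions, dict) else regions
        for r in region_list:
            # Serialize the dict so it can be used as a Counter key.
            key = json.dumps(r.get("region_attributes", {}), sort_keys=True)
            counts[key] += 1
    return counts


# Made-up sample; in practice this would come from json.load() on the VIA file.
sample_json = """
{
  "img1.jpg123": {"filename": "img1.jpg", "regions": [
    {"shape_attributes": {"name": "rect"}, "region_attributes": {"Flowchart_symbols": "arrow"}},
    {"shape_attributes": {"name": "rect"}, "region_attributes": {}}
  ]},
  "img2.jpg456": {"filename": "img2.jpg", "regions": [
    {"shape_attributes": {"name": "rect"}, "region_attributes": {"Flowchart_symbols": "arrow"}}
  ]}
}
"""

summary = summarize_region_attributes(json.loads(sample_json))
for attrs, n in summary.most_common():
    print(n, attrs)
```

Empty entries, unexpected keys, or misspelled class names in this summary would explain labels that never reach class_name_nums.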