open-mmlab / mmdetection

OpenMMLab Detection Toolbox and Benchmark
https://mmdetection.readthedocs.io
Apache License 2.0

Custom COCO dataset - testing results empty #3559

Closed · starkgate closed this issue 4 years ago

starkgate commented 4 years ago

I'm trying to train a network to detect rocks and obstacles in a rover's path (example image frame001045 attached).

I have a labeled dataset with bounding boxes around the obstacles. The dataset is split into train, test, and val folders, with three corresponding JSON files in COCO format (example below). I'm trying to train DetectoRS on it (python tools/train.py configs/detectors/cascade_rcnn_r50_sac_1x_coco.py), but I'm getting an error: The testing results of the whole dataset is empty. (complete log). I'm not sure whether the problem comes from the dataset or from something else. If someone could look over the formatting of the JSON and tell me if anything looks wrong, I'd appreciate it.

{
  "images": [
    {
      "file_name": "frame011053.png",
      "height": 480,
      "width": 640,
      "id": 7579
    }
...
  ],
  "annotations": [
    {
      "image_id": 5673,
      "category_id": 0,
      "id": 44,
      "bbox": [
        561,
        174,
        601,
        215
      ],
      "area": 129215,
      "iscrowd": 0,
      "segmentation": [
        561,
        174,
        1162,
        174,
        1162,
        389,
        561,
        389
      ]
    },
...
  ],
  "categories": [
    {
      "id": 0,
      "name": "rock"
    },
    {
      "id": 1,
      "name": "slope"
    },
    {
      "id": 2,
      "name": "rover"
    }
  ]
}
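A quick structural sanity check on a file like this is to load it with pycocotools and inspect the index it builds; a minimal sketch, assuming the data/msl layout used in the config below:

from pycocotools.coco import COCO

# COCO() parses the JSON and cross-indexes images, annotations and
# categories, so it is a cheap way to confirm the file is well formed.
coco = COCO('data/msl/annotations/train.json')
print(len(coco.getImgIds()), 'images,', len(coco.getAnnIds()), 'annotations')
print(coco.loadCats(coco.getCatIds()))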

The coco_detection.py file:

dataset_type = 'CocoDataset'
data_root = 'data/msl/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/train.json',
        img_prefix=data_root + 'train/',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/val.json',
        img_prefix=data_root + 'val/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/val.json',
        img_prefix=data_root + 'val/',
        pipeline=test_pipeline))
evaluation = dict(interval=5, metric='bbox')
xvjiarui commented 4 years ago

Hi @starkgate. First, set the log interval to something smaller, like 5, so you can see the log output. Second, you may want to start with a baseline model like Fast R-CNN; DetectoRS may not be stable to train.
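For reference, the log interval lives in the runtime part of the config; a minimal sketch mirroring the default_runtime base these configs inherit from:

# Print training logs every 5 iterations instead of the default 50.
log_config = dict(
    interval=5,
    hooks=[dict(type='TextLoggerHook')])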

starkgate commented 4 years ago

Thank you @xvjiarui. I trained with python tools/train.py configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py (Fast R-CNN requires "proposal" files that I don't have, so I used Faster R-CNN instead) and evaluation = dict(interval=5, metric='bbox'), then tested with python demo/image_demo.py data/rosbag/train/frame008310.png configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py work_dirs/faster_rcnn_r50_fpn_1x_coco/latest.pth. Same result as before.

I ran the training on two different datasets: one has 4.5k samples and 3 classes, the other 120 samples and 1 class. How important is the size of the dataset? Maybe that's the issue?

I've only modified coco_detection.py to account for my dataset (paths), but maybe I missed something in the config. Are there other settings I should modify?

I also tried changing num_classes to 1 in faster_rcnn_r50_fpn.py; no change.
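For reference, num_classes can also be overridden from the top-level config instead of editing the base model file; a minimal sketch for a 3-class dataset, following the nesting in faster_rcnn_r50_fpn.py:

# Only the overridden fields need to be listed; mmdetection merges
# nested dicts, so everything else is inherited from the base config.
model = dict(
    roi_head=dict(
        bbox_head=dict(num_classes=3)))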

starkgate commented 4 years ago

Alright. I tried a few other datasets too; all end up with the same results (i.e., none). How can I troubleshoot where things go wrong? Is it possible for mistakes in the config to go unnoticed by mmdetection?

xvjiarui commented 4 years ago

Hi @starkgate. Have you checked your annotation file? You could check it by modifying this file.
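One way to do such a check (a sketch, not the script linked above; paths assume the data/msl layout from earlier) is to draw the ground-truth boxes back onto an image and eyeball them:

import cv2
from pycocotools.coco import COCO

coco = COCO('data/msl/annotations/train.json')
img_info = coco.loadImgs(coco.getImgIds()[0])[0]
img = cv2.imread('data/msl/train/' + img_info['file_name'])
for ann in coco.loadAnns(coco.getAnnIds(imgIds=img_info['id'])):
    x, y, w, h = map(int, ann['bbox'])  # COCO bboxes are [x, y, width, height]
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite('check.png', img)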

starkgate commented 4 years ago

Just tried; the annotations look correct. Here is the code.

starkgate commented 4 years ago

And here is a full log of the last training run. I was away for the weekend, so I left it running for 7 hours, but weirdly the accuracy goes up to 100 within the first few batches of the first epoch.

An idea: I'm training on a single, local GPU. Is there anything I should set in the config for this particular case, either in the train.py arguments or in Faster R-CNN's config? I'm thinking of samples_per_gpu=2 and workers_per_gpu=2 in particular.

There's also this:

2020-08-21 08:39:38,038 - mmdet - WARNING - The model and loaded state dict do not match exactly
unexpected key in source state_dict: fc.weight, fc.bias
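(The warning itself is expected: fc.weight and fc.bias are the ImageNet classifier head of the pretrained ResNet backbone, which the detector does not use.) On the single-GPU question, the mmdetection docs note that the default schedules assume 8 GPUs with 2 images each, i.e. a total batch size of 16, so the linear scaling rule suggests dividing the learning rate accordingly; a minimal sketch:

# schedule_1x.py defaults to lr=0.02 for a total batch size of 16
# (8 GPUs x samples_per_gpu=2). With one GPU and samples_per_gpu=2,
# the batch size is 2, so scale linearly: 0.02 * 2 / 16 = 0.0025.
optimizer = dict(type='SGD', lr=0.0025, momentum=0.9, weight_decay=0.0001)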
daavoo commented 4 years ago

@starkgate You need to specify your custom classes in the configuration file, as described in the docs (https://mmdetection.readthedocs.io/en/latest/tutorials/new_dataset.html):

classes = ("rock", "slope", "rover")
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        classes=classes,
        ann_file=data_root + 'annotations/train.json',
        img_prefix=data_root + 'train/',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        classes=classes,
        ann_file=data_root + 'annotations/val.json',
        img_prefix=data_root + 'val/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        classes=classes,
        ann_file=data_root + 'annotations/val.json',
        img_prefix=data_root + 'val/',
        pipeline=test_pipeline))
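For context on why this matters (a gloss on CocoDataset's behaviour, assuming mmdetection v2): without classes, the dataset falls back to the 80 default COCO class names, none of which match rock/slope/rover, so every annotation is filtered out. Training then sees only background, which would also explain the accuracy jumping to 100 early on, and evaluation has nothing to score. The mismatch is easy to see with pycocotools:

from pycocotools.coco import COCO

coco = COCO('data/msl/annotations/train.json')
# The names from this dataset resolve to category ids...
print(coco.getCatIds(catNms=['rock', 'slope', 'rover']))  # -> [0, 1, 2]
# ...while default COCO names match nothing in this file.
print(coco.getCatIds(catNms=['person', 'bicycle', 'car']))  # -> []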
starkgate commented 4 years ago

Edit: I'm not sure what more I changed since my message below, but it works now. It might be the new tool I'm using to generate the JSON. Thank you for all the help.

@daavoo Thank you for the suggestion. I tried that; no change. I guess the dataset is too monotonous, but I'd have expected at least errors, not nothing at all. If you have other ideas, I'm all ears.

I'll post all the useful info again with all the changes I made. Here are the commands I'm running:

python tools/train.py configs/faster_rcnn/custom_faster_rcnn_r50_fpn_1x_coco.py --gpus 1
python demo/image_demo.py data/synth_rocks/train/render0002.png configs/faster_rcnn/custom_faster_rcnn_r50_fpn_1x_coco.py work_dirs/custom_faster_rcnn_r50_fpn_1x_coco/latest.pth
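The same check can also be run from Python with mmdet's high-level inference API; a sketch using the paths from the commands above:

from mmdet.apis import inference_detector, init_detector

config = 'configs/faster_rcnn/custom_faster_rcnn_r50_fpn_1x_coco.py'
checkpoint = 'work_dirs/custom_faster_rcnn_r50_fpn_1x_coco/latest.pth'
model = init_detector(config, checkpoint, device='cuda:0')
# For a pure detector the result is one (N, 5) array of
# [x1, y1, x2, y2, score] boxes per class.
result = inference_detector(model, 'data/synth_rocks/train/render0002.png')
print(result)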

Here's my custom_detection.py file:

dataset_type = 'CocoDataset'
data_root = 'data/synth_rocks/'
classes = ["large_rock"]
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        classes=classes,
        ann_file=data_root + 'annotations/train.json',
        img_prefix=data_root + 'train/',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        classes=classes,
        ann_file=data_root + 'annotations/val.json',
        img_prefix=data_root + 'val/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        classes=classes,
        ann_file=data_root + 'annotations/val.json',
        img_prefix=data_root + 'val/',
        pipeline=test_pipeline))
evaluation = dict(interval=1, metric='bbox')

And custom_faster_rcnn_r50_fpn_1x_coco.py:

_base_ = [
    '../_base_/models/faster_rcnn_r50_fpn.py',
    '../_base_/datasets/custom_detection.py',
    '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
]

Lastly, training log: https://pastebin.com/evB8HGQx

kendyChina commented 4 years ago

I also encountered this problem and solved it with 'classes=classes'.