open-mmlab / mmdetection

OpenMMLab Detection Toolbox and Benchmark
https://mmdetection.readthedocs.io
Apache License 2.0

cannot run faster rcnn on coco dataset #6108

Closed nyk2001 closed 3 years ago

nyk2001 commented 3 years ago

Hi,

I am trying to run Faster R-CNN on the coco128 dataset.

I have converted the dataset from YOLO to COCO format using the fiftyone library. The code is:

from mmdet.apis import set_random_seed
from mmcv import Config

classes = ( 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
         'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
         'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
         'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
         'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
         'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
         'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
         'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
         'hair drier', 'toothbrush')

# Modify dataset type and path
cfg = Config.fromfile('./configs/faster_rcnn/faster_rcnn_r50_caffe_fpn_mstrain_1x_coco.py')

cfg.dataset_type = 'CocoDataset'
cfg.data_root = 'data/coco128/'

cfg.data.test.type = 'CocoDataset'
cfg.data.test.ann_file = 'data/coco128/annotations/instances_val2017.json'

cfg.data.train.type = 'CocoDataset'
cfg.data.train.ann_file = 'data/coco128/annotations/instances_train2017.json'

cfg.data.val.type = 'CocoDataset'
cfg.data.val.ann_file = 'data/coco128/annotations/instances_val2017.json'

cfg.work_dir = './tutorial_exps'
cfg.classes = classes
# We divide it by 8 since we only use one GPU.
cfg.optimizer.lr = 0.02 / 8
cfg.lr_config.warmup = None
cfg.log_config.interval = 10

# Change the evaluation metric since we use customized dataset.
cfg.evaluation.metric = 'mAP'
# We can set the evaluation interval to reduce the evaluation times
cfg.evaluation.interval = 12
# We can set the checkpoint saving interval to reduce the storage cost
cfg.checkpoint_config.interval = 12

# Set seed thus the results are more reproducible
cfg.seed = 0
set_random_seed(0, deterministic=False)
cfg.gpu_ids = range(1)

When I use the above config, training starts and seems to work, but after the first epoch the losses become NaN. Can someone help me understand what's going on?

My directory structure is:

image

hhaAndroid commented 3 years ago

@nyk2001 I suggest you create a new config file directly instead of modifying the config in code, so that we can troubleshoot problems more easily.

nyk2001 commented 3 years ago

I am able to train the model now, but after evaluation I get mAP = 0.

image

My config file is below. Any help would be really appreciated.

image

AronLin commented 3 years ago

You can create a new_config.py that inherits from the previous configuration file, like faster_rcnn_r50_caffe_fpn_mstrain_1x_coco-person.py does.
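As a rough sketch of what such an inherited config might look like for the coco128 setup in the original post: the values below (learning rate, intervals, work directory, annotation paths) are copied from that snippet, while the image folder names `train2017/` and `val2017/` and the location of the file next to the base config are assumptions, since the directory layout was only posted as an image. Note that it uses `metric='bbox'`, which is the name `CocoDataset` accepts for box mAP in MMDetection 2.x, rather than the `'mAP'` used in the snippet.

```python
# new_config.py (hypothetical name), assumed to sit in configs/faster_rcnn/
# so that the relative _base_ path resolves.
_base_ = './faster_rcnn_r50_caffe_fpn_mstrain_1x_coco.py'

data_root = 'data/coco128/'

# Only the keys that differ from the base config are listed; the rest
# (dataset type, pipelines, model) are inherited.
# ASSUMPTION: images live in train2017/ and val2017/ under data_root.
data = dict(
    train=dict(
        ann_file=data_root + 'annotations/instances_train2017.json',
        img_prefix=data_root + 'train2017/'),
    val=dict(
        ann_file=data_root + 'annotations/instances_val2017.json',
        img_prefix=data_root + 'val2017/'),
    test=dict(
        ann_file=data_root + 'annotations/instances_val2017.json',
        img_prefix=data_root + 'val2017/'))

# The base learning rate of 0.02 is divided by 8 for a single GPU,
# as in the original post.
optimizer = dict(lr=0.02 / 8)
lr_config = dict(warmup=None)
log_config = dict(interval=10)

# 'bbox' is the COCO-style box metric for CocoDataset.
evaluation = dict(interval=12, metric='bbox')
checkpoint_config = dict(interval=12)

work_dir = './tutorial_exps'
```

Because the file only overrides a handful of keys, anyone reading the issue can see at a glance what differs from the stock config.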

nyk2001 commented 3 years ago

What's wrong with my code? I'm just trying to understand what's going on.

AronLin commented 3 years ago

> What's wrong with my code? I'm just trying to understand what's going on.

I still suggest you create a new config file. Otherwise it increases our workload in finding the problem, and it does not help us answer your question.

I found the following problems:

  1. You use instance_train2017.json as your test.ann_file.
  2. coco128 has only 128 images, so the training dataset is too small, and you have only trained for 6 epochs. It is normal to get mAP=0 under those conditions.
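Both points can be checked quickly before training. The following is a minimal sketch using only the Python standard library; the helper names `summarize_coco` and `total_iterations` are hypothetical and not part of MMDetection. It assumes the annotation file follows the usual COCO layout with top-level `images`, `annotations`, and `categories` lists.

```python
import json
import math


def summarize_coco(ann_path):
    """Return basic counts and category names from a COCO-format file."""
    with open(ann_path) as f:
        coco = json.load(f)
    return {
        'num_images': len(coco.get('images', [])),
        'num_annotations': len(coco.get('annotations', [])),
        'category_names': [c['name'] for c in coco.get('categories', [])],
    }


def total_iterations(num_images, batch_size, epochs):
    """Number of training iterations when the last partial batch is kept."""
    return math.ceil(num_images / batch_size) * epochs


if __name__ == '__main__':
    # ASSUMPTION: a batch size of 2 images on a single GPU.
    print(total_iterations(num_images=128, batch_size=2, epochs=6))  # 384
```

Running `summarize_coco` on the train and test annotation files and comparing the image counts and category names shows whether the two files are really different splits and whether the converted category names match the class list. Under the batch-size assumption above, 6 epochs on 128 images is only 384 iterations.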