open-mmlab / mmdetection

OpenMMLab Detection Toolbox and Benchmark
https://mmdetection.readthedocs.io
Apache License 2.0

How does the category information stay up to date during interface-based multi-task training? #12011

Closed 1wang11lijian1 closed 2 weeks ago

1wang11lijian1 commented 3 weeks ago

Hello developers, I have run into a problem and would really appreciate a solution. I train different detection tasks through the Python interface. The first task starts smoothly, but the second task always fails with: ValueError: need at least one array to concatenate. I looked into it myself and found two likely causes:

  1. The classes and palette in METAINFO in \mmdet\datasets\coco.py are not updated in time.
  2. coco_classes() in \mmdet\evaluation\functional\class_names.py does not return the updated class information.

So I would like to ask: how can the category information be kept up to date across multi-task training runs launched from the interface? What I tried before did not seem to work.

Here is what I tried in \mmdet\datasets\coco.py; the file Objectdataset_config.yaml is rewritten with the category and palette information every time the task changes:

import yaml

# Load the current task's classes and palette from the YAML file at the
# point where the CocoDataset class body is defined.
with open('./Configs/Objectdataset_config.yaml', 'r', encoding='utf-8') as f:
    Object_config = yaml.safe_load(f)
classes = tuple(Object_config['classes'])
palette = Object_config['palette']

METAINFO = {
    'classes': classes,
    'palette': palette,
}
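
One likely reason this patch has no effect across tasks: the class body of CocoDataset, including this with open(...) block, executes only once, when mmdet.datasets.coco is first imported, so a second task launched in the same Python process still sees the METAINFO captured for the first task. A minimal sketch of a runtime alternative, assuming classes and palette are loaded from the YAML as above:

from mmdet.datasets.coco import CocoDataset

# Reassign the class attribute at runtime, before each task's runner is
# built, instead of relying on import-time code in coco.py.
CocoDataset.METAINFO = {'classes': classes, 'palette': palette}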

And here is what I tried in \mmdet\evaluation\functional\class_names.py for coco_classes():

import yaml

def coco_classes() -> list:
    """Class names of COCO, read from the current task's YAML config."""
    with open('./Configs/Objectdataset_config.yaml', 'r', encoding='utf-8') as f:
        Object_config = yaml.safe_load(f)
    return list(Object_config['classes'])
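
Unlike the import-time edit in coco.py, this version reopens the YAML on every call, so code that resolves class names through it should see the current task's classes. A quick sanity check (a sketch, assuming mmdet's get_classes helper, which dispatches 'coco' to coco_classes()):

from mmdet.evaluation.functional import get_classes

# Should print the classes from the current Objectdataset_config.yaml.
print(get_classes('coco'))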
PeterVennerstrom commented 2 weeks ago

Can you describe the multi-task training in more detail? What tasks are you training?

Can you share your config where the dataset(s) are defined?

1wang11lijian1 commented 2 weeks ago

Hello, here it is. I have written a PyQt5 training interface for running different object detection training tasks. The training script is as follows:

# Train
from mmengine.config import Config
from mmengine.registry import RUNNERS
from mmengine.runner import Runner

# load config (param_config and param_save_dir come from the GUI)
cfg = Config.fromfile(param_config)
cfg.work_dir = param_save_dir

# build the runner from config
if 'runner_type' not in cfg:
    # build the default runner
    runner = Runner.from_cfg(cfg)
else:
    # build customized runner from the registry
    # if 'runner_type' is set in the cfg
    runner = RUNNERS.build(cfg)

# start training
runner.train()
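
As an aside, a hedged sketch of how the per-task classes could be injected into the loaded cfg at this point, instead of editing the mmdet sources; task_classes and task_palette are placeholder names, not part of the original script:

# Hypothetical per-task values, e.g. read from the GUI or a YAML file.
task_classes = ('Particle', )
task_palette = [(128, 64, 128)]

# Push the per-task metainfo into every dataset definition in the config
# before the runner (and hence the datasets) is built.
metainfo = dict(classes=task_classes, palette=task_palette)
cfg.train_dataloader.dataset.metainfo = metainfo
cfg.val_dataloader.dataset.metainfo = metainfo
cfg.test_dataloader.dataset.metainfo = metainfo
# Keep the detection head consistent with the per-task class count.
cfg.model.bbox_head.num_classes = len(task_classes)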

The config here varies with the dataset, i.e. for each different object detection task.

auto_scale_lr = dict(base_batch_size=64, enable=False)
backend_args = None
classes = ('Particle', )
data_preprocessor = dict(
    bgr_to_rgb=True,
    mean=[0, 0, 0],
    pad_size_divisor=32,
    std=[255.0, 255.0, 255.0],
    type='DetDataPreprocessor')
data_root = 'D:/训练数据/'
dataset_type = 'CocoDataset'
default_hooks = dict(
    checkpoint=dict(interval=10, type='CheckpointHook'),
    logger=dict(interval=10, type='LoggerHook'),
    param_scheduler=dict(type='ParamSchedulerHook'),
    sampler_seed=dict(type='DistSamplerSeedHook'),
    timer=dict(type='IterTimerHook'),
    visualization=dict(type='DetVisualizationHook'))
default_scope = 'mmdet'
env_cfg = dict(
    cudnn_benchmark=False,
    dist_cfg=dict(backend='nccl'),
    mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0))
images_suffix = '.bmp'
load_from = None
log_level = 'INFO'
log_processor = dict(by_epoch=True, type='LogProcessor', window_size=50)
model = dict(
    backbone=dict(
        depth=53,
        init_cfg=dict(checkpoint='open-mmlab://darknet53', type='Pretrained'),
        out_indices=(3, 4, 5),
        type='Darknet'),
    bbox_head=dict(
        anchor_generator=dict(
            base_sizes=[[(116, 90), (156, 198), (373, 326)],
                        [(30, 61), (62, 45), (59, 119)],
                        [(10, 13), (16, 30), (33, 23)]],
            strides=[32, 16, 8],
            type='YOLOAnchorGenerator'),
        bbox_coder=dict(type='YOLOBBoxCoder'),
        featmap_strides=[32, 16, 8],
        in_channels=[512, 256, 128],
        loss_cls=dict(
            loss_weight=1.0,
            reduction='sum',
            type='CrossEntropyLoss',
            use_sigmoid=True),
        loss_conf=dict(
            loss_weight=1.0,
            reduction='sum',
            type='CrossEntropyLoss',
            use_sigmoid=True),
        loss_wh=dict(loss_weight=2.0, reduction='sum', type='MSELoss'),
        loss_xy=dict(
            loss_weight=2.0,
            reduction='sum',
            type='CrossEntropyLoss',
            use_sigmoid=True),
        num_classes=1,
        out_channels=[1024, 512, 256],
        type='YOLOV3Head'),
    data_preprocessor=dict(
        bgr_to_rgb=True,
        mean=[0, 0, 0],
        pad_size_divisor=32,
        std=[255.0, 255.0, 255.0],
        type='DetDataPreprocessor'),
    neck=dict(
        in_channels=[1024, 512, 256],
        num_scales=3,
        out_channels=[512, 256, 128],
        type='YOLOV3Neck'),
    test_cfg=dict(
        conf_thr=0.005,
        max_per_img=100,
        min_bbox_size=0,
        nms=dict(iou_threshold=0.45, type='nms'),
        nms_pre=1000,
        score_thr=0.05),
    train_cfg=dict(
        assigner=dict(
            min_pos_iou=0,
            neg_iou_thr=0.5,
            pos_iou_thr=0.5,
            type='GridAssigner')),
    type='YOLOV3')
optim_wrapper = dict(
    clip_grad=dict(max_norm=35, norm_type=2),
    optimizer=dict(lr=0.0001, momentum=0.9, type='SGD', weight_decay=0.0005),
    type='OptimWrapper')
palette = [[128, 64, 128]]
param_scheduler = [
    dict(begin=0, by_epoch=False, end=2000, start_factor=0.1, type='LinearLR'),
    dict(by_epoch=True, gamma=0.1, milestones=[218, 246], type='MultiStepLR'),
]
resume = False
resume_from = None
test_cfg = dict(type='TestLoop')
test_dataloader = dict(
    batch_size=1,
    dataset=dict(
        ann_file='E:\\Work\\Projects_test\\Keyboard_Object/data/data_coco/coco/annotations/instances_val2017.json',
        backend_args=None,
        data_prefix=dict(
            img='E:\\Work\\Projects_test\\Keyboard_Object/data/data_coco/coco/images/'),
        data_root='data/coco/',
        pipeline=[
            dict(backend_args=None, type='LoadImageFromFile'),
            dict(keep_ratio=True, scale=(448, 448), type='Resize'),
            dict(type='LoadAnnotations', with_bbox=True),
            dict(
                meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                           'scale_factor'),
                type='PackDetInputs'),
        ],
        test_mode=True,
        type='CocoDataset'),
    drop_last=False,
    num_workers=2,
    persistent_workers=True,
    sampler=dict(shuffle=False, type='DefaultSampler'))
test_evaluator = dict(
    ann_file='E:\\Work\\Projects_test\\Keyboard_Object/data/data_coco/coco/annotations/instances_val2017.json',
    backend_args=None,
    metric='bbox',
    type='CocoMetric')
test_pipeline = [
    dict(backend_args=None, type='LoadImageFromFile'),
    dict(keep_ratio=True, scale=(448, 448), type='Resize'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(
        meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                   'scale_factor'),
        type='PackDetInputs'),
]
train_cfg = dict(max_epochs=273, type='EpochBasedTrainLoop', val_interval=7)
train_dataloader = dict(
    batch_sampler=dict(type='AspectRatioBatchSampler'),
    batch_size=2,
    dataset=dict(
        ann_file='E:\\Work\\Projects_test\\Keyboard_Object/data/data_coco/coco/annotations/instances_train2017.json',
        backend_args=None,
        data_prefix=dict(
            img='E:\\Work\\Projects_test\\Keyboard_Object/data/data_coco/coco/images/'),
        data_root='data/coco/',
        filter_cfg=dict(filter_empty_gt=True, min_size=32),
        pipeline=[
            dict(backend_args=None, type='LoadImageFromFile'),
            dict(type='LoadAnnotations', with_bbox=True),
            dict(mean=[0, 0, 0], ratio_range=(1, 2), to_rgb=True,
                 type='Expand'),
            dict(
                min_crop_size=0.3,
                min_ious=(0.4, 0.5, 0.6, 0.7, 0.8, 0.9),
                type='MinIoURandomCrop'),
            dict(keep_ratio=True, scale=[(448, 448), (448, 448)],
                 type='RandomResize'),
            dict(prob=0.5, type='RandomFlip'),
            dict(type='PhotoMetricDistortion'),
            dict(type='PackDetInputs'),
        ],
        type='CocoDataset'),
    num_workers=2,
    persistent_workers=True,
    sampler=dict(shuffle=True, type='DefaultSampler'))
train_pipeline = [
    dict(backend_args=None, type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(mean=[0, 0, 0], ratio_range=(1, 2), to_rgb=True, type='Expand'),
    dict(
        min_crop_size=0.3,
        min_ious=(0.4, 0.5, 0.6, 0.7, 0.8, 0.9),
        type='MinIoURandomCrop'),
    dict(keep_ratio=True, scale=[(448, 448), (448, 448)], type='RandomResize'),
    dict(prob=0.5, type='RandomFlip'),
    dict(type='PhotoMetricDistortion'),
    dict(type='PackDetInputs'),
]
val_cfg = dict(type='ValLoop')
val_dataloader = dict(
    batch_size=1,
    dataset=dict(
        ann_file='E:\\Work\\Projects_test\\Keyboard_Object/data/data_coco/coco/annotations/instances_val2017.json',
        backend_args=None,
        data_prefix=dict(
            img='E:\\Work\\Projects_test\\Keyboard_Object/data/data_coco/coco/images/'),
        data_root='data/coco/',
        pipeline=[
            dict(backend_args=None, type='LoadImageFromFile'),
            dict(keep_ratio=True, scale=(448, 448), type='Resize'),
            dict(type='LoadAnnotations', with_bbox=True),
            dict(
                meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                           'scale_factor'),
                type='PackDetInputs'),
        ],
        test_mode=True,
        type='CocoDataset'),
    drop_last=False,
    num_workers=2,
    persistent_workers=True,
    sampler=dict(shuffle=False, type='DefaultSampler'))
val_evaluator = dict(
    ann_file='E:\\Work\\Projects_test\\Keyboard_Object/data/data_coco/coco/annotations/instances_val2017.json',
    backend_args=None,
    metric='bbox',
    type='CocoMetric')
vis_backends = [dict(type='LocalVisBackend')]
visualizer = dict(
    name='visualizer',
    type='DetLocalVisualizer',
    vis_backends=[dict(type='LocalVisBackend')])

The first task starts without a problem, but the second task keeps failing with: ValueError: need at least one array to concatenate. So again: how can the category information be kept up to date during interface-driven multi-task training? What I tried before did not seem to work. Can you help me with this?
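
One detail worth noting in the dump above: classes and palette are defined as top-level variables but are never passed into the dataset dicts, so without the coco.py patch CocoDataset falls back to its default 80 COCO class names. When the annotation file's categories do not match those names, no annotations are loaded, and filter_empty_gt=True then filters out every image, which is a common cause of exactly this "need at least one array to concatenate" error. A sketch of the usual wiring in the config itself (other dataset fields as above):

metainfo = dict(classes=classes, palette=palette)
train_dataloader = dict(
    dataset=dict(type='CocoDataset', metainfo=metainfo))
val_dataloader = dict(
    dataset=dict(type='CocoDataset', metainfo=metainfo, test_mode=True))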

1wang11lijian1 commented 2 weeks ago

OK, it's solved: modify the main function in tools/train.py as follows.

from mmdet.datasets.coco import CocoDataset

# Override the class-level metainfo before the runner builds the datasets.
CocoDataset.METAINFO = {'classes': ('fire', ), 'palette': [(220, 20, 60)]}
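
For completeness, a minimal sketch of where this override sits relative to the training script shown earlier (the GUI variables param_config and param_save_dir are as in that comment). The key point is that CocoDataset.METAINFO must be set before the runner, and therefore the datasets, are built; cfg.model.bbox_head.num_classes must still match the class count of each task:

from mmdet.datasets.coco import CocoDataset
from mmengine.config import Config
from mmengine.runner import Runner

# Set the per-task classes/palette BEFORE building the runner, so the
# datasets are instantiated with the new METAINFO.
CocoDataset.METAINFO = {'classes': ('fire', ), 'palette': [(220, 20, 60)]}

cfg = Config.fromfile(param_config)
cfg.work_dir = param_save_dir
runner = Runner.from_cfg(cfg)
runner.train()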