Re-training MaskRCNN on custom dataset with multiple classes only trains with a single class, ignoring the others

Hi there,

I've previously used MMDet v2 and now have switched to v3 to retrain a MaskRCNN model on a custom dataset. Some background info about my dataset and the adaptations made to customize things:

The data consists of 4-channel NumPy images (RGB plus one additional channel), split into train (80%) and test (20%) sets. Because the dataset is small, I do not have a separate validation set (I hope to implement k-fold cross-validation in the future).
Annotations were converted to the COCO format required by MMDet. 11 classes are defined and polygons labelled wherever they occur in the images (so there are multiple annotations per image, often of different classes). Of course both train and test datasets have their own individual COCO annotation file.
train.py and test.py are adapted to match my requirements and some custom modules were created where necessary: data loader CustomDataPreprocessor and LoadNumpyImageFromFile (inherits from LoadImageFromFile), visualizer hook NumpyDetVisualizationHook (inherits from DetVisualizationHook).
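
To confirm that each annotation file really contains all 11 classes, the annotations per category can be counted independently of MMDet. This is a minimal sketch using only the standard library; the helper names `count_annotations_per_category` and `count_from_file` are hypothetical, and it assumes the usual COCO layout with top-level `categories` and `annotations` lists.

```python
import json
from collections import Counter


def count_annotations_per_category(coco):
    """Return {category name: number of annotations} for a COCO-style dict."""
    id_to_name = {cat["id"]: cat["name"] for cat in coco["categories"]}
    # Start every category at zero so that unused classes are visible too.
    counts = Counter({name: 0 for name in id_to_name.values()})
    for ann in coco["annotations"]:
        counts[id_to_name[ann["category_id"]]] += 1
    return dict(counts)


def count_from_file(path):
    """Load a COCO annotation file and count its annotations per category."""
    with open(path) as f:
        return count_annotations_per_category(json.load(f))
```

Calling `count_from_file(...)` on the train and the test annotation file should list all 11 category names with non-zero counts if the conversion to COCO format worked as intended.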

Procedure

I used the following commands to call train.py and test.py:

python train.py /.../configs/mmdet/mask_rcnn/mask-rcnn_r50_fpn_1x_coco.py \
--work-dir /.../model/outputs/

# the model from the last epoch is evaluated (here: epoch 60)
python test.py /.../configs/mmdet/mask_rcnn/mask-rcnn_r50_fpn_1x_coco.py /.../model/outputs/epoch_60.pth \
--work-dir /.../model/outputs/eval/ \
--out /.../model/outputs/eval/predictions_epoch-60.pickle \
--show-dir /.../model/outputs/eval/plots/

# second script call with classwise evaluation
python test.py /.../configs/mmdet/mask_rcnn/mask-rcnn_r50_fpn_1x_coco.py /.../model/outputs/epoch_60.pth \
--work-dir /.../model/outputs/eval_classwise/ \
--out /.../model/outputs/eval_classwise/predictions_epoch-60.pickle \
--show-dir /.../model/outputs/eval_classwise/plots/ \
--cfg-options test_evaluator.classwise=True

Expected results

Model trained to identify instances in images of all 11 classes.

The first test.py call should return overall metrics and plot all annotations and predicted instances side by side in the plots/ folder.
The second test.py call should show the individual metrics for each of the 11 classes and plot separate images for each class with ground truth annotations and predictions side by side.

Actual results

Everything runs without errors, but the test.py outputs show only a single class out of the 11 is being displayed in the plots and only the resulting metrics of that class are being shown.

Both test.py calls plot the same thing - side by side images showing only the ground truth "person" annotations and predicted "person" instances. Both overall and classwise metrics are the same. These are also the same as the outputs from the train.py script, which merely shows the metrics for the test dataset, as I don't have a validation dataset.

I can't figure out where things are going wrong - if there's an issue with training or just in the display of the results. The class that is shown ("person") is the 9th out of 11, but is the last class to occur in both train and test datasets going by order of images, so maybe the outputs are being overwritten so only the last one remains?
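
One possibility worth ruling out (an assumption, not something confirmed here): if the dataset ends up using a class list other than the 11 names in `metainfo`, for example the default COCO names, then only annotation categories whose names also appear in that list would be kept. The MMDet v3 documentation appears to spell the `metainfo` key in lowercase (`classes`), whereas the config in this post uses `CLASSES`, so this is worth checking against the installed version. The sketch below only illustrates the name matching; `matching_class_names` is a hypothetical helper and `COCO_SUBSET` is a small illustrative subset of the COCO class names, not the full list.

```python
def matching_class_names(annotation_categories, dataset_classes):
    """Return the annotation category names that also occur in dataset_classes."""
    wanted = set(dataset_classes)
    return [name for name in annotation_categories if name in wanted]


# The 11 class names from the config in this post.
CUSTOM_CLASSES = (
    'building', 'car (cold)', 'car (warm)', 'manhole (round) cold',
    'manhole (round) warm', 'manhole (square) cold',
    'manhole (square) warm', 'miscellaneous', 'person',
    'street lamp cold', 'street lamp warm')

# Illustrative subset of the default COCO class names (assumption: the
# dataset falls back to names like these when no custom classes are picked up).
COCO_SUBSET = ('person', 'bicycle', 'car', 'motorcycle', 'bus', 'truck')

print(matching_class_names(CUSTOM_CLASSES, CUSTOM_CLASSES))  # all 11 names
print(matching_class_names(CUSTOM_CLASSES, COCO_SUBSET))     # ['person']
```

Under that assumption, "person" would be the only category name shared between the two lists, which would match the symptom of a single class appearing in both the plots and the metrics.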

Thanks in advance for any help, ideas or assistance you can provide! I've added more details below.

Details

Below you'll find config excerpts from the log, which in this case is the same for train.py and test.py. For evaluation, I used the standard CocoMetric and, through the DetLocalVisualizer, DumpDetResults. Important, relevant changes are in bold.

2023/06/28 10:02:20 - mmengine - INFO -
------------------------------------------------------------
System environment:
    sys.platform: linux
    Python: 3.8.12 (default, Sep 16 2021, 10:46:05) [GCC 8.5.0 20210514 (Red Hat 8.5.0-3)]
    CUDA available: True
    numpy_random_seed: 42
    GPU 0: NVIDIA A100-PCIE-40GB
    CUDA_HOME: /.../cuda/11.8
    NVCC: Cuda compilation tools, release 11.8, V11.8.89
    GCC: gcc (GCC) 11.2.0
    PyTorch: 2.0.1+cu118
    PyTorch compiling details: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201703
  - Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.7.3 (Git Hash 6dbeffbae1f23cbbeae17adb7b5b13f1f37c080e)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.8
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90
  - CuDNN 8.7
  - Magma 2.6.1
  - Build settings: [...]

    TorchVision: 0.15.2+cu118
    OpenCV: 4.7.0
    MMEngine: 0.7.4

Runtime environment:
    cudnn_benchmark: False
    mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0}
    dist_cfg: {'backend': 'nccl'}
    seed: 42
    Distributed launcher: none
    Distributed training: False
    GPU number: 1
------------------------------------------------------------

2023/06/28 10:02:21 - mmengine - INFO - Config:
img_norm_cfg = dict(
    mean=[18.72, 19.515, 18.903, 130.248], std=[19.375, 21.03, 21.674, 43.674])
model = dict(
    type='MaskRCNN',
    data_preprocessor=dict(
        type='CustomDataPreprocessor',
        mean=[18.72, 19.515, 18.903, 130.248],
        std=[19.375, 21.03, 21.674, 43.674],
        pad_mask=True,
        pad_size_divisor=32),
    backbone=dict(
        type='ResNet',
        depth=50,
        in_channels=4,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=False,
        style='pytorch',
        init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        num_outs=5),
    rpn_head=dict(
        type='RPNHead',
        [...]),
    roi_head=dict(
        type='StandardRoIHead',
        [...]),
    train_cfg=dict([...]),
    test_cfg=dict([...])
)
data_root = '/.../model/data/'
train_img_prefix = 'train/'
val_img_prefix = 'test/'
dataset_type = 'CocoDataset'
metainfo = dict(
    CLASSES=('building', 'car (cold)', 'car (warm)', 'manhole (round) cold',
             'manhole (round) warm', 'manhole (square) cold',
             'manhole (square) warm', 'miscellaneous', 'person',
             'street lamp cold', 'street lamp warm'))
train_ann_file = 'train/annotations/thermal_annotations_coco.json'
val_ann_file = 'test/annotations/thermal_annotations_coco.json'
test_ann_file = 'test/annotations/thermal_annotations_coco.json'

train_pipeline = [
    dict(type='LoadNumpyImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
    dict(type='Resize', scale=(3750, 3000), keep_ratio=True),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PackDetInputs')
]
test_pipeline = [
    dict(type='LoadNumpyImageFromFile'),
    dict(type='Resize', scale=(3750, 3000), keep_ratio=True),
    dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
    dict(
        type='PackDetInputs',
        meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                   'scale_factor'))
]
train_dataloader = dict(
    batch_size=2,
    num_workers=2,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=True),
    batch_sampler=dict(type='AspectRatioBatchSampler'),
    dataset=dict(
        type='CocoDataset',
        data_root='/.../model/data/',
        ann_file='/.../model/data/train/annotations/thermal_annotations_coco.json',
        data_prefix=dict(img='train/'),
        filter_cfg=dict(filter_empty_gt=True, min_size=32),
        pipeline=[
            dict(type='LoadNumpyImageFromFile'),
            dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
            dict(type='Resize', scale=(3750, 3000), keep_ratio=True),
            dict(type='RandomFlip', prob=0.5),
            dict(type='PackDetInputs')
        ],
        metainfo=dict(
            CLASSES=('building', 'car (cold)', 'car (warm)',
                     'manhole (round) cold', 'manhole (round) warm',
                     'manhole (square) cold', 'manhole (square) warm',
                     'miscellaneous', 'person', 'street lamp cold',
                     'street lamp warm'))))
val_dataloader = dict(
    batch_size=2,
    num_workers=2,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type='CocoDataset',
        data_root='/.../model/data/',
        ann_file='/.../model/data/test/annotations/thermal_annotations_coco.json',
        data_prefix=dict(img='test/'),
        test_mode=True,
        pipeline=[
            dict(type='LoadNumpyImageFromFile'),
            dict(type='Resize', scale=(3750, 3000), keep_ratio=True),
            dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
            dict(
                type='PackDetInputs',
                meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                           'scale_factor'))
        ],
        metainfo=dict(
            CLASSES=('building', 'car (cold)', 'car (warm)',
                     'manhole (round) cold', 'manhole (round) warm',
                     'manhole (square) cold', 'manhole (square) warm',
                     'miscellaneous', 'person', 'street lamp cold',
                     'street lamp warm'))))
test_dataloader = dict(
    batch_size=2,
    num_workers=2,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type='CocoDataset',
        data_root='/.../model/data/',
        ann_file='/.../model/data/test/annotations/thermal_annotations_coco.json',
        data_prefix=dict(img='test/'),
        test_mode=True,
        pipeline=[
            dict(type='LoadNumpyImageFromFile'),
            dict(type='Resize', scale=(3750, 3000), keep_ratio=True),
            dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
            dict(
                type='PackDetInputs',
                meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                           'scale_factor'))
        ],
        metainfo=dict(
            CLASSES=('building', 'car (cold)', 'car (warm)',
                     'manhole (round) cold', 'manhole (round) warm',
                     'manhole (square) cold', 'manhole (square) warm',
                     'miscellaneous', 'person', 'street lamp cold',
                     'street lamp warm'))))

val_evaluator = dict(
    type='CocoMetric',
    ann_file='/.../model/data/test/annotations/thermal_annotations_coco.json',
    metric=['bbox', 'segm'],
    format_only=False)
test_evaluator = dict(
    type='CocoMetric',
    ann_file='/.../model/data/test/annotations/thermal_annotations_coco.json',
    metric=['bbox', 'segm'],
    format_only=False)

train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=60, val_interval=1)
val_cfg = dict(type='ValLoop')
test_cfg = dict(type='TestLoop')
param_scheduler = [
    dict(
        type='LinearLR', start_factor=0.001, by_epoch=False, begin=0, end=500),
    dict(
        type='MultiStepLR',
        begin=0,
        end=12,
        by_epoch=True,
        milestones=[8, 11],
        gamma=0.1)
]
optim_wrapper = dict(
    type='OptimWrapper',
    optimizer=dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001))
auto_scale_lr = dict(enable=False, base_batch_size=16)
custom_imports = dict(
    imports=['numpy_loader', 'data_preprocessor'], allow_failed_imports=False)
randomness = dict(seed=42)
default_scope = 'mmdet'
default_hooks = dict(
    timer=dict(type='IterTimerHook'),
    logger=dict(type='LoggerHook', interval=50),
    param_scheduler=dict(type='ParamSchedulerHook'),
    checkpoint=dict(type='CheckpointHook', interval=1),
    sampler_seed=dict(type='DistSamplerSeedHook'),
    visualization=dict(type='NumpyDetVisualizationHook'))
env_cfg = dict(
    cudnn_benchmark=False,
    mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0),
    dist_cfg=dict(backend='nccl'))
vis_backends = [dict(type='LocalVisBackend')]
visualizer = dict(
    type='DetLocalVisualizer',
    vis_backends=[dict(type='LocalVisBackend')],
    name='visualizer')
log_processor = dict(type='LogProcessor', window_size=50, by_epoch=True)
log_level = 'INFO'
[...]
launcher = 'none'
work_dir = '/.../model/outputs/'

- the results from `train.py` (displayed using the test dataset, as I don't have a validation one) were the following:

```
[...]
2023/06/28 10:35:17 - mmengine - INFO - Evaluating bbox...
2023/06/28 10:35:17 - mmengine - INFO - bbox_mAP_copypaste: 0.404 0.685 0.407 0.252 0.419 0.200
2023/06/28 10:35:17 - mmengine - INFO - Evaluating segm...
2023/06/28 10:35:17 - mmengine - INFO - segm_mAP_copypaste: 0.351 0.676 0.301 0.076 0.361 0.300
2023/06/28 10:35:17 - mmengine - INFO - Epoch(val) [60][80/80] coco/bbox_mAP: 0.4040 coco/bbox_mAP_50: 0.6850 coco/bbox_mAP_75: 0.4070 coco/bbox_mAP_s: 0.2520 coco/bbox_mAP_m: 0.4190 coco/bbox_mAP_l: 0.2000 coco/segm_mAP: 0.3510 coco/segm_mAP_50: 0.6760 coco/segm_mAP_75: 0.3010 coco/segm_mAP_s: 0.0760 coco/segm_mAP_m: 0.3610 coco/segm_mAP_l: 0.3000 data_time: 0.0538 time: 0.4626
```


- the first `test.py` call results in the following metrics:

2023/06/30 10:16:52 - mmengine - WARNING - The prefix is not set in metric class DumpDetResults.
2023/06/30 10:16:54 - mmengine - INFO - Load checkpoint from /.../model/outputs/epoch_60.pth
2023/06/30 10:20:37 - mmengine - INFO - Epoch(test) [ 50/159] eta: 0:07:51 time: 4.3254 data_time: 3.8272 memory: 3966
2023/06/30 10:24:13 - mmengine - INFO - Epoch(test) [100/159] eta: 0:04:16 time: 4.3728 data_time: 4.0667 memory: 3966
2023/06/30 10:27:51 - mmengine - INFO - Epoch(test) [150/159] eta: 0:00:39 time: 4.3717 data_time: 4.0541 memory: 3966
2023/06/30 10:28:27 - mmengine - INFO - Evaluating bbox...
2023/06/30 10:28:27 - mmengine - INFO - bbox_mAP_copypaste: 0.404 0.685 0.407 0.252 0.419 0.200
2023/06/30 10:28:27 - mmengine - INFO - Evaluating segm...
2023/06/30 10:28:27 - mmengine - INFO - segm_mAP_copypaste: 0.350 0.675 0.301 0.076 0.361 0.300
2023/06/30 10:28:27 - mmengine - INFO - Results has been saved to /.../model/outputs/eval/predictions_epoch-60.pickle.
2023/06/30 10:28:27 - mmengine - INFO - Epoch(test) [159/159] coco/bbox_mAP: 0.4040 coco/bbox_mAP_50: 0.6850 coco/bbox_mAP_75: 0.4070 coco/bbox_mAP_s: 0.2520 coco/bbox_mAP_m: 0.4190 coco/bbox_mAP_l: 0.2000 coco/segm_mAP: 0.3500 coco/segm_mAP_50: 0.6750 coco/segm_mAP_75: 0.3010 coco/segm_mAP_s: 0.0760 coco/segm_mAP_m: 0.3610 coco/segm_mAP_l: 0.3000 data_time: 3.9675 time: 4.3337


- the second `test.py` with the additional `classwise=True` in `test_evaluator` shows only a single class (person) with the same metric results as the overall ones.

2023/06/30 14:05:50 - mmengine - WARNING - The prefix is not set in metric class DumpDetResults.
2023/06/30 14:05:57 - mmengine - INFO - Load checkpoint from /.../model/outputs/epoch_60.pth
2023/06/30 14:09:53 - mmengine - INFO - Epoch(test) [ 50/159] eta: 0:08:19 time: 4.5861 data_time: 3.8918 memory: 3966
2023/06/30 14:13:29 - mmengine - INFO - Epoch(test) [100/159] eta: 0:04:24 time: 4.3830 data_time: 4.0796 memory: 3966
2023/06/30 14:17:09 - mmengine - INFO - Epoch(test) [150/159] eta: 0:00:40 time: 4.3968 data_time: 4.0065 memory: 3966
2023/06/30 14:17:44 - mmengine - INFO - Evaluating bbox...
2023/06/30 14:17:44 - mmengine - INFO -
+----------+-------+--------+--------+-------+-------+-------+
| category | mAP   | mAP_50 | mAP_75 | mAP_s | mAP_m | mAP_l |
+----------+-------+--------+--------+-------+-------+-------+
| person   | 0.404 | 0.685  | 0.407  | 0.252 | 0.419 | 0.2   |
+----------+-------+--------+--------+-------+-------+-------+
2023/06/30 14:17:44 - mmengine - INFO - bbox_mAP_copypaste: 0.404 0.685 0.407 0.252 0.419 0.200
2023/06/30 14:17:44 - mmengine - INFO - Evaluating segm...
2023/06/30 14:17:45 - mmengine - INFO -
+----------+------+--------+--------+-------+-------+-------+
| category | mAP  | mAP_50 | mAP_75 | mAP_s | mAP_m | mAP_l |
+----------+------+--------+--------+-------+-------+-------+
| person   | 0.35 | 0.675  | 0.301  | 0.076 | 0.361 | 0.3   |
+----------+------+--------+--------+-------+-------+-------+
2023/06/30 14:17:45 - mmengine - INFO - segm_mAP_copypaste: 0.350 0.675 0.301 0.076 0.361 0.300
2023/06/30 14:17:45 - mmengine - INFO - Results has been saved to /.../model/outputs/eval_classwise/predictions_epoch-60.pickle.
2023/06/30 14:17:45 - mmengine - INFO - Epoch(test) [159/159] coco/person_precision: 0.3500 coco/bbox_mAP: 0.4040 coco/bbox_mAP_50: 0.6850 coco/bbox_mAP_75: 0.4070 coco/bbox_mAP_s: 0.2520 coco/bbox_mAP_m: 0.4190 coco/bbox_mAP_l: 0.2000 coco/segm_mAP: 0.3500 coco/segm_mAP_50: 0.6750 coco/segm_mAP_75: 0.3010 coco/segm_mAP_s: 0.0760 coco/segm_mAP_m: 0.3610 coco/segm_mAP_l: 0.3000 data_time: 3.9770 time: 4.4266
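
The predictions file written via `--out` can also be inspected directly to see which class indices the model actually predicts, which would help separate a training problem from a display problem. This is a rough sketch under the assumption that the pickle holds a list of per-image dicts, each with a `pred_instances` entry containing a `labels` sequence (a tensor or a list); that layout should be verified on the real file. The helper names are hypothetical, and unpickling tensors requires PyTorch to be importable.

```python
import pickle
from collections import Counter


def count_predicted_labels(results):
    """Count how often each class index occurs across all predictions."""
    counts = Counter()
    for result in results:
        labels = result["pred_instances"]["labels"]
        # Tensors and arrays expose tolist(); plain lists are used as they are.
        if hasattr(labels, "tolist"):
            labels = labels.tolist()
        counts.update(int(label) for label in labels)
    return counts


def load_results(path):
    """Load the predictions pickle written by test.py via --out."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

If `count_predicted_labels(load_results(...))` only ever contains one index, the model itself predicts a single class; if several indices appear, the issue is more likely in the evaluation or visualization step.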

open-mmlab / mmdetection, issue #10578