open-mmlab / mmsegmentation

OpenMMLab Semantic Segmentation Toolbox and Benchmark.
https://mmsegmentation.readthedocs.io/en/main/
Apache License 2.0

Avoid MultiScaleFlipAug during validation and test #1213


rafaelbou commented 2 years ago

Hi all, two questions about MultiScaleFlipAug:

  1. Avoiding MultiScaleFlipAug during validation (while training):

I'm trying to validate my model (during training) without MultiScaleFlipAug, but I get the error below, and I can't find an example of validation that skips MultiScaleFlipAug:

    File "/mmsegmentation/mmseg/datasets/pipelines/compose.py", line 41, in __call__
        data = t(data)
    File "/mmsegmentation/mmseg/datasets/pipelines/formatting.py", line 281, in __call__
        img_meta[key] = results[key]
    KeyError: 'flip'

  2. Avoiding MultiScaleFlipAug during test:

Does test-time augmentation during testing need to be enabled explicitly in the testing script, or is it inherited from the model's config file?

Thanks.


My config file with MultiScaleFlipAug:

```python
_base_ = '/mmsegmentation/configs/hrnet/fcn_hr18_512x1024_160k_cityscapes.py'

# Convert dataset annotation to semantic segmentation map.
data_root = '/mmsegmentation/data'
img_dir = 'images'
ann_dir = 'labels'

# Since we use only one GPU, BN is used instead of SyncBN.
norm_cfg = dict(type='BN', requires_grad=True)
model = dict(
    type='fcn_hr18',
    decode_head=dict(
        # type='HRHead',
        num_classes=2
    )
)

# We can still use the pre-trained Mask RCNN model though we do not need to
# use the mask branch.
load_from = '/mmsegmentation/checkpoints/fcn_hr18_512x1024_160k_cityscapes_20200602_190822-221e4a4f.pth'

# Set up working dir to save files and logs.
work_dir = './work_dirs/benchmark_train'

data = dict(
    samples_per_gpu=2,  # Batch size of a single GPU
    workers_per_gpu=2,  # Workers to pre-fetch data for each single GPU
    train=dict(  # Train dataset config
        type='CityscapesDataset',  # Type of dataset, refer to mmseg/datasets/ for details.
        img_suffix='.png',
        seg_map_suffix='.png',
        data_root=data_root,  # The root of the dataset.
        img_dir=img_dir + '/train',  # The image directory of the dataset.
        ann_dir=ann_dir + '/train',  # The annotation directory of the dataset.
        pipeline=[  # This is passed as the train_pipeline created before.
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations'),
            # dict(type='Resize', img_scale=(1000, 300), ratio_range=(0.5, 2.0)),
            # dict(type='RandomCrop', crop_size=(512, 1024), cat_max_ratio=0.75),
            dict(type='RandomFlip', flip_ratio=0.5),
            # dict(type='PhotoMetricDistortion'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            # dict(type='Pad', size=(512, 1024), pad_val=0, seg_pad_val=255),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img', 'gt_semantic_seg'])
        ]),
    val=dict(  # Validation dataset config
        type='CityscapesDataset',
        img_suffix='.png',
        seg_map_suffix='.png',
        data_root=data_root,
        img_dir=img_dir + '/val',  # The image directory of the dataset.
        ann_dir=ann_dir + '/val',  # The annotation directory of the dataset.
        pipeline=[  # Pipeline is passed as the test_pipeline created before.
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1000, 300),
                flip=False,
                transforms=[
                    # dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]))
```

My config file without MultiScaleFlipAug (validation crashes):

```python
_base_ = '//mmsegmentation/configs/hrnet/fcn_hr18_512x1024_160k_cityscapes.py'

# Convert dataset annotation to semantic segmentation map.
data_root = './annotations'
img_dir = 'images'
ann_dir = 'labels'

# Since we use only one GPU, BN is used instead of SyncBN.
norm_cfg = dict(type='BN', requires_grad=True)
model = dict(
    type='fcn_hr18',
    decode_head=dict(
        # type='HRHead',
        num_classes=2
    )
)

# We can still use the pre-trained Mask RCNN model though we do not need to
# use the mask branch.
load_from = '/checkpoints/pretrained_nns/hrnetv2_w18-00eb2006.pth'

# Set up working dir to save files and logs.
work_dir = '/wo_MultiScaleFlipAug'

data = dict(
    samples_per_gpu=2,  # Batch size of a single GPU
    workers_per_gpu=2,  # Workers to pre-fetch data for each single GPU
    train=dict(  # Train dataset config
        type='CityscapesDataset',  # Type of dataset, refer to mmseg/datasets/ for details.
        img_suffix='.png',
        seg_map_suffix='.png',
        data_root=data_root,  # The root of the dataset.
        img_dir=img_dir + '/train',  # The image directory of the dataset.
        ann_dir=ann_dir + '/train',  # The annotation directory of the dataset.
        pipeline=[  # This is passed as the train_pipeline created before.
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations'),
            # dict(type='Resize', img_scale=(1000, 300), ratio_range=(0.5, 2.0)),
            # dict(type='RandomCrop', crop_size=(512, 1024), cat_max_ratio=0.75),
            dict(type='RandomFlip', flip_ratio=0.5),
            # dict(type='PhotoMetricDistortion'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            # dict(type='Pad', size=(512, 1024), pad_val=0, seg_pad_val=255),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img', 'gt_semantic_seg'])
        ]),
    val=dict(  # Validation dataset config
        type='CityscapesDataset',
        img_suffix='.png',
        seg_map_suffix='.png',
        data_root=data_root,
        img_dir=img_dir + '/val',  # The image directory of the dataset.
        ann_dir=ann_dir + '/val',  # The annotation directory of the dataset.
        pipeline=[  # Pipeline is passed as the test_pipeline created before.
            dict(type='LoadImageFromFile'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ]))
```
MengzhangLI commented 2 years ago

Hi, sorry for the late reply. In our default training and testing (when not using --aug-test), MultiScaleFlipAug does not actually apply any multi-scale or flip augmentation.

For the validation phase, you just need to follow the test pipeline. For example, see https://github.com/open-mmlab/mmsegmentation/blob/7512f05990eb66bba3653cb4d5f478965bf41bd7/configs/_base_/datasets/ade20k.py#L48: the val dataset follows the test_pipeline defined in the same config. You just need to set flip=False.
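
For instance, keeping MultiScaleFlipAug as the wrapper but with a single img_scale and flip=False gives plain single-scale evaluation; a sketch based on the config above (the img_scale value is taken from it):

```python
# Single-scale validation pipeline sketch: with one img_scale and flip=False,
# MultiScaleFlipAug applies no actual augmentation, but its inner RandomFlip
# still fills the 'flip'/'flip_direction' meta keys that Collect expects.
val_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1000, 300),  # a single scale: no multi-scale testing
        flip=False,             # no flip testing
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),  # no-op with flip=False, but sets meta keys
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
```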

For the test phase, if you do not use --aug-test, MultiScaleFlipAug would not be used. If it is used, some values are set here:

https://github.com/open-mmlab/mmsegmentation/blob/master/tools/test.py#L131-L135
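
Roughly, the linked lines do the following when --aug-test is passed (a sketch of the logic; the hard-coded index assumes the second step of the test pipeline is MultiScaleFlipAug):

```python
# Sketch of the linked tools/test.py logic (mmseg 0.x): with --aug-test, the
# test pipeline's MultiScaleFlipAug step is overridden to evaluate at six
# scale ratios with horizontal flipping enabled.
if args.aug_test:
    cfg.data.test.pipeline[1].img_ratios = [0.5, 0.75, 1.0, 1.25, 1.5, 1.75]
    cfg.data.test.pipeline[1].flip = True
```

So multi-scale flip testing is opt-in from the command line, e.g. `python tools/test.py <config> <checkpoint> --aug-test --eval mIoU`.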

MengzhangLI commented 2 years ago

Your problem is caused by the specific lines you uncommented and the modifications you added to your configs.
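
Concretely: Collect's default meta_keys includes 'flip' and 'flip_direction', which are normally filled in by RandomFlip (usually wrapped by MultiScaleFlipAug). Removing both from the val pipeline, as in the second config above, leaves those keys unset, hence KeyError: 'flip'. If you really want a pipeline with no flip transform at all, one workaround sketch is trimming meta_keys (illustrative only; keeping single-scale MultiScaleFlipAug as shown earlier is the simpler route):

```python
# Workaround sketch: drop 'flip'/'flip_direction' from Collect's meta_keys so
# a pipeline without any flip transform no longer raises KeyError: 'flip'.
# (The default meta_keys tuple also carries 'flip' and 'flip_direction'.)
dict(
    type='Collect',
    keys=['img'],
    meta_keys=('filename', 'ori_filename', 'ori_shape', 'img_shape',
               'pad_shape', 'scale_factor', 'img_norm_cfg'))
```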

bok3948 commented 2 years ago

Hi, I have a question: how do you evaluate when you don't use MultiScaleFlipAug (for ADE20K evaluation)?

  1. Resize to (512, 512) and normalize.
  2. Rescale to (512, 512), normalize, and pad to (512, 512).

Or are you evaluating it in a different way?