Closed pengzhao-life closed 2 years ago
Hi @pengzhao-life Could you provide your model config? If the model is not used for binary classification, the error occurs.
Hi, Thank you so much for the response! My settings are as below:
from mmseg.apis import set_random_seed
cfg.norm_cfg = dict(type='BN', requires_grad=True)
cfg.model.decode_head.norm_cfg = cfg.norm_cfg
cfg.model.auxiliary_head.norm_cfg = cfg.norm_cfg
cfg.model.decode_head.num_classes = 2
cfg.model.auxiliary_head.num_classes = 2
cfg.dataset_type = 'StanfordBackgroundDataset'
cfg.data_root = data_root
cfg.data.samples_per_gpu = 8
cfg.data.workers_per_gpu = 8
crop_size = (384, 384)  # (256,256)
image_scale = (512, 512)  # (320, 240)
cfg.img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
cfg.crop_size = crop_size
cfg.train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=image_scale, ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=cfg.crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **cfg.img_norm_cfg),
    dict(type='Pad', size=cfg.crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
]
cfg.test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=image_scale,
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **cfg.img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
cfg.data.train.type = cfg.dataset_type
cfg.data.train.data_root = cfg.data_root
cfg.data.train.img_dir = img_dir
cfg.data.train.ann_dir = ann_dir
cfg.data.train.pipeline = cfg.train_pipeline
cfg.data.train.split = 'splits/train.txt'
cfg.data.val.type = cfg.dataset_type
cfg.data.val.data_root = cfg.data_root
cfg.data.val.img_dir = img_dir
cfg.data.val.ann_dir = ann_dir
cfg.data.val.pipeline = cfg.test_pipeline
cfg.data.val.split = 'splits/val.txt'
cfg.data.test.type = cfg.dataset_type
cfg.data.test.data_root = cfg.data_root
cfg.data.test.img_dir = img_dir
cfg.data.test.ann_dir = ann_dir
cfg.data.test.pipeline = cfg.test_pipeline
cfg.data.test.split = 'splits/val.txt'
cfg.load_from = '../../../../checkpoints/upernet_swin_base_patch4_window12_512x512_160k_ade20k_pretrain_384x384_22K_20210531_125459-429057bf.pth'
cfg.work_dir = '/tmp/peng-tmp'
cfg.runner.max_iters = 10000
cfg.log_config.interval = 10
cfg.evaluation.interval = 200
cfg.checkpoint_config.interval = 200
cfg.seed = 0
set_random_seed(0, deterministic=False)
cfg.gpu_ids = range(1)
print(f'Config:\n{cfg.pretty_text}')
In 'upernet_swin.py', the loss function is modified as below:
norm_cfg = dict(type='SyncBN', requires_grad=True)
backbone_norm_cfg = dict(type='LN', requires_grad=True)
model = dict(
    type='EncoderDecoder',
    pretrained=None,
    backbone=dict(
        type='SwinTransformer',
        pretrain_img_size=224,
        embed_dims=96,
        patch_size=4,
        window_size=7,
        mlp_ratio=4,
        depths=[2, 2, 6, 2],
        num_heads=[3, 6, 12, 24],
        strides=(4, 2, 2, 2),
        out_indices=(0, 1, 2, 3),
        qkv_bias=True,
        qk_scale=None,
        patch_norm=True,
        drop_rate=0.,
        attn_drop_rate=0.,
        drop_path_rate=0.3,
        use_abs_pos_embed=False,
        act_cfg=dict(type='GELU'),
        norm_cfg=backbone_norm_cfg),
    decode_head=dict(
        type='UPerHead',
        in_channels=[96, 192, 384, 768],
        in_index=[0, 1, 2, 3],
        pool_scales=(1, 2, 3, 6),
        channels=512,
        dropout_ratio=0.1,
        num_classes=19,
        norm_cfg=norm_cfg,
        align_corners=False,
        loss_decode=dict(
            type='LovaszLoss', loss_type='binary', reduction='none',
            loss_weight=1.0, loss_name='loss_lovasz')),
    auxiliary_head=dict(
        type='FCNHead',
        in_channels=384,
        in_index=2,
        channels=256,
        num_convs=1,
        concat_input=False,
        dropout_ratio=0.1,
        num_classes=19,
        norm_cfg=norm_cfg,
        align_corners=False,
        loss_decode=dict(
            type='LovaszLoss', loss_type='binary', reduction='none',
            loss_weight=0.4, loss_name='loss_lovasz')),
    train_cfg=dict(),
    test_cfg=dict(mode='whole'))
Train and evaluation:
datasets = [build_dataset(cfg.data.train)]
model = build_segmentor(cfg.model)
model.CLASSES = datasets[0].CLASSES
mmcv.mkdir_or_exist(osp.abspath(cfg.work_dir))
train_segmentor(model, datasets, cfg, distributed=False, validate=True, meta=dict())
Thanks, Peng
From: 谢昕辰 @.> Sent: Monday, June 20, 2022 11:28 PM To: open-mmlab/mmsegmentation @.> Cc: Peng Zhao @.>; Mention @.> Subject: Re: [open-mmlab/mmsegmentation] I got same as reported in Lovasz loss #1036 (Issue #1681)
For binary segmentation, num_classes should be set to 1 rather than 2. See Section 4.1.1 of PRML for background on the two-class problem.
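To make the num_classes point concrete, here is a small sketch in plain Python (not mmsegmentation code; the function names and logit values are illustrative only). It shows why a single output channel is enough for two classes: a sigmoid over one logit gives the same foreground probability as a softmax over two logits whose difference is that logit.

```python
import math


def sigmoid(z):
    """Logistic function applied to a single logit."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical per-pixel logits for background and foreground.
bg_logit, fg_logit = 0.3, 1.7

# Two-channel formulation: probability of the foreground class.
p_two = softmax([bg_logit, fg_logit])[1]

# One-channel formulation: only the logit difference matters.
p_one = sigmoid(fg_logit - bg_logit)

print(p_two, p_one)
```

Both values agree up to floating-point error, which is the sense in which a binary loss expects one channel rather than two.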
You can try to modify your config, taking configs/_base_/models/fcn_unet_s5-d16.py as an example, like:
# model settings
norm_cfg = dict(type='SyncBN', requires_grad=True)
model = dict(
    type='EncoderDecoder',
    pretrained=None,
    backbone=dict(
        type='UNet',
        in_channels=3,
        base_channels=64,
        num_stages=5,
        strides=(1, 1, 1, 1, 1),
        enc_num_convs=(2, 2, 2, 2, 2),
        dec_num_convs=(2, 2, 2, 2),
        downsamples=(True, True, True, True),
        enc_dilations=(1, 1, 1, 1, 1),
        dec_dilations=(1, 1, 1, 1),
        with_cp=False,
        conv_cfg=None,
        norm_cfg=norm_cfg,
        act_cfg=dict(type='ReLU'),
        upsample_cfg=dict(type='InterpConv'),
        norm_eval=False),
    decode_head=dict(
        type='FCNHead',
        in_channels=64,
        in_index=4,
        channels=64,
        num_convs=1,
        concat_input=False,
        dropout_ratio=0.1,
        num_classes=1,
        norm_cfg=norm_cfg,
        align_corners=False,
        loss_decode=dict(
            type='LovaszLoss', loss_type='binary', reduction='none', loss_weight=1.0)),
    auxiliary_head=dict(
        type='FCNHead',
        in_channels=128,
        in_index=3,
        channels=64,
        num_convs=1,
        concat_input=False,
        dropout_ratio=0.1,
        num_classes=1,
        norm_cfg=norm_cfg,
        align_corners=False,
        loss_decode=dict(
            type='LovaszLoss', loss_type='binary', reduction='none', loss_weight=0.4)),
    # model training and testing settings
    train_cfg=dict(),
    test_cfg=dict(mode='slide', crop_size=256, stride=170))
Hi,
Thanks for the response! I changed num_classes=1 in configs/_base_/models/fcn_unet_s5-d16.py. I also had to change 'CLASSES' to a single element, as below; otherwise it throws a mismatch error. It runs, but the segmentation result always covers the whole image. My annotated image is attached: white pixels are the region of interest, and black is the background. Can you tell what went wrong? Thanks!
Peng
@DATASETS.register_module()
class StanfordBackgroundDataset(CustomDataset):
    CLASSES = ('b')  # some class name
    PALETTE = [[255, 0, 0]]  # some color

    def __init__(self, split, **kwargs):
        super().__init__(img_suffix='.png', seg_map_suffix='.png',
                         split=split, **kwargs)
        assert osp.exists(self.img_dir) and self.split is not None
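One detail worth checking in the class above: in Python, ('b') is a plain string, not a one-element tuple, because the parentheses alone do not create a tuple. A minimal sketch of the difference (the variable names are illustrative only):

```python
# Parentheses without a trailing comma are just grouping.
classes_as_written = ('b')

# A trailing comma is what makes a one-element tuple.
classes_as_tuple = ('b',)

print(type(classes_as_written).__name__)
print(type(classes_as_tuple).__name__)
print(len(classes_as_tuple))
```

With a one-character name the string and the tuple happen to have the same length, so this may not be the cause of the all-foreground output, but a longer class name written this way would be iterated character by character.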
From: MengzhangLI @.> Sent: Tuesday, June 21, 2022 1:37 AM To: open-mmlab/mmsegmentation @.> Cc: Peng Zhao @.>; Mention @.> Subject: Re: [open-mmlab/mmsegmentation] I got same as reported in Lovasz loss #1036 (Issue #1681)
Sorry for the late reply. Could you please try using dict(type='LoadAnnotations', reduce_zero_label=True) in train_pipeline? Looking forward to your reply.
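As a rough sketch of what reduce_zero_label=True is generally understood to do in mmsegmentation 0.x: label 0 is mapped to the ignore index 255 and every other label is shifted down by one. This is written in plain Python rather than copied from the library, and the helper name reduce_zero_label_sketch is hypothetical.

```python
IGNORE_INDEX = 255


def reduce_zero_label_sketch(labels):
    """Map label 0 to the ignore index and shift all other labels down by one.

    Pixels that already carry the ignore index stay ignored.
    """
    out = []
    for value in labels:
        if value == 0 or value == IGNORE_INDEX:
            out.append(IGNORE_INDEX)
        else:
            out.append(value - 1)
    return out


# A flattened toy annotation: background 0, two classes, one ignored pixel.
print(reduce_zero_label_sketch([0, 1, 2, 255]))
```

Under this mapping, a mask stored as 0 and 1 becomes 255 and 0, while a mask stored as 0 and 255 would end up entirely ignored, so it is worth checking how the annotation PNGs are encoded before enabling the option.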
Hi, I ran into the same problem as in the link below: when using Lovasz loss with loss_type='binary', the following error occurs: IndexError: The shape of the mask [....] at index 0 does not match the shape of the indexed tensor [....] at index 0. I noticed that #1036 has been closed. I wonder how it was handled. Thanks!
https://github.com/open-mmlab/mmsegmentation/issues/1036
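The reported IndexError is consistent with a mask built from the labels being applied to a logits tensor that has a different number of elements. Below is a minimal sketch of the element counts, assuming the binary path flattens both tensors and then indexes the logits with a labels != ignore_index mask (an assumption about the implementation, not something confirmed in this thread). The batch size and crop size are taken from the config posted above; numel is a hypothetical helper.

```python
def numel(shape):
    """Number of elements in a tensor of the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count


batch, height, width = 8, 384, 384
label_shape = (batch, height, width)  # one class index per pixel

for num_classes in (2, 1):
    logit_shape = (batch, num_classes, height, width)
    # A flattened boolean mask over the labels can only index flattened
    # logits when both have the same number of elements.
    shapes_match = numel(logit_shape) == numel(label_shape)
    print(num_classes, numel(logit_shape), numel(label_shape), shapes_match)
```

With num_classes=2 the flattened logits are twice as long as the flattened labels, whereas with num_classes=1 the counts agree.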