Closed yan-hao-tian closed 2 years ago
Hi @yan-hao-tian
Sorry for the late reply.
Could you share the evaluation field of your config and your training command?
In your 0.14.0 log, the eval_iter_num field is missing, so I want to check whether the evaluation code was executed.
The training command is CUDA_VISIBLE_DEVICES=0,1 ./tools/dist_train.sh configs/deeplabv3/deeplabv3_r50-d8_512x1024_40k_cityscapes.py 2
As for the evaluation settings, is the full config dump from the log OK?
norm_cfg = dict(type='SyncBN', requires_grad=True)
model = dict(
    type='EncoderDecoder',
    pretrained='open-mmlab://resnet50_v1c',
    backbone=dict(
        type='ResNetV1c',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        dilations=(1, 1, 2, 4),
        strides=(1, 2, 1, 1),
        norm_cfg=dict(type='SyncBN', requires_grad=True),
        norm_eval=False,
        style='pytorch',
        contract_dilation=True),
    decode_head=dict(
        type='ASPPHead',
        in_channels=2048,
        in_index=3,
        channels=512,
        dilations=(1, 12, 24, 36),
        dropout_ratio=0.1,
        num_classes=19,
        norm_cfg=dict(type='SyncBN', requires_grad=True),
        align_corners=False,
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    auxiliary_head=dict(
        type='FCNHead',
        in_channels=1024,
        in_index=2,
        channels=256,
        num_convs=1,
        concat_input=False,
        dropout_ratio=0.1,
        num_classes=19,
        norm_cfg=dict(type='SyncBN', requires_grad=True),
        align_corners=False,
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.4)),
    train_cfg=dict(),
    test_cfg=dict(mode='whole'))
dataset_type = 'CityscapesDataset'
data_root = 'data/cityscapes/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
crop_size = (512, 1024)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(2048, 1024), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=(512, 1024), cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='Pad', size=(512, 1024), pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(2048, 1024),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type='CityscapesDataset',
        data_root='data/cityscapes/',
        img_dir='leftImg8bit/train',
        ann_dir='gtFine/train',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations'),
            dict(
                type='Resize', img_scale=(2048, 1024), ratio_range=(0.5, 2.0)),
            dict(type='RandomCrop', crop_size=(512, 1024), cat_max_ratio=0.75),
            dict(type='RandomFlip', prob=0.5),
            dict(type='PhotoMetricDistortion'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size=(512, 1024), pad_val=0, seg_pad_val=255),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img', 'gt_semantic_seg'])
        ]),
    val=dict(
        type='CityscapesDataset',
        data_root='data/cityscapes/',
        img_dir='leftImg8bit/val',
        ann_dir='gtFine/val',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(2048, 1024),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]),
    test=dict(
        type='CityscapesDataset',
        data_root='data/cityscapes/',
        img_dir='leftImg8bit/val',
        ann_dir='gtFine/val',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(2048, 1024),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]))
log_config = dict(
    interval=50, hooks=[dict(type='TextLoggerHook', by_epoch=False)])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
cudnn_benchmark = True
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005)
optimizer_config = dict()
lr_config = dict(policy='poly', power=0.9, min_lr=0.0001, by_epoch=False)
runner = dict(type='IterBasedRunner', max_iters=40000)
checkpoint_config = dict(by_epoch=False, interval=4000)
evaluation = dict(interval=4000, metric='mIoU')
work_dir = './work_dirs/deeplabv3_r50-d8_512x1024_40k_cityscapes'
gpu_ids = range(0, 1)
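For anyone who wants to pull just the evaluation-related fields out of a dump like the one above, here is a minimal sketch. It assumes the fully expanded config has been saved to a standalone .py file (no `_base_` inheritance left to resolve), and it uses only the standard library rather than any MMSegmentation or MMCV API. The helper names `eval_related_settings` and `check_intervals` are hypothetical, and the count it reports assumes that evaluation runs once every `interval` iterations.

```python
import runpy

# Top-level config names that influence when evaluation is scheduled.
FIELDS = ("runner", "checkpoint_config", "evaluation", "workflow")


def eval_related_settings(config_path):
    """Execute an expanded config file and return its evaluation-related fields.

    Note: run_path executes the file, so only use it on configs you trust.
    """
    namespace = runpy.run_path(config_path)
    return {name: namespace.get(name) for name in FIELDS}


def check_intervals(settings):
    """Return short notes about the evaluation schedule (hypothetical helper)."""
    notes = []
    evaluation = settings.get("evaluation") or {}
    runner = settings.get("runner") or {}
    interval = evaluation.get("interval")
    max_iters = runner.get("max_iters")
    if interval is None:
        notes.append("no evaluation interval set")
    elif max_iters is not None:
        # Assumption: one evaluation every `interval` training iterations.
        count = max_iters // interval
        notes.append(f"expect {count} evaluations over {max_iters} iterations")
    return notes
```

With the dump above saved as, say, dumped_config.py (a placeholder name), `check_intervals(eval_related_settings("dumped_config.py"))` would report 10 expected evaluations, since the interval is 4000 and max_iters is 40000.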
Hi @yan-hao-tian, MMSegmentation 1.x has been released, and this problem has been solved.
We will close this issue, as there has been no activity for a while. We hope your issue has been resolved. If not, please feel free to open a new one.
I am running a customized model with this awesome repo. My problem is that I trained it on two machines and the performance difference is really big (5%). After a careful comparison, I believe the network architecture, the training settings, and the data are the same in both training runs; only the mmsegmentation version differs (one is 0.12.0+adca7b9, the other is 0.14.0+5d46314). However, I found a small difference in their logs. At each validation, the 0.12.0 log prints a line of the form Iter(val) [28000]
mIoU: 0.7254, mAcc: 0.8161, aAcc: 0.9539, IoU.road: 0.9795, IoU.sidewalk: 0.8321, IoU.building: 0.9145, IoU.wall: 0.4059, IoU.fence: 0.5777, IoU.pole: 0.6191, IoU.traffic light: 0.6997, IoU.traffic sign: 0.7788, IoU.vegetation: 0.9139, IoU.terrain: 0.5787, IoU.sky: 0.9350, IoU.person: 0.7334, IoU.rider: 0.4358, IoU.car: 0.9459, IoU.truck: 0.6914, IoU.bus: 0.8188, IoU.train: 0.5697, IoU.motorcycle: 0.5911, IoU.bicycle: 0.7610, Acc.road: 0.9872, Acc.sidewalk: 0.9331, Acc.building: 0.9582, Acc.wall: 0.4476, Acc.fence: 0.7119, Acc.pole: 0.7431, Acc.traffic light: 0.8470, Acc.traffic sign: 0.8604, Acc.vegetation: 0.9662, Acc.terrain: 0.6621, Acc.sky: 0.9660, Acc.person: 0.8298, Acc.rider: 0.8479, Acc.car: 0.9748, Acc.truck: 0.7268, Acc.bus: 0.8473, Acc.train: 0.5773, Acc.motorcycle: 0.7610, Acc.bicycle: 0.8574
In contrast, the 0.14.0 log shows Iter [500/80000] lr: 5.405e-03, eta: 16:21:04, time: 1.354, data_time: 0.016, memory: 23274, aAcc: 0.9440, mIoU: 0.6553, mAcc: 0.7293, IoU.road: 0.9750, IoU.sidewalk: 0.7975, IoU.building: 0.8978, IoU.wall: 0.3486, IoU.fence: 0.4156, IoU.pole: 0.4608, IoU.traffic light: 0.6247, IoU.traffic sign: 0.7361, IoU.vegetation: 0.8999, IoU.terrain: 0.5254, IoU.sky: 0.9263, IoU.person: 0.7639, IoU.rider: 0.5327, IoU.car: 0.9082, IoU.truck: 0.4573, IoU.bus: 0.5918, IoU.train: 0.3723, IoU.motorcycle: 0.4882, IoU.bicycle: 0.7293, Acc.road: 0.9862, Acc.sidewalk: 0.8986, Acc.building: 0.9649, Acc.wall: 0.3756, Acc.fence: 0.4411, Acc.pole: 0.5404, Acc.traffic light: 0.7365, Acc.traffic sign: 0.8005, Acc.vegetation: 0.9608, Acc.terrain: 0.5934, Acc.sky: 0.9735, Acc.person: 0.8987, Acc.rider: 0.6601, Acc.car: 0.9760, Acc.truck: 0.4810, Acc.bus: 0.6360, Acc.train: 0.3780, Acc.motorcycle: 0.7376, Acc.bicycle: 0.8183, decode.loss_seg: 0.1664, decode.acc_seg: 89.6364, aux.loss_seg: 0.1033, aux.acc_seg: 86.9771, loss: 0.2697
The line prefixes (bold in the original post) show the difference (each validation in 0.14.0 uses the No. 500 weights), and the same difference appears in the log.json files.
0.14.0: {"mode": "train", "epoch": 98, "iter": 500, "lr": 0.00135, "memory": 23289, "aAcc": 0.953, "mIoU": 0.7449, "mAcc": 0.8197, "IoU.road": 0.9779, "IoU.sidewalk": 0.8264, "IoU.building": 0.9115, "IoU.wall": 0.4454, "IoU.fence": 0.5855, "IoU.pole": 0.4794, "IoU.traffic light": 0.6638, "IoU.traffic sign": 0.7489, "IoU.vegetation": 0.9104, "IoU.terrain": 0.6164, "IoU.sky": 0.9332, "IoU.person": 0.7783, "IoU.rider": 0.5789, "IoU.car": 0.9402, "IoU.truck": 0.7853, "IoU.bus": 0.8431, "IoU.train": 0.7209, "IoU.motorcycle": 0.6521, "IoU.bicycle": 0.7561, "Acc.road": 0.9874, "Acc.sidewalk": 0.9095, "Acc.building": 0.9633, "Acc.wall": 0.4982, "Acc.fence": 0.6653, "Acc.pole": 0.5509, "Acc.traffic light": 0.7726, "Acc.traffic sign": 0.8229, "Acc.vegetation": 0.968, "Acc.terrain": 0.6844, "Acc.sky": 0.9658, "Acc.person": 0.9114, "Acc.rider": 0.7198, "Acc.car": 0.9772, "Acc.truck": 0.8697, "Acc.bus": 0.8978, "Acc.train": 0.7599, "Acc.motorcycle": 0.7674, "Acc.bicycle": 0.8831, "data_time": 0.6854, "decode.loss_seg": 0.12487, "decode.acc_seg": 89.96166, "aux.loss_seg": 0.07605, "aux.acc_seg": 88.0737, "loss": 0.20093, "time": 2.21966}
0.12.0: {"mode": "val", "epoch": 33, "iter": 12000, "lr": 5e-05, "mIoU": 0.4241, "mAcc": 0.524, "aAcc": 0.8841, "IoU.road": 0.9151, "IoU.sidewalk": 0.6107, "IoU.building": 0.784, "IoU.wall": 0.2785, "IoU.fence": 0.2254, "IoU.pole": 0.2819, "IoU.traffic light": 0.047, "IoU.traffic sign": 0.3331, "IoU.vegetation": 0.8538, "IoU.terrain": 0.4476, "IoU.sky": 0.8661, "IoU.person": 0.4775, "IoU.rider": 0.0895, "IoU.car": 0.8109, "IoU.truck": 0.0517, "IoU.bus": 0.3099, "IoU.train": 0.1933, "IoU.motorcycle": 0.03, "IoU.bicycle": 0.4519, "Acc.road": 0.934, "Acc.sidewalk": 0.8033, "Acc.building": 0.907, "Acc.wall": 0.3544, "Acc.fence": 0.2874, "Acc.pole": 0.3629, "Acc.traffic light": 0.0478, "Acc.traffic sign": 0.3821, "Acc.vegetation": 0.9434, "Acc.terrain": 0.5618, "Acc.sky": 0.9637, "Acc.person": 0.7612, "Acc.rider": 0.0988, "Acc.car": 0.9088, "Acc.truck": 0.0532, "Acc.bus": 0.4091, "Acc.train": 0.3907, "Acc.motorcycle": 0.0306, "Acc.bicycle": 0.7553}
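To check a whole log.json for this pattern rather than comparing single lines by eye, something like the sketch below may help. It assumes the file holds one JSON object per line, as in the two records above, and it treats any record containing mIoU, mAcc, or aAcc as an evaluation result. The function names `load_records` and `split_eval_records` are hypothetical and not part of MMSegmentation.

```python
import json
from collections import defaultdict

# Keys whose presence marks a record as carrying evaluation metrics.
EVAL_KEYS = ("mIoU", "mAcc", "aAcc")


def load_records(path):
    """Read a JSON-lines log, skipping blank lines and lines that are not JSON objects."""
    records = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue
            if isinstance(record, dict):
                records.append(record)
    return records


def split_eval_records(records):
    """Group the records that contain evaluation metrics by their 'mode' field."""
    by_mode = defaultdict(list)
    for record in records:
        if any(key in record for key in EVAL_KEYS):
            by_mode[record.get("mode", "unknown")].append(record)
    return dict(by_mode)
```

Calling `split_eval_records(load_records(path))` on each run's log.json then shows at a glance whether the metrics sit under "val" (as in the 0.12.0 record) or under "train" (as in the 0.14.0 record), together with the iter value each one was logged at.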
I can work around the problem by always using version 0.12.0, but I would still like to know how to validate correctly with 0.14.0. Or is there actually no problem with validation in 0.14.0? Or did I unfortunately just pick a broken version?
Thanks a lot.