Closed: whikwon closed this 5 years ago
You should use the argument --validate instead. Please refer to the README file for details.
For validation there are two options. One is to show the loss on validation set, another is to evaluate the mAP on validation set. In mmdetection we adopt the second one and use eval hooks to implement it, see here for details.
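If you want to adopt the first option and specify two phases in the workflow, you need two dataloaders, one per phase. This takes two minor modifications: pass both datasets to train_detector in tools/train.py, and build one dataloader per dataset in mmdet/apis/train.py.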
train_dataset = get_dataset(cfg.data.train)
val_dataset = get_dataset(cfg.data.val)
train_detector(
    model,
    [train_dataset, val_dataset],
    cfg,
    distributed=distributed,
    validate=args.validate,
    logger=logger)
data_loaders = [
    build_dataloader(
        ds,
        cfg.data.imgs_per_gpu,
        cfg.data.workers_per_gpu,
        dist=True)
    for ds in dataset
]
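To illustrate why each workflow phase needs its own dataloader, here is a minimal sketch of a runner loop. It is a toy stand-in loosely modeled on how an epoch-based runner walks through a workflow, not the actual runner code; run_workflow is a hypothetical name, and plain lists stand in for real dataloaders.

```python
def run_workflow(data_loaders, workflow, max_epochs):
    """Toy runner: pairs the i-th dataloader with the i-th workflow phase.

    Hypothetical helper for illustration only. Each "loader" is any
    iterable of batches; only 'train' phases count toward max_epochs.
    """
    # The same check that fails when only the train loader is supplied.
    assert len(data_loaders) == len(workflow)

    log = []
    epoch = 0
    while epoch < max_epochs:
        for loader, (mode, epochs) in zip(data_loaders, workflow):
            for _ in range(epochs):
                if mode == 'train' and epoch >= max_epochs:
                    return log
                n_batches = sum(1 for _ in loader)  # "process" every batch
                log.append((mode, n_batches))
                if mode == 'train':
                    epoch += 1
    return log


if __name__ == '__main__':
    train_loader = ['batch0', 'batch1', 'batch2']  # stand-in for a DataLoader
    val_loader = ['batch0']
    workflow = [('train', 1), ('val', 1)]
    print(run_workflow([train_loader, val_loader], workflow, max_epochs=2))
```

With a single loader and a two-phase workflow the assertion at the top fails, which is why the dataset list passed to train_detector has to grow along with the workflow.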
@hellock I tried the first option and got an error:
  File "tools/train.py", line 82, in <module>
    main()
  File "tools/train.py", line 78, in main
    logger=logger)
  File "/home/whikwon/anaconda3/envs/biorobot/lib/python3.6/site-packages/mmdet-0.5.4+c95c637-py3.6.egg/mmdet/apis/train.py", line 59, in train_detector
    _non_dist_train(model, dataset, cfg, validate=validate)
  File "/home/whikwon/anaconda3/envs/biorobot/lib/python3.6/site-packages/mmdet-0.5.4+c95c637-py3.6.egg/mmdet/apis/train.py", line 107, in _non_dist_train
    dist=False)
  File "/home/whikwon/anaconda3/envs/biorobot/lib/python3.6/site-packages/mmdet-0.5.4+c95c637-py3.6.egg/mmdet/datasets/loader/build_loader.py", line 31, in build_dataloader
    sampler = GroupSampler(dataset, imgs_per_gpu)
  File ...
I updated the code snippets above.
I followed your instructions and got an error:
train_dataset = get_dataset(cfg.data.train)
val_dataset = get_dataset(cfg.data.val)
train_detector(
    model,
    [train_dataset, val_dataset],
    cfg,
    distributed=distributed,
    validate=args.validate,
    logger=logger)
workflow = [('train', 1), ('val', 1)]
data_loaders = [
    build_dataloader(
        ds,
        cfg.data.imgs_per_gpu,
        cfg.data.workers_per_gpu,
        dist=True)
    for ds in dataset
]
Error:
Traceback (most recent call last):
  File "tools/train.py", line 81, in <module>
    main()
  File "tools/train.py", line 77, in main
    logger=logger)
  File "/home/whikwon/anaconda3/envs/biorobot/lib/python3.6/site-packages/mmdet-0.5.4+c95c637-py3.6.egg/mmdet/apis/train.py", line 59, in train_detector
    _non_dist_train(model, dataset, cfg, validate=validate)
  File "/home/whikwon/anaconda3/envs/biorobot/lib/python3.6/site-packages/mmdet-0.5.4+c95c637-py3.6.egg/mmdet/apis/train.py", line 109, in _non_dist_train
    for ds in dataset
  File "/home/whikwon/anaconda3/envs/biorobot/lib/python3.6/site-packages/mmdet-0.5.4+c95c637-py3.6.egg/mmdet/apis/train.py", line 109, in <listcomp>
    for ds in dataset
  File "/home/whikwon/anaconda3/envs/biorobot/lib/python3.6/site-packages/mmdet-0.5.4+c95c637-py3.6.egg/mmdet/datasets/loader/build_loader.py", line 31, in build_dataloader
    sampler = GroupSampler(dataset, imgs_per_gpu)
  File "/home/whikwon/anaconda3/envs/biorobot/lib/python3.6/site-packages/mmdet-0.5.4+c95c637-py3.6.egg/mmdet/datasets/loader/sampler.py", line 14, in __init__
    assert hasattr(dataset, 'flag')
AssertionError
I ran training with the train dataset and with the val dataset separately, and both worked, so the val dataset itself is not the problem.
How about your config file? Are you setting test_mode=True for the val dataset? It needs to be False, just like the train dataset.
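The link between test_mode and the failing assert hasattr(dataset, 'flag') can be sketched as follows. This is an illustration under an assumption: that the dataset only builds its flag attribute (used for grouping images by aspect ratio) when test_mode is False, which is what the advice above implies. ToyDataset and can_use_group_sampler are hypothetical names, not mmdet classes or functions.

```python
class ToyDataset:
    """Hypothetical stand-in for a detection dataset (not mmdet's class)."""

    def __init__(self, img_shapes, test_mode=False):
        self.img_shapes = img_shapes  # list of (width, height) pairs
        self.test_mode = test_mode
        if not self.test_mode:
            # Assumed behavior: the group flag exists only in train mode.
            # 1 marks landscape images (width > height), 0 everything else.
            self.flag = [1 if w / h > 1 else 0 for w, h in img_shapes]


def can_use_group_sampler(dataset):
    """Mirror of the check in the traceback: the sampler needs `flag`."""
    return hasattr(dataset, 'flag')


if __name__ == '__main__':
    shapes = [(640, 480), (480, 640)]
    print(can_use_group_sampler(ToyDataset(shapes, test_mode=False)))
    print(can_use_group_sampler(ToyDataset(shapes, test_mode=True)))
```

Under this assumption, a val dataset built with test_mode=True never gets a flag attribute, so a sampler that groups by flag rejects it.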
My config file.
# model settings
model = dict(
    type='FasterRCNN',
    pretrained='modelzoo://resnet50',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        style='pytorch'),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        num_outs=5),
    rpn_head=dict(
        type='RPNHead',
        in_channels=256,
        feat_channels=256,
        anchor_scales=[8],
        anchor_ratios=[0.5, 1.0, 2.0],
        anchor_strides=[4, 8, 16, 32, 64],
        target_means=[.0, .0, .0, .0],
        target_stds=[1.0, 1.0, 1.0, 1.0],
        use_sigmoid_cls=True),
    bbox_roi_extractor=dict(
        type='SingleRoIExtractor',
        roi_layer=dict(type='RoIAlign', out_size=7, sample_num=2),
        out_channels=256,
        featmap_strides=[4, 8, 16, 32]),
    bbox_head=dict(
        type='SharedFCBBoxHead',
        num_fcs=2,
        in_channels=256,
        fc_out_channels=1024,
        roi_feat_size=7,
        num_classes=81,
        target_means=[0., 0., 0., 0.],
        target_stds=[0.1, 0.1, 0.2, 0.2],
        reg_class_agnostic=False))
# model training and testing settings
train_cfg = dict(
    rpn=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.7,
            neg_iou_thr=0.3,
            min_pos_iou=0.3,
            ignore_iof_thr=-1),
        sampler=dict(
            type='RandomSampler',
            num=256,
            pos_fraction=0.5,
            neg_pos_ub=-1,
            add_gt_as_proposals=False),
        allowed_border=0,
        pos_weight=-1,
        smoothl1_beta=1 / 9.0,
        debug=False),
    rcnn=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.5,
            neg_iou_thr=0.5,
            min_pos_iou=0.5,
            ignore_iof_thr=-1),
        sampler=dict(
            type='RandomSampler',
            num=512,
            pos_fraction=0.25,
            neg_pos_ub=-1,
            add_gt_as_proposals=True),
        pos_weight=-1,
        debug=False))
test_cfg = dict(
    rpn=dict(
        nms_across_levels=False,
        nms_pre=2000,
        nms_post=2000,
        max_num=2000,
        nms_thr=0.7,
        min_bbox_size=0),
    rcnn=dict(
        score_thr=0.05, nms=dict(type='nms', iou_thr=0.5), max_per_img=100)
    # soft-nms is also supported for rcnn testing
    # e.g., nms=dict(type='soft_nms', iou_thr=0.5, min_score=0.05)
)
# dataset settings
dataset_type = 'VOCDataset'
data_root = 'data/VOCdevkit/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
data = dict(
    imgs_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        ann_file=[
            data_root + 'VOC2012/ImageSets/Main/train.txt',
        ],
        img_prefix=[data_root + 'VOC2012/'],
        img_scale=(1000, 600),
        img_norm_cfg=img_norm_cfg,
        size_divisor=32,
        flip_ratio=0.5,
        with_mask=False,
        with_crowd=True,
        with_label=True),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'VOC2012/ImageSets/Main/val.txt',
        img_prefix=data_root + 'VOC2012/',
        img_scale=(1000, 600),
        img_norm_cfg=img_norm_cfg,
        size_divisor=32,
        flip_ratio=0,
        with_mask=False,
        with_crowd=True,
        with_label=True),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'VOC2007/ImageSets/Main/test.txt',
        img_prefix=data_root + 'VOC2007/',
        img_scale=(1000, 600),
        img_norm_cfg=img_norm_cfg,
        size_divisor=32,
        flip_ratio=0,
        with_mask=False,
        with_label=False,
        test_mode=False))
# optimizer
optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=1.0 / 3,
    step=[8, 11])
checkpoint_config = dict(interval=1)
# yapf:disable
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])
# yapf:enable
# runtime settings
total_epochs = 2
dist_params = dict(backend='nccl')
log_level = 'INFO'
work_dir = './work_dirs/faster_rcnn_r50_fpn_1x'
load_from = None
resume_from = None
workflow = [('train', 1), ('val', 1)]
The val dataset has no test_mode key.
How about your config file? Are you setting test_mode=True for the val dataset? It needs to be False, just like the train dataset.
Hi!
I'm using the config file for retinanet_r50_fpn_1x. Isn't test_mode automatically set to True on line 25 of mmdet/core/evaluation/eval_hooks.py for validation? Also, if I set test_mode=False, it throws an error because the image is passed as a tensor, whereas a list of images is expected:
TypeError: imgs must be a list, but got <class 'torch.Tensor'>
I'm not sure what difference it makes whether or not I explicitly state test_mode=True in the validation section of my config file.
Please let me know. Thanks!
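The TypeError above is consistent with a detector whose forward pass dispatches on a return_loss switch: the loss path accepts a single batched tensor, while the test path insists on lists. The sketch below illustrates that pattern only. ToyDetector and FakeTensor are hypothetical stand-ins (FakeTensor replaces torch.Tensor so the example runs without PyTorch), and the dispatch itself is an assumption inferred from the error message, not a copy of the library code.

```python
class FakeTensor:
    """Placeholder for torch.Tensor so the sketch runs without PyTorch."""


class ToyDetector:
    """Hypothetical stand-in for a detector; not mmdet's actual class."""

    def forward_train(self, img, img_meta):
        # Loss path: a single batched input is accepted; returns losses.
        return {'loss_cls': 0.0, 'loss_bbox': 0.0}

    def forward_test(self, imgs, img_metas):
        # Test path: both inputs must be lists, otherwise reject with
        # the same wording as the error reported in the thread.
        for var, name in [(imgs, 'imgs'), (img_metas, 'img_metas')]:
            if not isinstance(var, list):
                raise TypeError(
                    '{} must be a list, but got {}'.format(name, type(var)))
        return ['detections for input {}'.format(i) for i in range(len(imgs))]

    def forward(self, img, img_meta, return_loss=True):
        # Assumed dispatch: losses for the loss path, detections otherwise.
        if return_loss:
            return self.forward_train(img, img_meta)
        return self.forward_test(img, img_meta)


if __name__ == '__main__':
    det = ToyDetector()
    print(det.forward(FakeTensor(), {}, return_loss=True))
    print(det.forward([FakeTensor()], [{}], return_loss=False))
```

If this reading is right, a dataset configured with test_mode=False yields a plain tensor, which works for computing a validation loss but fails as soon as an evaluation path calls the detector with return_loss=False.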
@dhananjaisharma10 I have the same problem as you. Have you solved it?
@hellock
For validation there are two options. One is to show the loss on validation set, another is to evaluate the mAP on validation set. In mmdetection we adopt the second one and use eval hooks to implement it, see here for details.
mmdetection now shows the loss on the validation set. Can I evaluate mAP on the validation set instead? What should I modify?
How can I see the validation progress during training?
I added
workflow = [('train', 1), ('val', 1)]
to the config file and executed python tools/train.py configs/faster_rcnn_r50_fpn_1x.py
The following error occurred:
assert len(data_loaders) == len(workflow)
I think the error occurs because the tools/train.py script handles only train_dataset. How can I use the validation dataset?
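One way to keep the dataset list in step with the workflow is to derive it from the workflow itself. The helper below is a hypothetical sketch, not part of tools/train.py: datasets_for_workflow and build_fn are made-up names, with build_fn standing in for whatever function turns a dataset config into a dataset (get_dataset in the snippets in this thread).

```python
def datasets_for_workflow(data_cfg, workflow, build_fn):
    """Build one dataset per workflow phase, in workflow order.

    data_cfg: mapping from phase name ('train', 'val', ...) to its config.
    workflow: list of (phase, epochs) pairs, e.g. [('train', 1), ('val', 1)].
    build_fn: callable that turns one phase config into a dataset.
    """
    missing = [mode for mode, _ in workflow if mode not in data_cfg]
    if missing:
        raise KeyError('no dataset config for phases: {}'.format(missing))
    return [build_fn(data_cfg[mode]) for mode, _ in workflow]


if __name__ == '__main__':
    # Minimal stand-ins: the "dataset" is just the annotation file name.
    data_cfg = {
        'train': {'ann_file': 'train.txt'},
        'val': {'ann_file': 'val.txt'},
    }
    workflow = [('train', 1), ('val', 1)]
    datasets = datasets_for_workflow(
        data_cfg, workflow, build_fn=lambda c: c['ann_file'])
    # One dataset per phase, so the dataloaders built from them will
    # satisfy len(data_loaders) == len(workflow).
    print(datasets)
```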