open-mmlab / mmdetection

OpenMMLab Detection Toolbox and Benchmark
https://mmdetection.readthedocs.io
Apache License 2.0

How to use test.py to perform inference without ann_file #2167

Closed. CMobley7 closed this issue 4 years ago.

CMobley7 commented 4 years ago

So, I'm having trouble configuring test.py to output inference results to a .pkl, because it expects an ann_file and I don't have one; I merely want to perform inference. The error I'm getting is included below, along with my config file. How can I use test.py to just perform inference?

Traceback (most recent call last):
  File "/home/$USER/src/mmdet/tools/test.py", line 316, in <module>
    main()
  File "/home/$USER/src/mmdet/tools/test.py", line 271, in main
    dataset = build_dataset(cfg.data.test)
  File "/home/$USER/src/mmdet/mmdet/datasets/builder.py", line 36, in build_dataset
    elif isinstance(cfg['ann_file'], (list, tuple)):
  File "/usr/local/lib/python3.6/dist-packages/mmcv/utils/config.py", line 16, in __missing__
    raise KeyError(name)
KeyError: 'ann_file'
# model settings
model = dict(
    type='MaskRCNN',
    pretrained='torchvision://resnet50',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        style='pytorch'),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        num_outs=5),
    rpn_head=dict(
        type='RPNHead',
        in_channels=256,
        feat_channels=256,
        anchor_scales=[8],
        anchor_ratios=[0.5, 1.0, 2.0],
        anchor_strides=[4, 8, 16, 32, 64],
        target_means=[.0, .0, .0, .0],
        target_stds=[1.0, 1.0, 1.0, 1.0],
        loss_cls=dict(
            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
        loss_bbox=dict(type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.0)),
    bbox_roi_extractor=dict(
        type='SingleRoIExtractor',
        roi_layer=dict(type='RoIAlign', out_size=7, sample_num=2),
        out_channels=256,
        featmap_strides=[4, 8, 16, 32]),
    bbox_head=dict(
        type='SharedFCBBoxHead',
        num_fcs=2,
        in_channels=256,
        fc_out_channels=1024,
        roi_feat_size=7,
        num_classes=81,
        target_means=[0., 0., 0., 0.],
        target_stds=[0.1, 0.1, 0.2, 0.2],
        reg_class_agnostic=False,
        loss_cls=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
        loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)),
    mask_roi_extractor=dict(
        type='SingleRoIExtractor',
        roi_layer=dict(type='RoIAlign', out_size=14, sample_num=2),
        out_channels=256,
        featmap_strides=[4, 8, 16, 32]),
    mask_head=dict(
        type='FCNMaskHead',
        num_convs=4,
        in_channels=256,
        conv_out_channels=256,
        num_classes=81,
        loss_mask=dict(
            type='CrossEntropyLoss', use_mask=True, loss_weight=1.0)))
# model training and testing settings
train_cfg = dict(
    rpn=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.7,
            neg_iou_thr=0.3,
            min_pos_iou=0.3,
            ignore_iof_thr=-1),
        sampler=dict(
            type='RandomSampler',
            num=256,
            pos_fraction=0.5,
            neg_pos_ub=-1,
            add_gt_as_proposals=False),
        allowed_border=0,
        pos_weight=-1,
        debug=False),
    rpn_proposal=dict(
        nms_across_levels=False,
        nms_pre=2000,
        nms_post=2000,
        max_num=2000,
        nms_thr=0.7,
        min_bbox_size=0),
    rcnn=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.5,
            neg_iou_thr=0.5,
            min_pos_iou=0.5,
            ignore_iof_thr=-1),
        sampler=dict(
            type='RandomSampler',
            num=512,
            pos_fraction=0.25,
            neg_pos_ub=-1,
            add_gt_as_proposals=True),
        mask_size=28,
        pos_weight=-1,
        debug=False))
test_cfg = dict(
    rpn=dict(
        nms_across_levels=False,
        nms_pre=1000,
        nms_post=1000,
        max_num=1000,
        nms_thr=0.7,
        min_bbox_size=0),
    rcnn=dict(
        score_thr=0.05,
        nms=dict(type='nms', iou_thr=0.5),
        max_per_img=100,
        mask_thr_binary=0.5))
# dataset settings
dataset_type = 'CocoDataset'
data_root = '/podc/data/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
    dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    imgs_per_gpu=2,
    workers_per_gpu=2,
    test=dict(
        type=dataset_type,
        img_prefix=data_root + 'coco_2014/images/val2014/',
        pipeline=test_pipeline))
# optimizer
optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=1.0 / 3,
    step=[8, 11])
checkpoint_config = dict(interval=1)
# yapf:disable
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])
# yapf:enable
evaluation = dict(interval=1)
# runtime settings
total_epochs = 12
dist_params = dict(backend='nccl')
log_level = 'INFO'
work_dir = './work_dirs/mask_rcnn_r50_fpn_1x'
load_from = None
resume_from = None
workflow = [('test', 1)]
Liyw979 commented 4 years ago

You can try creating a new dataset class that inherits from CustomDataset and overrides the load_annotations method, for example as in the sketch below (the imports, class name, and registration assume the mmdet 1.x module layout; adjust them to your setup):

    from pathlib import Path

    import mmcv

    from mmdet.datasets.custom import CustomDataset
    from mmdet.datasets.registry import DATASETS


    @DATASETS.register_module
    class InferenceDataset(CustomDataset):

        def load_annotations(self, ann_file):
            del ann_file  # not required for pure inference

            img_infos = []

            # modify this according to your path
            paths = list(Path(self.img_prefix)
                         .joinpath('image').glob('*'))
            paths.sort(key=lambda p: p.name)

            for img_path in paths:
                img_id = img_path.name.split('.')[0]
                filename = 'image/' + img_path.name
                # read each image once to record its dimensions
                height, width = mmcv.imread(str(img_path)).shape[:2]
                img_infos.append(
                    dict(id=img_id, filename=filename,
                         width=width, height=height))

            print(f'dataset size: {len(img_infos)}')

            return img_infos
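
The dataset can then be referenced from the test config. Note that build_dataset still looks up cfg['ann_file'] (that is where the KeyError above comes from), so keep a dummy entry; a sketch with a hypothetical image root:

    test=dict(
        type='InferenceDataset',
        ann_file=None,
        img_prefix='/path/to/images/',
        pipeline=test_pipeline)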
hellock commented 4 years ago

If you don't have an ann_file, how is the file list obtained? Even if you do not have the gt annotations, there is still an ann_file, which indicates the file list, such as the COCO test-dev set.

CMobley7 commented 4 years ago

@hellock, my apologies for my ignorance, but say I've trained an object detector with your library. Now I want to use said object detector to run inference on a directory of images that possibly contain the objects which I trained the object detector to detect. I don't have an ann_file for this directory. I'm merely trying to produce a .json file with the detections for each image in the directory. I was trying to use test.py for this, but it doesn't seem built for that. Is there an existing script for this, or will I have to write my own? If I have to write my own, can you point me to any examples? Thanks in advance.

panjianning commented 4 years ago

Suppose your test images are /data/images/1.jpg and /data/images/2.jpg; you can create a test annotation like:

{'categories': ['your categories in training annotations'],
 'annotations': [],
 'images': [{'file_name': '/data/images/1.jpg', 'id': 1},
  {'file_name': '/data/images/2.jpg', 'id': 2}]}

and a config like:

test=dict(
        type=dataset_type,
        ann_file='your ann path',
        img_prefix=None,
        pipeline=test_pipeline)
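
A minimal sketch of writing such an annotation file (the category list and all paths are placeholders; reuse the categories from your training annotation so the class ids line up):

    import json

    ann = {
        'categories': [{'id': 1, 'name': 'your_class'}],  # placeholder
        'annotations': [],
        'images': [{'file_name': '/data/images/1.jpg', 'id': 1},
                   {'file_name': '/data/images/2.jpg', 'id': 2}],
    }

    # write it to the path referenced by ann_file in the config
    with open('/data/test_ann.json', 'w') as f:
        json.dump(ann, f)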
CMobley7 commented 4 years ago

@PanJianning, thank you, that worked. However, I do have one quick question for you, @hellock, or @liyiwei979621500: test.py produces a .pkl file, and after unpickling, it appears to be a list of tuples of arrays, but I cannot tell of what. I was expecting a dict of bbox or segm detections for the images, like a traditional ann file. Is getting this possible?

panjianning commented 4 years ago

You can use the test.py below; the --json_out arg will give a bbox JSON output. It seems this feature was removed from the newest version.
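
For reference, each entry in that .pkl corresponds to one image; for a Mask R-CNN style model it is a (bbox_results, segm_results) tuple with one entry per class. A minimal sketch of unpacking it, assuming the results were saved with --out results.pkl:

    import mmcv

    results = mmcv.load('results.pkl')  # hypothetical output of --out
    for img_idx, (bbox_results, segm_results) in enumerate(results):
        for class_idx, bboxes in enumerate(bbox_results):
            # each row of bboxes is [x1, y1, x2, y2, score]
            for x1, y1, x2, y2, score in bboxes:
                print(img_idx, class_idx, (x1, y1, x2, y2), score)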

import argparse
import os
import os.path as osp
import pickle
import shutil
import tempfile

import mmcv
import torch
import torch.distributed as dist
from mmcv.parallel import MMDataParallel, MMDistributedDataParallel
from mmcv.runner import get_dist_info, init_dist, load_checkpoint

from mmdet.core import wrap_fp16_model
from mmdet.datasets import build_dataloader, build_dataset
from mmdet.models import build_detector

def single_gpu_test(model, data_loader, show=False):
    model.eval()
    results = []
    dataset = data_loader.dataset
    prog_bar = mmcv.ProgressBar(len(dataset))
    for i, data in enumerate(data_loader):
        with torch.no_grad():
            result = model(return_loss=False, rescale=not show, **data)
        results.append(result)

        if show:
            model.module.show_result(data, result)

        batch_size = data['img'][0].size(0)
        for _ in range(batch_size):
            prog_bar.update()
    return results

def multi_gpu_test(model, data_loader, tmpdir=None, gpu_collect=False):
    """Test model with multiple gpus.
    This method tests the model with multiple gpus and collects the results
    under two different modes: gpu and cpu. When 'gpu_collect=True' it
    encodes results to gpu tensors and uses gpu communication for result
    collection. In cpu mode it saves the results from different gpus to
    'tmpdir' and collects them on the rank 0 worker.
    Args:
        model (nn.Module): Model to be tested.
        data_loader (nn.Dataloader): Pytorch data loader.
        tmpdir (str): Path of directory to save the temporary results from
            different gpus under cpu mode.
        gpu_collect (bool): Option to use either gpu or cpu to collect results.
    Returns:
        list: The prediction results.
    """
    model.eval()
    results = []
    dataset = data_loader.dataset
    rank, world_size = get_dist_info()
    if rank == 0:
        prog_bar = mmcv.ProgressBar(len(dataset))
    for i, data in enumerate(data_loader):
        with torch.no_grad():
            result = model(return_loss=False, rescale=True, **data)
        results.append(result)

        if rank == 0:
            batch_size = data['img'][0].size(0)
            for _ in range(batch_size * world_size):
                prog_bar.update()

    # collect results from all ranks
    if gpu_collect:
        results = collect_results_gpu(results, len(dataset))
    else:
        results = collect_results_cpu(results, len(dataset), tmpdir)
    return results

def collect_results_cpu(result_part, size, tmpdir=None):
    rank, world_size = get_dist_info()
    # create a tmp dir if it is not specified
    if tmpdir is None:
        MAX_LEN = 512
        # ASCII code 32 is the space character, used here as padding
        dir_tensor = torch.full((MAX_LEN, ),
                                32,
                                dtype=torch.uint8,
                                device='cuda')
        if rank == 0:
            tmpdir = tempfile.mkdtemp()
            tmpdir = torch.tensor(
                bytearray(tmpdir.encode()), dtype=torch.uint8, device='cuda')
            dir_tensor[:len(tmpdir)] = tmpdir
        dist.broadcast(dir_tensor, 0)
        tmpdir = dir_tensor.cpu().numpy().tobytes().decode().rstrip()
    else:
        mmcv.mkdir_or_exist(tmpdir)
    # dump the part result to the dir
    mmcv.dump(result_part, osp.join(tmpdir, 'part_{}.pkl'.format(rank)))
    dist.barrier()
    # collect all parts
    if rank != 0:
        return None
    else:
        # load results of all parts from tmp dir
        part_list = []
        for i in range(world_size):
            part_file = osp.join(tmpdir, 'part_{}.pkl'.format(i))
            part_list.append(mmcv.load(part_file))
        # sort the results
        ordered_results = []
        for res in zip(*part_list):
            ordered_results.extend(list(res))
        # the dataloader may pad some samples
        ordered_results = ordered_results[:size]
        # remove tmp dir
        shutil.rmtree(tmpdir)
        return ordered_results

def collect_results_gpu(result_part, size):
    rank, world_size = get_dist_info()
    # dump result part to tensor with pickle
    part_tensor = torch.tensor(
        bytearray(pickle.dumps(result_part)), dtype=torch.uint8, device='cuda')
    # gather all result part tensor shape
    shape_tensor = torch.tensor(part_tensor.shape, device='cuda')
    shape_list = [shape_tensor.clone() for _ in range(world_size)]
    dist.all_gather(shape_list, shape_tensor)
    # padding result part tensor to max length
    shape_max = torch.tensor(shape_list).max()
    part_send = torch.zeros(shape_max, dtype=torch.uint8, device='cuda')
    part_send[:shape_tensor[0]] = part_tensor
    part_recv_list = [
        part_tensor.new_zeros(shape_max) for _ in range(world_size)
    ]
    # gather all result part
    dist.all_gather(part_recv_list, part_send)

    if rank == 0:
        part_list = []
        for recv, shape in zip(part_recv_list, shape_list):
            part_list.append(
                pickle.loads(recv[:shape[0]].cpu().numpy().tobytes()))
        # sort the results
        ordered_results = []
        for res in zip(*part_list):
            ordered_results.extend(list(res))
        # the dataloader may pad some samples
        ordered_results = ordered_results[:size]
        return ordered_results

class MultipleKVAction(argparse.Action):
    """
    argparse action to split an argument into KEY=VALUE form
    on the first = and append to a dictionary.
    """

    def _is_int(self, val):
        try:
            _ = int(val)
            return True
        except Exception:
            return False

    def _is_float(self, val):
        try:
            _ = float(val)
            return True
        except Exception:
            return False

    def _is_bool(self, val):
        return val.lower() in ['true', 'false']

    def __call__(self, parser, namespace, values, option_string=None):
        options = {}
        for val in values:
            parts = val.split('=')
            key = parts[0].strip()
            if len(parts) > 2:
                val = '='.join(parts[1:])
            else:
                val = parts[1].strip()
            # try parsing val to bool/int/float first
            if self._is_bool(val):
                import json
                val = json.loads(val.lower())
            elif self._is_int(val):
                val = int(val)
            elif self._is_float(val):
                val = float(val)
            options[key] = val
        setattr(namespace, self.dest, options)

def parse_args():
    parser = argparse.ArgumentParser(
        description='MMDet test (and eval) a model')
    parser.add_argument('config', help='test config file path')
    parser.add_argument('checkpoint', help='checkpoint file')
    parser.add_argument('--out', help='output result file in pickle format')
    parser.add_argument(
        '--eval',
        type=str,
        nargs='+',
        help='evaluation metrics, which depends on the dataset, e.g., "bbox",'
        ' "segm", "proposal" for COCO, and "mAP", "recall" for PASCAL VOC')
    parser.add_argument('--show', action='store_true', help='show results')
    parser.add_argument(
        '--gpu_collect',
        action='store_true',
        help='whether to use gpu to collect results.')
    parser.add_argument(
        '--tmpdir',
        help='tmp directory used for collecting results from multiple '
        'workers, available when gpu_collect is not specified')
    parser.add_argument(
        '--options', nargs='+', action=MultipleKVAction, help='custom options')
    parser.add_argument(
        '--launcher',
        choices=['none', 'pytorch', 'slurm', 'mpi'],
        default='none',
        help='job launcher')
    parser.add_argument('--local_rank', type=int, default=0)
    parser.add_argument('--json_out', type=str, default=None)
    args = parser.parse_args()
    if 'LOCAL_RANK' not in os.environ:
        os.environ['LOCAL_RANK'] = str(args.local_rank)
    return args

def main():
    args = parse_args()

    # assert args.out or args.eval or args.show, \
    #     ('Please specify at least one operation (save or eval or show the '
    #      'results) with the argument "--out", "--eval" or "--show"')
    #
    # if args.out is not None and not args.out.endswith(('.pkl', '.pickle')):
    #     raise ValueError('The output file must be a pkl file.')

    cfg = mmcv.Config.fromfile(args.config)
    # set cudnn_benchmark
    if cfg.get('cudnn_benchmark', False):
        torch.backends.cudnn.benchmark = True
    cfg.model.pretrained = None
    cfg.data.test.test_mode = True

    # init distributed env first, since logger depends on the dist info.
    if args.launcher == 'none':
        distributed = False
    else:
        distributed = True
        init_dist(args.launcher, **cfg.dist_params)

    # build the dataloader
    # TODO: support multiple images per gpu (only minor changes are needed)
    dataset = build_dataset(cfg.data.test)
    data_loader = build_dataloader(
        dataset,
        imgs_per_gpu=1,
        workers_per_gpu=cfg.data.workers_per_gpu,
        dist=distributed,
        shuffle=False)

    # build the model and load checkpoint
    model = build_detector(cfg.model, train_cfg=None, test_cfg=cfg.test_cfg)
    fp16_cfg = cfg.get('fp16', None)
    if fp16_cfg is not None:
        wrap_fp16_model(model)
    checkpoint = load_checkpoint(model, args.checkpoint, map_location='cpu')
    # old versions did not save class info in checkpoints; this workaround is
    # for backward compatibility
    if 'CLASSES' in checkpoint['meta']:
        model.CLASSES = checkpoint['meta']['CLASSES']
    else:
        model.CLASSES = dataset.CLASSES

    if not distributed:
        model = MMDataParallel(model, device_ids=[0])
        outputs = single_gpu_test(model, data_loader, args.show)
    else:
        model = MMDistributedDataParallel(model.cuda())
        outputs = multi_gpu_test(model, data_loader, args.tmpdir,
                                 args.gpu_collect)

    rank, _ = get_dist_info()
    if rank == 0:
        if args.out:
            print('\nwriting results to {}'.format(args.out))
            mmcv.dump(outputs, args.out)
        if args.eval:
            kwargs = {} if args.options is None else args.options
            dataset.evaluate(outputs, args.eval, **kwargs)
        if args.json_out:
            dataset.results2json(outputs, args.json_out)

if __name__ == '__main__':
    main()
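
A hypothetical invocation of this script (config, checkpoint, and output names are placeholders; --json_out is passed to results2json, which, per the next comment, treats it as a jsonfile prefix):

    python test.py my_config.py my_checkpoint.pth --out results.pkl --json_out results

The pickle and json outputs are only written on rank 0, so the same command shape works for both single- and multi-gpu runs.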
CMobley7 commented 4 years ago

Thanks @PanJianning. The updated test.py includes an argument --format_only, which does the same thing as --json_out except that it doesn't allow setting the jsonfile_prefix for results2json. I'm going to put in a request for this feature. Thanks for your help, everyone.

Ixiaohuihuihui commented 4 years ago

@CMobley7 Hi, I have the same question, but I don't know how to create a test annotation, can you share some code with me?

bnumaomei commented 4 years ago

@hellock

If you don't have an ann_file, how is the file list obtained? Even if you do not have the gt annotations, there is still an ann_file, which indicates the file list, such as the COCO test-dev set.

Hi, COCO data needs an ann_file, but what about VOC data? The file list is already in ImageSets/Main/test.txt, so why are the XML files still needed?

CMobley7 commented 4 years ago

@Ixiaohuihuihui ,

Sorry for not getting back to you sooner.

This is the code I use to generate the ann_file (DATA_ROOT and VAL are my own path constants):

from pathlib import Path

filenames = []
file_extensions = ["*.jpg", "*.jpeg", "*.png", "*.tif", "*.gif"]
for extension in file_extensions:
    for filename in Path("{}/{}".format(DATA_ROOT, VAL)).glob(extension):
        filenames.append(str(filename))

images = [
    {"file_name": filename, "id": int(i)}
    for i, filename in enumerate(filenames, start=1)
]

ann_file = {
    "categories": OUTPUT_CATEGORIES,
    "annotations": [],
    "images": images,
}

where OUTPUT_CATEGORIES is

from pycocotools.coco import COCO

coco = COCO("{}{}".format(DATA_ROOT, TRAIN_ANN))
OUTPUT_CATEGORIES = coco.loadCats(coco.getCatIds())
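
The resulting ann_file dict can then be written to disk, e.g. (the output filename is a placeholder):

import json

with open("{}/infer_ann.json".format(DATA_ROOT), "w") as f:
    json.dump(ann_file, f)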

Hope this helps.

CMobley7 commented 4 years ago

@PanJianning and @hellock ,

For some reason this method only works under 1.x, but not under 2.0. I've updated to the 2.0 conventions listed here. So, instead of dataset_type = 'MyDataset', it's now dataset_type = 'CocoDataset' with classes = (class1, class2, etc.), as you can see in my configuration file below. In addition, I updated the code I use to produce the ann_file to the following.

from pathlib import Path

from PIL import Image

images_info = []
file_extensions = ["*.jpg", "*.jpeg", "*.png", "*.tif", "*.gif"]
for extension in file_extensions:
    for filename in Path("{}/{}".format(DATA_ROOT, VAL)).glob(extension):
        image = Image.open(filename)
        width, height = image.size
        images_info.append([str(filename), int(height), int(width)])

images = [
    {
        "file_name": image_info[0],
        "height": image_info[1],
        "width": image_info[2],
        "id": i,
    }
    for i, image_info in enumerate(images_info, start=1)
]

ann_file = {
    "categories": OUTPUT_CATEGORIES,
    "annotations": [],
    "images": images,
}

So, I'm no longer using a custom my_dataset.py script, which looked like:

import sys

from .coco import CocoDataset
from .registry import DATASETS

sys.path.insert(1, "/podc")
from config import CATEGORIES

if CATEGORIES is not None:

    @DATASETS.register_module
    class MyDataset(CocoDataset):
        CLASSES = CATEGORIES

I can successfully train a model using tools/train.py and evaluate a model using tools/test.py. However, I cannot perform inference. I've compared the 1.2 code base to 2.0 and can't seem to find any major differences that could account for this besides those I've mentioned above. Do I still need a custom my_dataset.py file? The documentation made it seem as if I didn't.

model=dict(
    type='MaskRCNN',
    pretrained='torchvision://resnet50',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(
            type='BN',
            requires_grad=True),
        norm_eval=True,
        style='pytorch'),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        num_outs=5),
    rpn_head=dict(
        type='RPNHead',
        in_channels=256,
        feat_channels=256,
        anchor_generator=dict(
            type='AnchorGenerator',
            scales=[8],
            ratios=[0.5, 1.0, 2.0],
            strides=[4, 8, 16, 32, 64]),
        bbox_coder=dict(
            type='DeltaXYWHBBoxCoder',
            target_means=[0.0, 0.0, 0.0, 0.0],
            target_stds=[1.0, 1.0, 1.0, 1.0]),
        loss_cls=dict(
            type='CrossEntropyLoss',
            use_sigmoid=True,
            loss_weight=1.0),
        loss_bbox=dict(
            type='L1Loss',
            loss_weight=1.0)),
    roi_head=dict(
        type='StandardRoIHead',
        bbox_roi_extractor=dict(
            type='SingleRoIExtractor',
            roi_layer=dict(
                type='RoIAlign',
                out_size=7,
                sample_num=0),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32]),
        bbox_head=dict(
            type='Shared2FCBBoxHead',
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=80,
            bbox_coder=dict(
                type='DeltaXYWHBBoxCoder',
                target_means=[0.0, 0.0, 0.0, 0.0],
                target_stds=[0.1, 0.1, 0.2, 0.2]),
            reg_class_agnostic=False,
            loss_cls=dict(
                type='CrossEntropyLoss',
                use_sigmoid=False,
                loss_weight=1.0),
            loss_bbox=dict(
                type='L1Loss',
                loss_weight=1.0)),
        mask_roi_extractor=dict(
            type='SingleRoIExtractor',
            roi_layer=dict(
                type='RoIAlign',
                out_size=14,
                sample_num=0),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32]),
        mask_head=dict(
            type='FCNMaskHead',
            num_convs=4,
            in_channels=256,
            conv_out_channels=256,
            num_classes=80,
            loss_mask=dict(
                type='CrossEntropyLoss',
                use_mask=True,
                loss_weight=1.0))))
train_cfg=dict(
    rpn=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.7,
            neg_iou_thr=0.3,
            min_pos_iou=0.3,
            match_low_quality=True,
            ignore_iof_thr=-1),
        sampler=dict(
            type='RandomSampler',
            num=256,
            pos_fraction=0.5,
            neg_pos_ub=-1,
            add_gt_as_proposals=False),
        allowed_border=-1,
        pos_weight=-1,
        debug=False),
    rpn_proposal=dict(
        nms_across_levels=False,
        nms_pre=2000,
        nms_post=1000,
        max_num=1000,
        nms_thr=0.7,
        min_bbox_size=0),
    rcnn=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.5,
            neg_iou_thr=0.5,
            min_pos_iou=0.5,
            match_low_quality=True,
            ignore_iof_thr=-1),
        sampler=dict(
            type='RandomSampler',
            num=512,
            pos_fraction=0.25,
            neg_pos_ub=-1,
            add_gt_as_proposals=True),
        mask_size=28,
        pos_weight=-1,
        debug=False))
test_cfg=dict(
    rpn=dict(
        nms_across_levels=False,
        nms_pre=1000,
        nms_post=1000,
        max_num=1000,
        nms_thr=0.7,
        min_bbox_size=0),
    rcnn=dict(
        score_thr=0.05,
        nms=dict(
            type='nms',
            iou_thr=0.5),
        max_per_img=100,
        mask_thr_binary=0.5))
dataset_type='CocoDataset'
data_root='/podc/data/'
img_norm_cfg=dict(
    mean=[123.675, 116.28, 103.53],
    std=[58.395, 57.12, 57.375],
    to_rgb=True)
train_pipeline=[
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations',
        with_bbox=True,
        with_mask=True),
    dict(type='Resize',
        img_scale=(1333, 800),
        keep_ratio=True),
    dict(type='RandomFlip',
        flip_ratio=0.5),
    dict(type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='Pad',
        size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect',
        keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks'])]
test_pipeline=[
    dict(type='LoadImageFromFile'),
    dict(type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize',
                keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad',
                size_divisor=32),
            dict(type='ImageToTensor',
                keys=['img']),
            dict(type='Collect',
                keys=['img'])])]
data=dict(
    samples_per_gpu=8,
    workers_per_gpu=2,
    train=dict(
        type='CocoDataset',
        ann_file='data/coco/annotations/instances_train2017.json',
        img_prefix='data/coco/train2017/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations',
                with_bbox=True,
                with_mask=True),
            dict(type='Resize',
                img_scale=(1333, 800),
                keep_ratio=True),
            dict(type='RandomFlip',
                flip_ratio=0.5),
            dict(type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad',
                size_divisor=32),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect',
                keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks'])]),
    val=dict(
        type='CocoDataset',
        ann_file='/podc/shared/mask_rcnn_r50_fpn_1x_coco_20200515_141801_infer_ann.json',
        img_prefix='',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='MultiScaleFlipAug',
                img_scale=(1333, 800),
                flip=False,
                transforms=[
                    dict(type='Resize',
                        keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad',
                        size_divisor=32),
                    dict(type='ImageToTensor',
                        keys=['img']),
                    dict(type='Collect',
                        keys=['img'])])],
        classes=('person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush')),
    test=dict(
        type='CocoDataset',
        ann_file='/podc/shared/mask_rcnn_r50_fpn_1x_coco_20200515_141801_infer_ann.json',
        img_prefix='',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='MultiScaleFlipAug',
                img_scale=(1333, 800),
                flip=False,
                transforms=[
                    dict(type='Resize',
                        keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad',
                        size_divisor=32),
                    dict(type='ImageToTensor',
                        keys=['img']),
                    dict(type='Collect',
                        keys=['img'])])],
        classes=('person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush')))
evaluation=dict(
    interval=1,
    metric=['bbox', 'segm'])
optimizer=dict(
    type='SGD',
    lr=0.02,
    momentum=0.9,
    weight_decay=0.0001)
optimizer_config=dict(
    grad_clip=None)
lr_config=dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[8, 11])
total_epochs=12
checkpoint_config=dict(
    interval=1)
log_config=dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        dict(type='TensorboardLoggerHook')])
dist_params=dict(
    backend='nccl')
log_level='INFO'
load_from=None
resume_from=None
workflow=[('test', 1)]
classes=('person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush')
CMobley7 commented 4 years ago

I had to revert to using the custom MyDataset class as indicated above.

Keiku commented 2 years ago

@CMobley7 I would like to do the same, but I get an error with anything except the configuration shown below.

    test=dict(
        type='CocoDataset',
        ann_file='data/tiny_coco/annotations/instances_val2017.json',
        img_prefix='data/tiny_coco/val2017/',
        ....

How can I apply this to a custom dataset built by build_dataset for testing only? I want to get results for an image folder without an annotation file (specifying img_prefix only).

bnumaomei commented 2 years ago

This is an automated reply email from Xu Yang.