ashnair1 opened this issue 5 years ago
@ash1995 Can you tell me how to run inference on a set of images using main_test.py (show me the code)? 3s per image is acceptable for me right now. In my tests on an NVIDIA 1080Ti, inference takes about 10s per image.
Here you go,
# --------------------------------------------------------------
# SNIPER: Efficient Multi-Scale Training
# Licensed under The Apache-2.0 License [see LICENSE for details]
# Inference Module
# by Mahyar Najibi and Bharat Singh
# --------------------------------------------------------------
import init
import matplotlib
matplotlib.use('Agg')
from symbols.faster import *
from configs.faster.default_configs import config, update_config, update_config_from_list
from data_utils.load_data import load_proposal_roidb
import mxnet as mx
import argparse
from train_utils.utils import create_logger, load_param
from inference import imdb_detection_wrapper
from inference import imdb_proposal_extraction_wrapper
import os
import time
from PIL import Image
from easydict import EasyDict

os.environ['MXNET_CUDNN_AUTOTUNE_DEFAULT'] = '0'


def parser():
    arg_parser = argparse.ArgumentParser('SNIPER test module')
    arg_parser.add_argument('--cfg', dest='cfg', help='Path to the config file',
                            default='configs/faster/sniper_res101_e2e.yml', type=str)
    arg_parser.add_argument('--save_prefix', dest='save_prefix', help='Prefix used for snapshotting the network',
                            default='SNIPER', type=str)
    arg_parser.add_argument('--img_dir_path', dest='img_dir_path', help='Path to the image directory', type=str,
                            default='data/demo_batch/images')
    arg_parser.add_argument('--dataset', dest='dataset', help='Which dataset the categories belong to', type=str,
                            default='coco')
    arg_parser.add_argument('--vis', dest='vis', help='Whether to visualize the detections',
                            action='store_true')
    arg_parser.add_argument('--set', dest='set_cfg_list', help='Set the configuration fields from command line',
                            default=None, nargs=argparse.REMAINDER)
    return arg_parser.parse_args()


def main():
    args = parser()
    update_config(args.cfg)
    if args.set_cfg_list:
        update_config_from_list(args.set_cfg_list)

    context = [mx.gpu(int(gpu)) for gpu in config.gpus.split(',')]
    if not os.path.isdir(config.output_path):
        os.mkdir(config.output_path)

    coco_names = [u'BG', u'person', u'bicycle', u'car', u'motorcycle', u'airplane',
                  u'bus', u'train', u'truck', u'boat', u'traffic light', u'fire hydrant',
                  u'stop sign', u'parking meter', u'bench', u'bird', u'cat', u'dog', u'horse', u'sheep', u'cow',
                  u'elephant', u'bear', u'zebra', u'giraffe', u'backpack', u'umbrella', u'handbag', u'tie',
                  u'suitcase', u'frisbee', u'skis', u'snowboard', u'sports\nball', u'kite', u'baseball\nbat',
                  u'baseball glove', u'skateboard', u'surfboard', u'tennis racket', u'bottle', u'wine\nglass',
                  u'cup', u'fork', u'knife', u'spoon', u'bowl', u'banana', u'apple', u'sandwich', u'orange',
                  u'broccoli', u'carrot', u'hot dog', u'pizza', u'donut', u'cake', u'chair', u'couch',
                  u'potted plant', u'bed', u'dining table', u'toilet', u'tv', u'laptop', u'mouse', u'remote',
                  u'keyboard', u'cell phone', u'microwave', u'oven', u'toaster', u'sink', u'refrigerator', u'book',
                  u'clock', u'vase', u'scissors', u'teddy bear', u'hair\ndrier', u'toothbrush']

    # Build a minimal roidb directly from the images in the directory
    roidb = []
    for img in os.listdir(args.img_dir_path):
        start = time.time()
        im_path = os.path.join(args.img_dir_path, img)
        # Get image dimensions
        width, height = Image.open(im_path).size
        # Pack image info
        r = {'image': im_path, 'width': width, 'height': height, 'flipped': False}
        roidb.append(r)

    # Pack db info
    db_info = EasyDict()
    db_info.name = 'satellite'
    db_info.result_path = 'data/demo/batch_results'
    db_info.classes = coco_names
    db_info.num_classes = len(db_info.classes)
    imdb = db_info

    # Create the logger
    logger, output_path = create_logger(config.output_path, args.cfg, config.dataset.image_set)
    model_prefix = os.path.join(output_path, args.save_prefix)
    arg_params, aux_params = load_param(model_prefix, config.TEST.TEST_EPOCH,
                                        convert=True, process=True)

    sym_inst = eval('{}.{}'.format(config.symbol, config.symbol))
    imdb_detection_wrapper(sym_inst, config, imdb, roidb, context, arg_params, aux_params, args.vis)


if __name__ == '__main__':
    main()
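One caveat with the snippet: os.listdir returns every entry in the directory, so any non-image file (hidden files, annotation files, subdirectories) will make Image.open fail. A minimal guard, assuming the images are JPEG/PNG files, could look like this:

import os
from PIL import Image

IMG_EXTS = ('.jpg', '.jpeg', '.png')  # assumed extensions; extend as needed

def build_roidb(img_dir_path):
    """Build the same minimal roidb as above, skipping non-image files."""
    roidb = []
    for fname in sorted(os.listdir(img_dir_path)):
        if not fname.lower().endswith(IMG_EXTS):
            continue
        im_path = os.path.join(img_dir_path, fname)
        width, height = Image.open(im_path).size
        roidb.append({'image': im_path, 'width': width,
                      'height': height, 'flipped': False})
    return roidb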
@ash1995 many thanks!
Same problem here.
GPU: M40, picture size: 100K.
Inference takes about 4s per image, which is very slow. How can I speed it up?
@ash1995 Hello, I have a question when running main_test.py.
Tester: 10108/10144, Detection: 0.1189s, Post Processing: 0.008175s
Tester: 10112/10144, Detection: 0.1189s, Post Processing: 0.008174s
Tester: 10116/10144, Detection: 0.1189s, Post Processing: 0.008172s
Tester: 10120/10144, Detection: 0.1189s, Post Processing: 0.008169s
Tester: 10124/10144, Detection: 0.1188s, Post Processing: 0.008167s
Tester: 10128/10144, Detection: 0.1188s, Post Processing: 0.008167s
Tester: 10132/10144, Detection: 0.1188s, Post Processing: 0.008166s
Tester: 10136/10144, Detection: 0.1188s, Post Processing: 0.008163s
Tester: 10140/10144, Detection: 0.1188s, Post Processing: 0.008162s
Tester: 10144/10144, Detection: 0.1188s, Post Processing: 0.00816s
It stops at this point and I never get the result. Could you take a look at this? Thanks very much!
@ashnair1 @Roujack @HeyMrYu were you able to use main_test in the way @ashnair1 described? I get the following error traceback:

Traceback (most recent call last):
  File "tester.py", line 115, in <module>
    main()
  File "tester.py", line 112, in main
    imdb_detection_wrapper(sym_inst, config, imdb, roidb, context, arg_params, aux_params, args.vis)
  File "lib/inference.py", line 353, in imdb_detection_wrapper
    roidb, imdb, arg_params, aux_params, vis]))
  File "lib/inference.py", line 328, in detect_scale_worker
    pad_rois_to=400, crop_size=None, test_scale=scale)
  File "lib/iterators/MNIteratorTest.py", line 11, in __init__
    self.num_classes = num_classes if num_classes else roidb[0]['gt_overlaps'].shape[1]
KeyError: 'gt_overlaps'
Hi, have you solved this problem? I'm running into it too.
@ten1er @saksham-s You have to change
test_iter = MNIteratorTestAutoFocus(roidb=roidb, config=config, batch_size=nGPUs * nbatch, nGPUs=nGPUs, threads=32,
                                    pad_rois_to=400, crop_size=None, test_scale=scale)
to
test_iter = MNIteratorTestAutoFocus(roidb=roidb, config=config, batch_size=nGPUs * nbatch, nGPUs=nGPUs, threads=32,
                                    pad_rois_to=400, crop_size=None, test_scale=scale,
                                    num_classes=config.dataset.NUM_CLASSES)
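The error happens because the hand-built roidb entries only contain the image path, size, and flip flag, so the iterator cannot fall back to roidb[0]['gt_overlaps'].shape[1] to infer the class count. Passing num_classes explicitly, as above, is the cleanest fix. An alternative sketch, assuming 81 classes (80 COCO categories plus background, matching the coco_names list in the earlier snippet), is to attach an empty gt_overlaps array to each roidb entry:

import numpy as np

num_classes = 81  # assumed: 80 COCO categories + background; adjust for your dataset

for r in roidb:
    # An empty (0, num_classes) array is enough: per the traceback, the iterator
    # only reads gt_overlaps.shape[1] to infer the number of classes.
    r['gt_overlaps'] = np.zeros((0, num_classes), dtype=np.float32)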
I'm trying to optimise the inference time when dealing with a set of images. Using main_test.py to process 5 images takes about 15 seconds.
Is this the maximum speed attainable on a single GPU? For reference, I'm currently using a 12 GB Nvidia Titan Xp. Do you have any suggestions for improving inference time on a batch of images? Is splitting the images between multiple GPUs the only way to speed it up? If so, how can I go about doing that?
Update: The paper states that on a single Tesla V100 GPU you were able to run inference at 5 images per second. I wanted to try this out for myself, and the following is my result.
As you can see, while there is some speedup, the overall process still takes about the same amount of time. Is this the maximum speed at which it can infer, or are there additional modifications that could be made to make it faster?
You had mentioned in the paper that you had run multiple processes in parallel during inference. Could you tell me how that was done?
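One way to experiment with running multiple inference processes in parallel, assuming the single-GPU detection call from the snippet earlier in this thread is factored out into its own function, is to shard the image list and launch one process per GPU. This is only a sketch; run_inference_on_gpu is a hypothetical placeholder for that per-GPU call, not something provided by the repo.

import os
from multiprocessing import Process

def run_inference_on_gpu(gpu_id, image_paths):
    # Hypothetical placeholder: build a roidb from image_paths and call
    # imdb_detection_wrapper with context=[mx.gpu(gpu_id)], as in the
    # earlier snippet.
    pass

def parallel_inference(img_dir, gpu_ids):
    images = sorted(os.path.join(img_dir, f) for f in os.listdir(img_dir))
    # Round-robin split of the image list across the available GPUs
    shards = [images[i::len(gpu_ids)] for i in range(len(gpu_ids))]
    procs = [Process(target=run_inference_on_gpu, args=(gpu, shard))
             for gpu, shard in zip(gpu_ids, shards)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

if __name__ == '__main__':
    parallel_inference('data/demo_batch/images', gpu_ids=[0, 1])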