YuHengsss / YOLOV

This repo is an implementation of PyTorch version YOLOV Series
Apache License 2.0
278 stars 39 forks

About imageflow_demo in YOLOV's real-time object detection file tools/yolov_demo_online: how should detection from a camera be implemented? #56

Open widydsg opened 1 year ago

widydsg commented 1 year ago

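I am not very familiar with object detection. As I understand it, the imageflow_demo code below first stores every frame of the video in a list of original frames and only then runs detection. Does that mean it cannot achieve real-time object detection?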

while True:
    ret_val, frame = cap.read()
    if ret_val:
        ori_frames.append(frame)
        frame, _ = predictor.preproc(frame, None, exp.test_size)
        frames.append(torch.tensor(frame))
    else:
        break
res = []
frame_len = len(frames)
index_list = list(range(frame_len))
tmp_bank = [[], [], [], []]
local_bank = [[], [], [], []]
for frame_num, frame in enumerate(frames):
    tmp_imgs = []

    img = frame
    tmp_imgs.append(img)
    # if frame_num == 0:
    #     tmp_imgs = tmp_imgs + frames[-31:]
    imgs = torch.stack(tmp_imgs)
    other_result = online_previous_selection(tmp_bank, local_bank=local_bank,
                                             local=True)
    pred_result, res_dict = predictor.inference(imgs, other_result)
    # print(res_dict)
    # print(len(tmp_imgs))
    N = int(res_dict['cls_scores'].shape[0] / len(tmp_imgs)) 
    for i in range(len(tmp_imgs)):
        tmp_bank[0].append(res_dict['cls_feature'][0, N * i:N * (i + 1)])
        tmp_bank[1].append(res_dict['reg_feature'][0, N * i:N * (i + 1)])
        tmp_bank[2].append(res_dict['cls_scores'][N * i:N * (i + 1)])
        tmp_bank[3].append(res_dict['reg_scores'][N * i:N * (i + 1)]) 
        if res_dict['msa'] is not None:
            local_bank[0].append(res_dict['msa'][N * i:N * (i + 1)]) 
            local_bank[1].append(res_dict['boxes'][N * i:N * (i + 1)])
            local_bank[2].append(res_dict['cls_scores'][N * i:N * (i + 1)]) 
            local_bank[3].append(res_dict['reg_scores'][N * i:N * (i + 1)]) 
    for i in range(4):
        tmp_bank[i] = tmp_bank[i][-600:]  
        local_bank[i] = local_bank[i][-600:]
    outputs.extend(pred_result)
# Visualize each prediction and save it
for output, img in zip(outputs, ori_frames[:len(outputs)]): 
    result_frame = predictor.visual(output, img, ratio, cls_conf=args.conf)
    if args.save_result:
        vid_writer.write(result_frame)
YuHengsss commented 1 year ago

Hello. If the stored frames are all past frames, then real-time object detection is possible.
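
To illustrate the point about past frames, here is a minimal sketch of how the two loops in the snippet above could be merged into one streaming loop, so that each frame is processed as it arrives and only features from earlier frames are kept. It is not the repository's code: `run_online`, `infer`, and `toy_infer` are hypothetical names, and `infer` stands in for the combination of `predictor.preproc`, `online_previous_selection`, and `predictor.inference`. The bank size of 600 mirrors the `[-600:]` trimming in the original loop.

```python
from collections import deque
from typing import Callable, Deque, Iterable, Iterator, List, Tuple, TypeVar

Frame = TypeVar("Frame")
Features = TypeVar("Features")
Detection = TypeVar("Detection")


def run_online(
    frames: Iterable[Frame],
    infer: Callable[[Frame, List[Features]], Tuple[Detection, Features]],
    bank_size: int = 600,
) -> Iterator[Tuple[Frame, Detection]]:
    """Yield (frame, detection) as each frame arrives, using only past features."""
    # A bounded deque drops the oldest entry automatically once it is full.
    bank: Deque[Features] = deque(maxlen=bank_size)
    for frame in frames:
        # At this point the bank holds features of earlier frames only.
        detection, features = infer(frame, list(bank))
        bank.append(features)
        yield frame, detection


def toy_infer(frame, past):
    # Stand-in for the real model: reports how many past frames were available.
    return len(past), frame


results = list(run_online(range(5), toy_infer, bank_size=3))
print(results)
```

For a camera, `frames` could be a generator that repeatedly calls `read()` on `cv2.VideoCapture(0)`, and each yielded result could be passed to `predictor.visual` and shown or written immediately. That wiring is an assumption and has not been tested against this repository.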