BUPT-PRIV / LOAF


【Bug】Evaluation error on images without humans / error when saving results #4

Open Terry-Zheng opened 4 months ago

Terry-Zheng commented 4 months ago

Problem overview

Hi, I trained for a few epochs and then tried to run evaluation, and ran into a couple of problems:

  1. Images with empty annotations (i.e., no pedestrians) trigger a bug:
    python main.py -m dab_deformable_detr --resume ../fisheye-train/output/checkpoint.pth --two_stage --eval
  2. After trying to work around the bug above, saving the results also fails:
    python main.py -m dab_deformable_detr --resume ../fisheye-train/output/checkpoint.pth --two_stage --eval --save_results

Error reports

  1. Images without people

    Original Traceback (most recent call last):
      File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
        data = fetcher.fetch(index)
      File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "LOAF/fisheye-eval/datasets/loaf.py", line 48, in __getitem__
        img, target = self._transforms(img, target)
      File "LOAF/fisheye-eval/datasets/transform_loaf.py", line 198, in __call__
        image, target = t(image, target)
      File "LOAF/fisheye-eval/datasets/transform_loaf.py", line 198, in __call__
        image, target = t(image, target)
      File "LOAF/fisheye-eval/datasets/transform_loaf.py", line 170, in __call__
        return F.to_tensor(img), target
      File "/opt/conda/lib/python3.8/site-packages/torchvision/transforms/functional.py", line 137, in to_tensor
        raise TypeError(f"pic should be PIL Image or ndarray. Got {type(pic)}")
    TypeError: pic should be PIL Image or ndarray. Got <class 'NoneType'>

  2. Saving results

    Traceback (most recent call last):
      File "main.py", line 416, in <module>
        main(args)
      File "main.py", line 322, in main
        test_stats, coco_evaluator = evaluate(model, criterion, postprocessors,
      File "/opt/conda/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
        return func(*args, **kwargs)
      File "LOAF/fisheye-eval/engine.py", line 241, in evaluate
        res_info = torch.cat((_res_bbox, _res_prob.unsqueeze(-1), _res_label.unsqueeze(-1)), 1)
    RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 300 but got size 100 for tensor number 1 in the list.

Cause analysis

  1. On inspection, this happens because some images in the LOAF dataset simply contain no pedestrians (e.g., id: 1164, 0062_01065.jpg).

    For these images the annotation the code receives is empty, so the torch.stack call inside the try block raises; the except branch catches the exception and returns img=None, target=None https://github.com/BUPT-PRIV/LOAF/blob/a5c435eb013f0729c64824477b62649f8325a5a7/fisheye-eval/datasets/loaf.py#L75-L86

    self.prepare therefore returns only Nones, and the subsequent transform step fails when it receives them https://github.com/BUPT-PRIV/LOAF/blob/a5c435eb013f0729c64824477b62649f8325a5a7/fisheye-eval/datasets/loaf.py#L46-L48

  2. The dimensions of the model's predicted boxes (pred_boxes) do not match the dimensions after post-processing, as shown in the two code regions below https://github.com/BUPT-PRIV/LOAF/blob/a5c435eb013f0729c64824477b62649f8325a5a7/fisheye-eval/engine.py#L218-L226 https://github.com/BUPT-PRIV/LOAF/blob/a5c435eb013f0729c64824477b62649f8325a5a7/fisheye-eval/engine.py#L237-L240

    • outbbox holds the model's raw output and, on inspection, has shape (300, 4), whereas _res_prob and _res_label in the second region come from the post-processing step and have shape (100, 1); the dimensions do not match, so the tensors cannot be concatenated (a sketch of one possible way to align them follows this list).
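For reference, here is a hedged sketch of one way the saved tensors could be made to line up. I have not checked the repo's engine.py beyond the lines linked above; the sketch assumes the post-processor returns per-image dicts with 'boxes', 'scores' and 'labels' that all come from the same top-k selection (as in standard Deformable-DETR-style PostProcess code), so that using the post-processed boxes instead of the raw 300-query outbbox keeps the first dimensions equal. The helper name build_res_info is hypothetical.

    # Hypothetical helper (not the repo's code): build the per-image result tensor
    # from a single post-processed dict instead of mixing the raw outbbox
    # (300 queries) with the top-k scores/labels (100 entries).
    import torch

    def build_res_info(result: dict) -> torch.Tensor:
        boxes = result["boxes"]             # (K, 4) post-processed boxes
        scores = result["scores"]           # (K,)
        labels = result["labels"].float()   # (K,)
        # All three tensors come from the same top-k selection, so dim 0 matches
        # and the concatenation that previously failed (300 vs 100) succeeds.
        return torch.cat((boxes, scores.unsqueeze(-1), labels.unsqueeze(-1)), dim=1)  # (K, 6)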

Attempted fixes

  1. At first I tried to modify the code so that, for an empty image, the sub-items of the returned target are empty tensors.

    1. I added a check before torch.stack: if boxes is empty, substitute an empty tensor and continue with the rest of the processing:
        try:
            if len(boxes):
                # normal case: stack the per-instance box tensors into shape (N, 4)
                boxes = torch.stack(boxes, dim=0)
            else:
                # empty image: torch.tensor([]) yields a 1-D tensor of shape (0,)
                boxes = torch.tensor(boxes)
    2. Then I changed the Normalize step of the preprocessing; since boxes may now be an empty list, I replaced the code below with boxes /= w to avoid the error https://github.com/BUPT-PRIV/LOAF/blob/a5c435eb013f0729c64824477b62649f8325a5a7/fisheye-eval/datasets/transform_loaf.py#L187
    3. But even after these changes, the HungarianMatcher step fails: torch.cdist raises because tgt_bbox is an empty tensor (see the sketch after this list) https://github.com/BUPT-PRIV/LOAF/blob/a5c435eb013f0729c64824477b62649f8325a5a7/fisheye-eval/models/dab_deformable_detr/matcher.py#L87-L88

    4. At this point, since I am not very familiar with DAB-DETR, I am not sure whether modifying further would stray from the original design.

    • I later also tried, on top of the changes above, simply skipping these empty images during evaluation. That produces results, but it clearly does not match the intended behaviour.
  2. For the second problem, I am likewise not familiar enough with DAB-DETR to know how to adjust things so that the dimensions before and after post-processing stay consistent.
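For the first problem, below is a minimal, self-contained sketch (my own, not the repo's code) of the direction attempted above: keep the empty boxes tensor 2-D with shape (0, 4) rather than the 1-D shape (0,) that torch.tensor([]) produces. With that shape, the cdist call and the Hungarian assignment tolerate an image with zero targets in recent PyTorch/SciPy versions; whether the rest of the DAB-DETR loss code also copes with zero targets is something I have not verified.

    import torch
    from scipy.optimize import linear_sum_assignment

    # Empty image: keep the target boxes 2-D with shape (0, 4) instead of
    # torch.tensor([]), whose shape (0,) breaks torch.cdist.
    tgt_boxes = torch.zeros((0, 4), dtype=torch.float32)
    pred_boxes = torch.rand(300, 4)                        # e.g. 300 query predictions

    cost_bbox = torch.cdist(pred_boxes, tgt_boxes, p=1)    # shape (300, 0), no error
    rows, cols = linear_sum_assignment(cost_bbox.numpy())  # both index arrays are empty
    print(cost_bbox.shape, len(rows), len(cols))           # torch.Size([300, 0]) 0 0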

Expectations

Running plain eval only gives the results below.

I only trained for a few epochs, so the numbers are lower than in the paper:

IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.431
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.779
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.434
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.428
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.649
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.500
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.075
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.441
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.555
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.428
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.657
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.664

ompugao commented 3 months ago

--num_queries 100 will solve the second problem. I am looking forward to a solution for the first problem.
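For completeness, the evaluation command from the report with that flag appended would look like this (assuming main.py exposes a --num_queries option, as the comment implies; I have not checked the repo's argument parser):

    python main.py -m dab_deformable_detr --resume ../fisheye-train/output/checkpoint.pth --two_stage --eval --save_results --num_queries 100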