BUPT-PRIV / LOAF


【Bug】Evaluation error on images without humans / error when saving results #4

Open Terry-Zheng opened 4 months ago

Terry-Zheng commented 4 months ago

Problem overview

Hi, I trained for a few epochs and then tried to run evaluation, and ran into a couple of problems:

  1. Images with empty annotations (i.e., no pedestrians) trigger a bug:
    python main.py -m dab_deformable_detr --resume ../fisheye-train/output/checkpoint.pth --two_stage --eval
  2. After trying to work around the bug above, saving the results also fails:
    python main.py -m dab_deformable_detr --resume ../fisheye-train/output/checkpoint.pth --two_stage --eval --save_results

Error reports

  1. Images without people

    Original Traceback (most recent call last):
      File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
        data = fetcher.fetch(index)
      File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "LOAF/fisheye-eval/datasets/loaf.py", line 48, in __getitem__
        img, target = self._transforms(img, target)
      File "LOAF/fisheye-eval/datasets/transform_loaf.py", line 198, in __call__
        image, target = t(image, target)
      File "LOAF/fisheye-eval/datasets/transform_loaf.py", line 198, in __call__
        image, target = t(image, target)
      File "LOAF/fisheye-eval/datasets/transform_loaf.py", line 170, in __call__
        return F.to_tensor(img), target
      File "/opt/conda/lib/python3.8/site-packages/torchvision/transforms/functional.py", line 137, in to_tensor
        raise TypeError(f"pic should be PIL Image or ndarray. Got {type(pic)}")
    TypeError: pic should be PIL Image or ndarray. Got <class 'NoneType'>

  2. Saving results

    Traceback (most recent call last):
      File "main.py", line 416, in <module>
        main(args)
      File "main.py", line 322, in main
        test_stats, coco_evaluator = evaluate(model, criterion, postprocessors,
      File "/opt/conda/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
        return func(*args, **kwargs)
      File "LOAF/fisheye-eval/engine.py", line 241, in evaluate
        res_info = torch.cat((_res_bbox, _res_prob.unsqueeze(-1), _res_label.unsqueeze(-1)), 1)
    RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 300 but got size 100 for tensor number 1 in the list.

Cause analysis

  1. On inspection, this happens because some images in the LOAF dataset simply contain no pedestrians (e.g., id: 1164, 0062_01065.jpg).

    For these images the annotation the code receives is empty, so the torch.stack call inside the try block raises; the except branch catches the exception and returns img=None, target=None https://github.com/BUPT-PRIV/LOAF/blob/a5c435eb013f0729c64824477b62649f8325a5a7/fisheye-eval/datasets/loaf.py#L75-L86

    self.prepare therefore returns only Nones, and the subsequent transform step fails when it receives them https://github.com/BUPT-PRIV/LOAF/blob/a5c435eb013f0729c64824477b62649f8325a5a7/fisheye-eval/datasets/loaf.py#L46-L48

  2. The dimensions of the model's predicted boxes (pred_boxes) do not match the dimensions after post-processing, as shown in the two code regions below https://github.com/BUPT-PRIV/LOAF/blob/a5c435eb013f0729c64824477b62649f8325a5a7/fisheye-eval/engine.py#L218-L226 https://github.com/BUPT-PRIV/LOAF/blob/a5c435eb013f0729c64824477b62649f8325a5a7/fisheye-eval/engine.py#L237-L240

    • outbbox holds the model's raw output and, on inspection, has shape (300, 4), whereas _res_prob and _res_label in the second region come from the post-processing step and have shape (100, 1); the dimensions do not match, so the tensors cannot be concatenated (a sketch of one possible way to align them follows this list).
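For reference, here is a hedged sketch of one way the saved tensors could be made to line up. I have not checked the repo's engine.py beyond the lines linked above; the sketch assumes the post-processor returns per-image dicts with 'boxes', 'scores' and 'labels' that all come from the same top-k selection (as in standard Deformable-DETR-style PostProcess code), so that using the post-processed boxes instead of the raw 300-query outbbox keeps the first dimensions equal. The helper name build_res_info is hypothetical.

    # Hypothetical helper (not the repo's code): build the per-image result tensor
    # from a single post-processed dict instead of mixing the raw outbbox
    # (300 queries) with the top-k scores/labels (100 entries).
    import torch

    def build_res_info(result: dict) -> torch.Tensor:
        boxes = result["boxes"]             # (K, 4) post-processed boxes
        scores = result["scores"]           # (K,)
        labels = result["labels"].float()   # (K,)
        # All three tensors come from the same top-k selection, so dim 0 matches
        # and the concatenation that previously failed (300 vs 100) succeeds.
        return torch.cat((boxes, scores.unsqueeze(-1), labels.unsqueeze(-1)), dim=1)  # (K, 6)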

Attempted fixes

  1. At first I tried to modify the code so that, for an empty image, the sub-items of the returned target are empty tensors.

    1. I added a check before torch.stack: if boxes is empty, substitute an empty tensor and continue with the rest of the processing:
        try:
            if len(boxes):
                # normal case: stack the per-instance box tensors into shape (N, 4)
                boxes = torch.stack(boxes, dim=0)
            else:
                # empty image: torch.tensor([]) yields a 1-D tensor of shape (0,)
                boxes = torch.tensor(boxes)
    2. Then I changed the Normalize step of the preprocessing; since boxes may now be an empty list, I replaced the code below with boxes /= w to avoid the error https://github.com/BUPT-PRIV/LOAF/blob/a5c435eb013f0729c64824477b62649f8325a5a7/fisheye-eval/datasets/transform_loaf.py#L187
    3. But even after these changes, the HungarianMatcher step fails: torch.cdist raises because tgt_bbox is an empty tensor (see the sketch after this list) https://github.com/BUPT-PRIV/LOAF/blob/a5c435eb013f0729c64824477b62649f8325a5a7/fisheye-eval/models/dab_deformable_detr/matcher.py#L87-L88

    4. At this point, since I am not very familiar with DAB-DETR, I am not sure whether modifying further would stray from the original design.

    • I later also tried, on top of the changes above, simply skipping these empty images during evaluation. That produces results, but it clearly does not match the intended behaviour.
  2. For the second problem, I am likewise not familiar enough with DAB-DETR to know how to adjust things so that the dimensions before and after post-processing stay consistent.
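For the first problem, below is a minimal, self-contained sketch (my own, not the repo's code) of the direction attempted above: keep the empty boxes tensor 2-D with shape (0, 4) rather than the 1-D shape (0,) that torch.tensor([]) produces. With that shape, the cdist call and the Hungarian assignment tolerate an image with zero targets in recent PyTorch/SciPy versions; whether the rest of the DAB-DETR loss code also copes with zero targets is something I have not verified.

    import torch
    from scipy.optimize import linear_sum_assignment

    # Empty image: keep the target boxes 2-D with shape (0, 4) instead of
    # torch.tensor([]), whose shape (0,) breaks torch.cdist.
    tgt_boxes = torch.zeros((0, 4), dtype=torch.float32)
    pred_boxes = torch.rand(300, 4)                        # e.g. 300 query predictions

    cost_bbox = torch.cdist(pred_boxes, tgt_boxes, p=1)    # shape (300, 0), no error
    rows, cols = linear_sum_assignment(cost_bbox.numpy())  # both index arrays are empty
    print(cost_bbox.shape, len(rows), len(cols))           # torch.Size([300, 0]) 0 0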

Expectations

Running plain eval only gives the results below.

I only trained for a few epochs, so the numbers are lower than in the paper:

IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.431
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.779
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.434
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.428
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.649
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.500
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.075
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.441
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.555
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.428
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.657
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.664

ompugao commented 3 months ago

--num_queries 100 will solve the second problem. I am looking forward to a solution for the first problem.
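For completeness, the evaluation command from the report with that flag appended would look like this (assuming main.py exposes a --num_queries option, as the comment implies; I have not checked the repo's argument parser):

    python main.py -m dab_deformable_detr --resume ../fisheye-train/output/checkpoint.pth --two_stage --eval --save_results --num_queries 100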