OpenRobotLab / EmbodiedScan

[CVPR 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
https://tai-wang.github.io/embodiedscan/
Apache License 2.0

[Docs] Can acc@0.25 exceed 1 in 3DVG? #50

Closed iris0329 closed 2 months ago

iris0329 commented 2 months ago

Branch

main branch https://mmdetection3d.readthedocs.io/en/latest/

📚 The doc issue

  iou = top_bbox.overlaps(top_bbox, gt_bboxes)  # (num_query, 1)

  for t in self.iou_thr:
      threshold = iou > t
      found = int(threshold.any())
      if view_dep:
          gt["View-Dep@" + str(t)] += 1
          pred["View-Dep@" + str(t)] += found
      else:
          gt["View-Indep@" + str(t)] += 1
          pred["View-Indep@" + str(t)] += found
      if hard:
          gt["Hard@" + str(t)] += 1
          pred["Hard@" + str(t)] += found
      else:
          gt["Easy@" + str(t)] += 1
          pred["Easy@" + str(t)] += found
      if unique:
          gt["Unique@" + str(t)] += 1
          pred["Unique@" + str(t)] += found
      else:
          gt["Multi@" + str(t)] += 1
          pred["Multi@" + str(t)] += found

      gt["Overall@" + str(t)] += 1
      pred["Overall@" + str(t)] += found

header = ["Type"]
header.extend(object_types)
ret_dict = {}

for t in self.iou_thr:
    table_columns = [["results"]]
    for object_type in object_types:
        metric = object_type + "@" + str(t)
        value = pred[metric] / max(gt[metric], 1)
        ret_dict[metric] = value
        table_columns.append([f"{value:.4f}"])

    table_data = [header]
    table_rows = list(zip(*table_columns))
    table_data += table_rows
    table = AsciiTable(table_data)
    table.inner_footing_row_border = True
    print_log("\n" + table.table, logger=logger)

I printed the shapes of top_bbox and gt_bboxes:

 top_bbox.shape      torch.Size([10, 9])
 gt_bboxes.shape     torch.Size([1, 9])

From what I understand, each time gt is incremented by one, pred is incremented by found, which I assumed could be as large as num_query. If so, pred could grow much larger than gt, and value = pred[metric] / max(gt[metric], 1) could be greater than 1.

I look forward to your reply.

Tai-Wang commented 2 months ago

Note that found = int(threshold.any()) can only be 0 or 1, so each sample adds at most 1 to pred while always adding exactly 1 to gt. pred therefore never exceeds gt, and the accuracy is never greater than 1.
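To make the bound concrete, here is a minimal sketch of the same accumulation in plain Python. It is an illustration, not the repository's code: the function name grounding_accuracy and the sample IoU values are hypothetical, and a list of floats stands in for the (num_query, 1) IoU tensor, with the built-in any() playing the role of threshold.any().

```python
def grounding_accuracy(per_sample_ious, iou_thresholds=(0.25, 0.5)):
    """Accumulate acc@t like the quoted loop, assuming one gt box per sample."""
    gt = {t: 0 for t in iou_thresholds}
    pred = {t: 0 for t in iou_thresholds}
    for ious in per_sample_ious:  # one list of per-query IoUs per sample
        for t in iou_thresholds:
            # Stand-in for int((iou > t).any()): all queries collapse
            # into a single 0/1 flag, no matter how many exceed t.
            found = int(any(iou > t for iou in ious))
            gt[t] += 1
            pred[t] += found
    return {t: pred[t] / max(gt[t], 1) for t in iou_thresholds}


# Hypothetical IoU values: three samples with 10 queries each.
samples = [
    [0.9] * 10,         # every query exceeds both thresholds
    [0.3] + [0.0] * 9,  # one query exceeds 0.25 only
    [0.1] * 10,         # no query exceeds either threshold
]
acc = grounding_accuracy(samples)
print(acc)
```

In the first sample all ten queries pass the threshold, yet pred still grows by only 1, so the ratio stays within [0, 1].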

BTW, the current metric is only suitable for the current benchmark, where each data sample has only one gt box. When we have multiple boxes to ground in a data sample, we will adjust the metric accordingly. Please stay tuned for further updates.

iris0329 commented 2 months ago

@Tai-Wang, thank you for the detailed response. I just realized that the value of found can only be 0 or 1.