Are the detection results corresponding to the pixels obtained after NMS processing of the pedestrian occupancy map equivalent to the confidence of the anchor-based bounding box in the target detection task?

hou-yz / MVDet

[ECCV 2020] Codes and MultiviewX dataset for "Multiview Detection with Feature Perspective Transformation".

https://hou-yz.github.io/publication/2020-eccv2020-mvdet


Are the detection results corresponding to the pixels obtained after NMS processing of the pedestrian occupancy map equivalent to the confidence of the anchor-based bounding box in the target detection task? #17

sjchhhhh closed this issue 4 months ago

sjchhhhh commented 5 months ago

Hello author, thank you very much for your work, which has been a great inspiration to me. I have a question: are the detection results at the pixels that remain after NMS on the pedestrian occupancy map equivalent to the confidence of an anchor-based bounding box in an object detection task? I want to optimize inference based on this work, and I wonder whether the average of these probability values can be used as a performance indicator for inference results without ground truth. Looking forward to your reply, and thanks again!

hou-yz commented 5 months ago

Thank you for your interest. The txt results are ground-plane coordinates, and you can retrieve the corresponding 2D bbox for each camera view from the rectangles.pom file.

If I recall correctly, all_res.txt should be the network output prior to NMS, and its 4th column is the confidence. The test.txt file should be the final output after NMS, and it does not include the confidence.
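As a rough illustration of the averaging idea from the question, the sketch below reads rows in the layout described above and averages the 4th column per frame. It assumes whitespace-separated rows of the form frame, x, y, confidence; only the 4th column being the confidence comes from this thread, and the first three columns are an assumption. The function name `mean_confidence_per_frame` and the sample rows are hypothetical. Whether this average tracks actual detection quality without ground truth is not established here.

```python
from collections import defaultdict
from statistics import mean


def mean_confidence_per_frame(lines):
    """Average the 4th column (assumed to be confidence) per frame id (assumed 1st column)."""
    scores = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) < 4:
            continue  # skip blank or malformed rows
        # int(float(...)) also accepts values written in scientific notation
        frame = int(float(parts[0]))
        scores[frame].append(float(parts[3]))
    return {frame: mean(vals) for frame, vals in scores.items()}


# Hypothetical rows in the assumed layout: frame, x, y, confidence
sample = ["0 100 200 0.75", "0 120 240 0.25", "5 300 400 0.8"]
print(mean_confidence_per_frame(sample))
```

To run it on a real file, pass the open file object instead of the sample list, for example `with open("all_res.txt") as f: mean_confidence_per_frame(f)`.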

Also, feel free to check out the MVDeTr repo, which should have an updated NMS function that does not need disk I/O.
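For readers who want the confidence to survive NMS, here is a minimal in-memory sketch of greedy distance-based suppression on ground-plane points. This is a generic illustration, not the NMS function from MVDet or MVDeTr; the name `greedy_distance_nms`, its parameters, and the example values are all hypothetical.

```python
import math


def greedy_distance_nms(points, scores, dist_thres, top_k=None):
    """Keep the highest-scoring points that are farther than dist_thres from every kept point.

    Returns a list of (point, score) pairs, so the confidence is preserved.
    """
    # Visit candidates from highest to lowest score
    order = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if top_k is not None and len(keep) >= top_k:
            break
        # Suppress a candidate that lies within dist_thres of an already kept point
        if all(math.dist(points[i], points[j]) > dist_thres for j in keep):
            keep.append(i)
    return [(points[i], scores[i]) for i in keep]


# Hypothetical ground-plane points and scores
pts = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
conf = [0.9, 0.8, 0.7]
print(greedy_distance_nms(pts, conf, dist_thres=5.0))
```

Because everything stays in Python lists, no intermediate files are written, and each surviving detection carries its score alongside its coordinates.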