pangsu0613 / CLOCs

CLOCs: Camera-LiDAR Object Candidates Fusion for 3D Object Detection
MIT License

Bug in fusion process #36

Closed: Treemann closed this issue 3 years ago

Treemann commented 3 years ago

Hi Pang,

Thanks for your contribution~

I notice that there is a bug in your code: voxelnet.py L498-L501.

            box_2d_detector = np.zeros((200, 4))
            box_2d_detector[0:top_predictions.shape[0],:]=top_predictions[:,:4]
            box_2d_detector = top_predictions[:,:4]
            box_2d_scores = top_predictions[:,4].reshape(-1,1)

The third line rebinds "box_2d_detector" to the unpadded slice, so with your implementation its shape varies at each iteration instead of staying at 200 rows. Assigning values to "out_1" according to the coordinates in fusion.py L48-L50 is therefore logically wrong. However, because of the max-pooling operation along the box_2d axis, I think this bug has no effect on the results. I think we could replace the code above with:

            # Pad the detections to a fixed 200 rows before splitting boxes and scores
            box_2d_detection = np.zeros((200, 5))
            box_2d_detection[0:top_predictions.shape[0], :] = top_predictions[:, :5]
            box_2d_detector = box_2d_detection[:, :4]
            box_2d_scores = box_2d_detection[:, 4].reshape(-1, 1)

Or pass the shape to fusion_layer.forward()...
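
To make the difference concrete, here is a minimal standalone sketch of both behaviours. It assumes `top_predictions` is an (N, 5) array of `[x1, y1, x2, y2, score]` rows with N below 200. `pad_2d_detections`, `MAX_2D_DETECTIONS`, and `num_valid` are hypothetical names used only for illustration and are not part of the CLOCs code.

```python
import numpy as np

MAX_2D_DETECTIONS = 200  # padding size taken from the snippet above


def pad_2d_detections(top_predictions, max_boxes=MAX_2D_DETECTIONS):
    """Hypothetical helper: zero-pad (N, 5) detections to (max_boxes, 5).

    Returns the padded boxes, the padded scores, and the number of real
    detections, so that a caller could forward that count if it wanted to.
    """
    top_predictions = np.asarray(top_predictions, dtype=np.float64)
    num_valid = min(top_predictions.shape[0], max_boxes)
    padded = np.zeros((max_boxes, 5))
    padded[:num_valid, :] = top_predictions[:num_valid, :5]
    box_2d_detector = padded[:, :4]
    box_2d_scores = padded[:, 4].reshape(-1, 1)
    return box_2d_detector, box_2d_scores, num_valid


# Reported behaviour: the padded array is built, then the name is rebound
# to the unpadded slice, so the padding is silently discarded.
top_predictions = np.random.rand(7, 5)
legacy_boxes = np.zeros((MAX_2D_DETECTIONS, 4))
legacy_boxes[0:top_predictions.shape[0], :] = top_predictions[:, :4]
legacy_boxes = top_predictions[:, :4]
print(legacy_boxes.shape)  # (7, 4)

# Padded variant: the shapes stay fixed regardless of N.
boxes, scores, num_valid = pad_2d_detections(top_predictions)
print(boxes.shape, scores.shape, num_valid)  # (200, 4) (200, 1) 7
```

`num_valid` is the value one would hand on if the second option were chosen instead; how `fusion_layer.forward()` would accept it is left open here.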

pangsu0613 commented 3 years ago

Hello @Treemann, thank you for your interest in CLOCs and for pointing this out. This is legacy code: a while ago, 200 was set as the maximum number of 2D detections. You are right that the shape of box_2d_detector varies at each iteration, but it is smaller than 200, so originally we only filled in the actual 2D detections and left the rest empty. These empty entries do not have any impact on the fusion and also won't affect the indices in the original sparse input tensor. In any case, this issue currently does not affect the fusion. We will clean this up and fix it in the next update. Thank you very much for pointing this out.
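
The point that empty slots do not change the outcome of a max-pooling step can be checked with a small toy example. This is only a sketch under stated assumptions: the tensor layout (one row per 3D candidate, one column per 2D detection), the sentinel value `EMPTY`, and the function `pool_over_2d` are all hypothetical stand-ins, not the actual CLOCs fusion layer.

```python
import numpy as np

EMPTY = -1e9  # assumed fill value for unused slots; must be below any real score


def pool_over_2d(pair_scores):
    """Max-pool a (num_3d, num_2d) score grid along the 2D-detection axis."""
    return pair_scores.max(axis=1)


rng = np.random.default_rng(0)
num_3d, num_valid_2d, max_2d = 5, 7, 200

# Scores for the real (3D candidate, 2D detection) pairs only.
valid = rng.random((num_3d, num_valid_2d))

# Same scores placed in a fixed-width grid whose remaining columns are empty.
padded = np.full((num_3d, max_2d), EMPTY)
padded[:, :num_valid_2d] = valid

# The pooled result is identical, because an empty slot can never win the max.
assert np.array_equal(pool_over_2d(valid), pool_over_2d(padded))
```

The equality only holds while every real score is larger than the fill value, which is why the sentinel is chosen far below the valid range in this sketch.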