SamsungLabs / fcaf3d

[ECCV2022] FCAF3D: Fully Convolutional Anchor-Free 3D Object Detection
MIT License

Some questions on the code (about the origin in class DepthInstance3DBoxes) #24

Closed wtt0213 closed 2 years ago

wtt0213 commented 2 years ago

Thanks for your great work. While reading the code, I got confused about something.

In the file mmdet3d/models/dense_heads/fcaf3d_neck_with_head.py, there is the function loss_single for training and get_box_single for evaluation. I just want to know whether the origin of the box predicted by the network is (0.5, 0.5, 0) or (0.5, 0.5, 0.5). When the box loss is computed in the function '_loss_single', the gt_box is converted to a box with origin (0.5, 0.5, 0.5), while the box predicted by the network is left unchanged (after _bbox_pred_to_bbox), and then loss_bbox is computed. So we can conclude that the pred box's origin is (0.5, 0.5, 0.5). But when the network is evaluated, the predicted box in the function '_get_box_single' is converted to a box with origin (0.5, 0.5, 0.5), and then the boxes are evaluated by the function 'indoor_eval' against the gt_box, which was converted to a box with origin (0.5, 0.5, 0.5).

So I am confused by the code above. When I use the box for other tasks, I do not know whether to use the original box or to convert it to a box with origin (0.5, 0.5, 0.5).

filaPro commented 2 years ago

Hi @wtt0213 ,

Both the input of Fcaf3DNeckWithHead at the training step and its output at the test step are instances of DepthInstance3DBoxes. So for your tasks you just need to be familiar with this class. You can check its documentation here, including origin, corners, and gravity_center, to be sure where the origin is. Does this answer help?

wtt0213 commented 2 years ago

So what you mean is that the output of the network (pred_box) directly has origin (0.5, 0.5, 0), so that pred_box[:3] is the center of the bottom face? Or is pred_box[:3] just the center of the box?

Looking forward to your reply.

filaPro commented 2 years ago

bboxes here is an instance of DepthInstance3DBoxes. This means that it has self.tensor, with self.tensor[:, :3] containing the bottom centers, following the DepthInstance3DBoxes documentation.
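To make the difference between the two origins concrete, here is a minimal sketch in plain Python. It assumes an axis-aligned box stored as (x, y, z, dx, dy, dz); the helper names `bottom_to_gravity_center` and `gravity_to_bottom_center` are hypothetical and are not part of mmdetection3d. With origin (0.5, 0.5, 0) the first three values are the bottom-face center, and with origin (0.5, 0.5, 0.5) they are the gravity center, so the two differ only by half of the height along z.

```python
def bottom_to_gravity_center(box):
    """Convert a box from origin (0.5, 0.5, 0) to origin (0.5, 0.5, 0.5).

    `box` is assumed to be (x, y, z, dx, dy, dz), with z at the bottom face.
    """
    x, y, z, dx, dy, dz = box
    # Only z moves: lift the reference point by half of the box height.
    return (x, y, z + dz / 2, dx, dy, dz)


def gravity_to_bottom_center(box):
    """Inverse conversion: from origin (0.5, 0.5, 0.5) to origin (0.5, 0.5, 0)."""
    x, y, z, dx, dy, dz = box
    # Lower the reference point by half of the box height.
    return (x, y, z - dz / 2, dx, dy, dz)


# A box sitting on the floor (z = 0) with height 3.
bottom_box = (1.0, 2.0, 0.0, 2.0, 4.0, 3.0)
gravity_box = bottom_to_gravity_center(bottom_box)
```

So if another task expects the geometric center of the box, the first three values of self.tensor would need the half-height shift shown above, or the class's own gravity_center can be used instead.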

wtt0213 commented 2 years ago

But the box here is just a tensor, which does not match the gt_box here, which is a DepthInstance3DBoxes. Using these two types of box to compute box_loss is what confuses me.

filaPro commented 2 years ago

This is caused by the general convention of mmdetection3d, which uses BaseInstance3DBoxes for exchange between the dataloader, model, and evaluator. However, the model subparts and the losses take a Tensor as input.
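The flow discussed in this thread can be sketched roughly as follows. This is plain Python with lists instead of tensors, and `ToyDepthBoxes`, `to_loss_target`, and `wrap_prediction` are hypothetical stand-ins written for illustration, not the mmdetection3d classes or the FCAF3D code. The sketch assumes, as described above, that the box class stores bottom centers internally, that the loss compares raw predictions with gravity-center targets, and that raw predictions are wrapped with origin (0.5, 0.5, 0.5) at test time.

```python
class ToyDepthBoxes:
    """Hypothetical stand-in for a box container (not the real library class)."""

    def __init__(self, boxes, origin=(0.5, 0.5, 0.0)):
        # Always store boxes relative to the bottom-center origin (0.5, 0.5, 0),
        # shifting the input if it was given relative to another origin.
        self.tensor = []
        for x, y, z, dx, dy, dz in boxes:
            self.tensor.append([
                x + dx * (0.5 - origin[0]),
                y + dy * (0.5 - origin[1]),
                z + dz * (0.0 - origin[2]),
                dx, dy, dz,
            ])

    @property
    def gravity_center(self):
        # Geometric center: bottom center lifted by half of the height.
        return [[x, y, z + dz / 2] for x, y, z, dx, dy, dz in self.tensor]


def to_loss_target(gt_boxes):
    """Training side: unwrap the container into plain gravity-center rows."""
    return [c + row[3:] for c, row in zip(gt_boxes.gravity_center, gt_boxes.tensor)]


def wrap_prediction(raw_pred):
    """Test side: raw network output is gravity-centered, so say so when wrapping."""
    return ToyDepthBoxes(raw_pred, origin=(0.5, 0.5, 0.5))


gt = ToyDepthBoxes([[0.0, 0.0, 0.0, 2.0, 2.0, 4.0]])   # from the dataloader
target = to_loss_target(gt)                            # what the loss compares against
pred = wrap_prediction([[0.0, 0.0, 2.0, 2.0, 2.0, 4.0]])  # what the evaluator receives
```

Under these assumptions the raw prediction and the loss target share the gravity-center convention, while everything exchanged outside the head uses the container's bottom-center storage.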