j96w / DenseFusion

"DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion" code repository
https://sites.google.com/view/densefusion
MIT License

Code problem #162

Open zuoligang1997 opened 4 years ago

zuoligang1997 commented 4 years ago

First question: I don't understand why `points` is added in the line below:

```
pred = torch.add(torch.bmm(model_points, base), points + pred_t)
```

The second question is about the iterative refinement. Why is the following code written this way?

```
ori_base = ori_base[which_max[0]].view(1, 3, 3).contiguous()
ori_t = t.repeat(bs * num_p, 1).contiguous().view(1, bs * num_p, 3)
new_points = torch.bmm((points - ori_t), ori_base).contiguous()

new_target = ori_target[0].view(1, num_point_mesh, 3).contiguous()
ori_t = t.repeat(num_point_mesh, 1).contiguous().view(1, num_point_mesh, 3)
new_target = torch.bmm((new_target - ori_t), ori_base).contiguous()
```

Third, the paper says the pose is refined iteratively, step by step, but I don't see this implemented in the code. Am I missing something? [image]

j96w commented 4 years ago

Hi,

1). `points` is added to `pred_t` because our prediction of the object centroid is an offset translation. Since we perform per-point pose estimation, the translation predicted from each input point is an offset that starts from the original 3D location of that point. This is what makes the results from the individual input points differ, and the network is trained without supervision to choose which of them is the best prediction.
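A minimal NumPy sketch of this per-point offset idea may help. It is not the repository code: the array names mirror those in the question, but the data, the sizes, and the identity rotation `R` are made up for illustration, and `pred_t` is set to the ideal offsets rather than produced by a network.

```python
import numpy as np

rng = np.random.default_rng(0)
num_p, num_point_mesh = 4, 5              # hypothetical sizes

true_t = np.array([0.1, -0.2, 0.8])       # made-up object translation
# Observed input points scattered around the object.
points = true_t + 0.05 * rng.standard_normal((num_p, 3))

# Ideal per-point offsets: each one starts at its own input point
# and ends at the object origin (a network would only approximate this).
pred_t = true_t - points

# Adding the point location turns every offset into an absolute translation.
votes = points + pred_t                   # shape (num_p, 3)

# One candidate placement of the model per input point,
# analogous to bmm(model_points, base) + (points + pred_t).
model_points = rng.standard_normal((num_point_mesh, 3))
R = np.eye(3)                             # assumed rotation, identity for brevity
pred = model_points[None] @ R.T + votes[:, None, :]   # (num_p, num_point_mesh, 3)
```

With ideal offsets every row of `votes` equals `true_t`; with real predictions the rows differ, which is why a per-point selection step is needed.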

2). The main idea behind the iterative refinement is this: in each iteration, after obtaining the predicted 6D pose, we invert the pose and apply it directly to the input point cloud, which then serves as the input to the next iteration. The lines you quoted perform this inverse transformation.
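As a rough illustration of that inverse transformation, the sketch below uses NumPy with row-vector points. It assumes the forward pose is `x @ R.T + t`, so the inverse is `(y - t) @ R`, which has the same form as `torch.bmm((points - ori_t), ori_base)`. The helper `rot_z` and all values are hypothetical.

```python
import numpy as np

def rot_z(theta):
    """Rotation matrix for an angle theta (radians) about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

rng = np.random.default_rng(0)
model_points = rng.standard_normal((6, 3))   # points in the object frame

# A made-up pose standing in for the prediction of the current iteration.
R = rot_z(0.7)
t = np.array([0.1, -0.2, 0.8])

# Forward pose: object frame -> camera frame (row vectors).
points = model_points @ R.T + t

# Inverse pose: subtract the translation, then undo the rotation.
new_points = (points - t) @ R
```

If the predicted pose were exact, `new_points` would coincide with `model_points`; whatever misalignment remains is what the next iteration has to estimate.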

3). This process is implemented as a for-loop in our code. Please refer to either DenseFusion/tools/eval_ycb.py or DenseFusion/tools/eval_linemod.py, where `my_pred` is iteratively updated in each refinement step.
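The loop structure can be sketched as follows. This is a toy, not the evaluation script: `toy_refiner` is a hypothetical stand-in that is handed the true residual pose and returns only half of it, where a real refiner would have to estimate the residual from the transformed cloud. Poses are restricted to z-axis rotations to keep the example short, and `iteration = 10` is an arbitrary choice.

```python
import numpy as np

def pose(theta, t):
    """4x4 homogeneous transform: rotation by theta about z, plus translation t."""
    c, s = np.cos(theta), np.sin(theta)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    T[:3, 3] = t
    return T

def toy_refiner(residual):
    """Hypothetical stand-in for the refinement network (an oracle, not learned):
    returns half of the residual rotation angle and half of its translation."""
    theta = np.arctan2(residual[1, 0], residual[0, 0])
    return pose(0.5 * theta, 0.5 * residual[:3, 3])

def alignment_error(pred, cloud, model):
    """Mean distance between the model and the cloud mapped back by inverse(pred)."""
    new_cloud = (cloud - pred[:3, 3]) @ pred[:3, :3]
    return np.linalg.norm(new_cloud - model, axis=1).mean()

true_pose = pose(0.6, [0.10, -0.20, 0.80])   # made-up ground truth
my_pred = pose(0.2, [0.00, 0.00, 0.70])      # made-up initial estimate

model = np.random.default_rng(0).standard_normal((50, 3))
cloud = model @ true_pose[:3, :3].T + true_pose[:3, 3]

iteration = 10
errors = [alignment_error(my_pred, cloud, model)]
for _ in range(iteration):
    # Residual pose still missing from the current estimate (oracle input).
    residual = np.linalg.inv(my_pred) @ true_pose
    delta = toy_refiner(residual)
    # Compose the correction with the running estimate.
    my_pred = my_pred @ delta
    errors.append(alignment_error(my_pred, cloud, model))
```

Each pass composes a small correction onto `my_pred`, so in this toy setup the alignment error recorded in `errors` ends up far below its starting value.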