happinesslz / EPNet

EPNet: Enhancing Point Features with Image Semantics for 3D Object Detection(ECCV 2020)
MIT License

some ques #11

Closed by huixiancheng 3 years ago

huixiancheng commented 3 years ago

Thank you very much for open-sourcing the code. I have a few questions:

1. Why is it important to ensure that the number of sampled points equals the number of points that can be projected onto the RGB image? https://github.com/happinesslz/EPNet/blob/0123c341243846aa3b412addcb9e2c07fd305237/lib/datasets/kitti_rcnn_dataset.py#L326-L350 It seems to be the same setting as PointRCNN. Does this mean that, in the whole 3D scene, only the point cloud within the roughly 90-degree front view is used?
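For context, the PointRCNN-style front-view filter amounts to projecting each lidar point into the image and keeping only points whose pixel coordinates land inside the image bounds with positive depth. A minimal NumPy sketch (the function name `get_fov_flag` and the exact calibration handling are assumptions, not the repo's actual helper):

```python
import numpy as np

def get_fov_flag(pts_rect, P2, img_shape):
    """Keep only points that project inside the image (front-view FOV).

    pts_rect : (N, 3) points in the rectified camera frame
    P2       : (3, 4) camera projection matrix
    img_shape: (H, W) image height and width
    Returns a boolean mask and the (N, 2) pixel coordinates.
    """
    n = pts_rect.shape[0]
    pts_hom = np.hstack([pts_rect, np.ones((n, 1))])  # (N, 4) homogeneous
    pts_2d = pts_hom @ P2.T                           # (N, 3)
    depth = pts_2d[:, 2]
    uv = pts_2d[:, :2] / depth[:, None]               # perspective divide
    H, W = img_shape
    flag = ((uv[:, 0] >= 0) & (uv[:, 0] < W) &
            (uv[:, 1] >= 0) & (uv[:, 1] < H) &
            (depth > 0))                              # in front of camera
    return flag, uv
```

Points behind the camera or outside the image rectangle get `flag == False`, so sampling only flagged points makes the sampled count equal to the projectable count by construction.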

2. How should this part of the code be understood? https://github.com/happinesslz/EPNet/blob/0123c341243846aa3b412addcb9e2c07fd305237/lib/net/pointnet2_msg.py#L214-L226 My understanding is that li_index is the index of the points each SA layer samples from the 16384 input points, and l_xy_cor is the (H, W) coordinate of each point projected onto the RGB image. torch.gather then establishes the correspondence between the 4096/1024/256/64 points (after SA sampling) and the image convolution feature map, and Feature_Gather fuses the lidar and image features. What should the dimensions of li_index and l_xy_cor look like? Taking the first SA module as an example, the dimension of li_index before torch.gather is [1, 4096, 2], but l_xy_cor looks like [1, 16384, 2].

3. Is it necessary to normalize xy before LI_fusion?

happinesslz commented 3 years ago

@huixiancheng Thanks for your attention!

  1. The setting is the same as in PointRCNN: only the points within the range of PC_AREA_SCOPE are considered.

  2. l_xy_cor is a list. In fact, we only fuse the lidar and camera features at the SA stages with 4096/1024/256/64 sampled points.

  3. Yes, we normalize before the feature gather operation; please refer to the code that normalizes xy to [-1, 1].
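The normalization exists because `F.grid_sample`, which underlies this kind of feature gather, expects sampling coordinates in [-1, 1]. A sketch of the idea (a hypothetical `feature_gather` helper, not the repo's exact implementation; the `align_corners=True` choice is an assumption):

```python
import torch
import torch.nn.functional as F

def feature_gather(img_feats, xy, img_size):
    """Sample image features at projected point locations.

    img_feats: (B, C, H, W) image feature map
    xy       : (B, N, 2) pixel coordinates on the original image
    img_size : (W, H) original image width and height
    Returns (B, C, N) per-point image features.
    """
    W, H = img_size
    # Normalize pixel coordinates to [-1, 1], as grid_sample expects.
    xy_norm = xy.clone()
    xy_norm[..., 0] = xy[..., 0] / (W - 1) * 2.0 - 1.0
    xy_norm[..., 1] = xy[..., 1] / (H - 1) * 2.0 - 1.0
    grid = xy_norm.unsqueeze(1)                        # (B, 1, N, 2)
    out = F.grid_sample(img_feats, grid,
                        align_corners=True)            # (B, C, 1, N)
    return out.squeeze(2)                              # (B, C, N)
```

With the coordinates normalized this way, a point projected to pixel (0, 0) samples the top-left feature and (W-1, H-1) the bottom-right, with bilinear interpolation in between.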

huixiancheng commented 3 years ago

Thanks for your reply :bow: