Closed LZL-CS closed 2 years ago
Hi, in MinkLocMultimodal (https://github.com/jac99/MinkLocMultimodal) we do 'late fusion' of features computed from point clouds and RGB images. These are not aligned using extrinsic parameters: we process the raw data from the dataset (the point cloud and the image) with separate networks and fuse the resultant descriptors by summation. So if you want to use a late fusion approach, similar to MinkLocMultimodal, you don't need to transform coordinates using the extrinsic data. However, to use an 'early fusion' approach, e.g. to augment the 3D point cloud with RGB values from the camera, or an 'intermediary fusion' approach (augmenting the point cloud feature map with features computed from the RGB image), you'll need to align the readings from the different sensors using the extrinsics. This will create some problems, as the lidar has different coverage than the RGB camera.
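For what it's worth, the late fusion idea above can be sketched like this. The two branch "networks" here are hypothetical stand-ins (simple pooling plus a random projection), not the actual MinkLocMultimodal architecture; the point is only that each modality is processed independently and the descriptors are summed, with no extrinsic alignment anywhere:

```python
import numpy as np

rng = np.random.default_rng(0)
DESC_DIM = 256  # shared descriptor dimension, chosen arbitrarily

def cloud_branch(points: np.ndarray) -> np.ndarray:
    """Mock 3D branch: pool the Nx3 point cloud, project to DESC_DIM."""
    pooled = points.mean(axis=0)                       # (3,)
    w = rng.standard_normal((3, DESC_DIM))
    return pooled @ w                                   # (DESC_DIM,)

def image_branch(image: np.ndarray) -> np.ndarray:
    """Mock 2D branch: global-average-pool the HxWxC image, project."""
    pooled = image.mean(axis=(0, 1))                   # (C,)
    w = rng.standard_normal((pooled.shape[0], DESC_DIM))
    return pooled @ w

def late_fusion(points: np.ndarray, image: np.ndarray) -> np.ndarray:
    # Each modality runs through its own branch; the resulting
    # descriptors live in the same space and are fused by summation.
    return cloud_branch(points) + image_branch(image)

desc = late_fusion(rng.standard_normal((4096, 3)),
                   rng.standard_normal((240, 320, 3)))
print(desc.shape)  # (256,)
```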
Hi, I got it. Thanks a lot for your reply.
Hi, thanks for your great work! In your work, you used the 3D point clouds from Oxford RobotCar, which were pre-processed by the PointNetVLAD work, together with 2D camera images. These two kinds of data are captured by sensors mounted at different places on the car, and they can be related by the extrinsic data. If I want to use raw point clouds from a LiDAR for place recognition or related purposes, e.g. from the Oxford Radar RobotCar dataset (https://oxford-robotics-institute.github.io/radar-robotcar-dataset/), do I need to first transform the point cloud coordinates into the camera frame using the extrinsic data?
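For reference, the alignment step that early fusion would require (bringing raw lidar points into the camera frame with the extrinsics and projecting them onto the image) looks roughly like this. The extrinsic matrix, intrinsics, and image size below are made-up placeholders, not real Oxford Radar RobotCar calibration; note how points outside the camera's field of view get no RGB value, which is the coverage mismatch mentioned above:

```python
import numpy as np

# Assumed lidar->camera extrinsics (placeholder: small translation only).
T_cam_lidar = np.eye(4)
T_cam_lidar[:3, 3] = [0.0, -0.3, 0.1]

# Assumed pinhole intrinsics and image size (placeholders).
K = np.array([[400.0,   0.0, 320.0],
              [  0.0, 400.0, 240.0],
              [  0.0,   0.0,   1.0]])
W, H = 640, 480

def lidar_to_pixels(points_lidar: np.ndarray):
    """Transform Nx3 lidar points into the camera frame and project them.

    Returns pixel coordinates and a mask of points that actually land
    inside the image; the remaining points have no RGB value to fuse.
    """
    n = points_lidar.shape[0]
    homo = np.hstack([points_lidar, np.ones((n, 1))])   # Nx4 homogeneous
    cam = (T_cam_lidar @ homo.T).T[:, :3]               # camera frame
    in_front = cam[:, 2] > 0                            # behind camera: no pixel
    uv = (K @ cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]                         # perspective divide
    in_image = (in_front
                & (uv[:, 0] >= 0) & (uv[:, 0] < W)
                & (uv[:, 1] >= 0) & (uv[:, 1] < H))
    return uv, in_image

pts = np.array([[0.0, 0.3, 5.0],    # ahead of the camera -> projects into image
                [0.0, 0.3, -5.0]])  # behind the camera -> discarded
uv, mask = lidar_to_pixels(pts)
print(mask)  # [ True False]
```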