zjuluolun / BEVPlace

A LiDAR-based complete global localization method.

Incorporating Camera-Lidar fusion #13

Open Arjun191 opened 3 weeks ago

Arjun191 commented 3 weeks ago

Hello! Thank you for your work! Looks very exciting.

Is there potential for lidar-camera-based BEV localization? Is the work modular enough to enable such an extension?

Regards

zjuluolun commented 2 weeks ago

Hi, @Arjun191. I'm not sure whether you're asking about lidar-to-camera cross-localization or lidar-camera fusion-enhanced localization. For lidar-to-camera cross-localization, the easiest way to bring in BEVPlace++ for place recognition is to project the image features into BEV with popular PV2BEV networks such as LSS, FastBEV, or BEVFormer. For fine-grained 6-DoF lidar-camera pose estimation (or calibration), BEVPlace++ might give you a rough 3-DoF estimate, but it might not be the best fit due to the scale and depth ambiguity of BEV projections of monocular images.
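
For concreteness, here is a minimal PyTorch sketch of the lift-splat idea behind PV2BEV networks like LSS: a depth head predicts a per-pixel depth distribution, image features are lifted into 3D along the camera rays, and the lifted samples are splatted onto a BEV grid in the LiDAR frame. All module names, shapes, and grid parameters are illustrative assumptions, not code from this repo:

```python
import torch
import torch.nn as nn

class LiftSplat(nn.Module):
    """Lift image features into 3D with a predicted depth distribution,
    then splat them onto a BEV grid in the LiDAR frame (LSS-style sketch)."""

    def __init__(self, feat_dim=64, depth_bins=48, d_min=1.0, d_max=49.0):
        super().__init__()
        # a 1x1 conv head predicts a categorical depth distribution per pixel
        self.depth_head = nn.Conv2d(feat_dim, depth_bins, 1)
        self.register_buffer("depths", torch.linspace(d_min, d_max, depth_bins))

    def forward(self, feats, K, T_cam2lidar, bev_size=128, bev_res=0.5):
        # feats: (C, H, W) image features; K: (3, 3) intrinsics;
        # T_cam2lidar: (4, 4) camera-to-LiDAR extrinsic (known calibration)
        C, H, W = feats.shape
        D = self.depths.numel()
        prob = self.depth_head(feats.unsqueeze(0)).softmax(1)[0]    # (D, H, W)
        lifted = feats.unsqueeze(1) * prob.unsqueeze(0)             # (C, D, H, W)

        # back-project every (pixel, depth) sample to 3D camera coordinates
        v, u = torch.meshgrid(
            torch.arange(H, device=feats.device, dtype=feats.dtype),
            torch.arange(W, device=feats.device, dtype=feats.dtype),
            indexing="ij")
        pix = torch.stack([u, v, torch.ones_like(u)]).reshape(3, -1)
        rays = torch.linalg.inv(K) @ pix                            # (3, H*W)
        pts = rays.unsqueeze(0) * self.depths.view(D, 1, 1)         # (D, 3, H*W)

        # move the samples into the LiDAR frame using the extrinsics
        R, t = T_cam2lidar[:3, :3], T_cam2lidar[:3, 3:]
        pts = (R @ pts + t).permute(1, 0, 2).reshape(3, -1)         # (3, D*H*W)

        # splat: scatter-add lifted features into the BEV cells they land in
        half = bev_size * bev_res / 2
        ix = ((pts[0] + half) / bev_res).long()
        iy = ((pts[1] + half) / bev_res).long()
        ok = (ix >= 0) & (ix < bev_size) & (iy >= 0) & (iy < bev_size)
        bev = feats.new_zeros(C, bev_size * bev_size)
        bev.index_add_(1, (iy * bev_size + ix)[ok],
                       lifted.reshape(C, -1)[:, ok])
        return bev.view(C, bev_size, bev_size)  # camera-derived BEV features
```

The resulting camera BEV feature map could then be fed to a BEVPlace++-style place-recognition head in place of (or alongside) the LiDAR BEV input.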

On the other hand, lidar-camera fusion-enhanced localization, which combines PV features and LiDAR features, is generally a much easier problem. Here, the extrinsic calibration between the two sensors is usually given. You can fuse the features with mechanisms such as self- and cross-attention to boost localization accuracy, and then use BEVPlace++ to perform localization on the enhanced BEV features; a sketch of such a fusion module follows.
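
A minimal sketch of one such fusion scheme, assuming the camera features have already been projected into the same BEV grid as the LiDAR features (so the extrinsics are already consumed upstream); the module name and shapes are illustrative:

```python
import torch
import torch.nn as nn

class BEVCrossAttentionFusion(nn.Module):
    """LiDAR BEV cells attend to camera BEV cells; the attended camera
    features are added back to the LiDAR features with a residual."""

    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, lidar_bev, cam_bev):
        # lidar_bev, cam_bev: (B, C, H, W) BEV maps aligned on the same grid
        B, C, H, W = lidar_bev.shape
        q = lidar_bev.flatten(2).transpose(1, 2)   # (B, H*W, C) queries
        kv = cam_bev.flatten(2).transpose(1, 2)    # (B, H*W, C) keys/values
        fused, _ = self.attn(q, kv, kv)            # cross-attention
        fused = self.norm(q + fused)               # residual + layer norm
        return fused.transpose(1, 2).view(B, C, H, W)

# usage: the fused BEV map then goes to the place-recognition head
fusion = BEVCrossAttentionFusion(dim=64)
lidar_bev = torch.randn(2, 64, 32, 32)
cam_bev = torch.randn(2, 64, 32, 32)
out = fusion(lidar_bev, cam_bev)                   # (2, 64, 32, 32)
```

Note that dense attention over a full-resolution BEV grid is memory-hungry; in practice people often use windowed or deformable attention instead, but the fusion idea is the same.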

Arjun191 commented 2 weeks ago

I should have been clearer in my question. I was asking about lidar-camera fusion-enhanced localization.

Since your dataset already comes with prepared BEV images, I was wondering whether there are corresponding camera images with known extrinsic parameters that I could use.

What would the pipeline then look like?

zjuluolun commented 2 weeks ago

Yes, you can find the images and extrinsic parameters on the official websites of KITTI and NCLT. Just download the files from there and have fun!
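
For reference, a minimal sketch of reading a KITTI odometry `calib.txt` and using its extrinsics to project LiDAR points into the left color image, following the KITTI devkit conventions (`P2` is the camera-2 projection matrix and `Tr` is the velodyne-to-camera transform); the function names and paths here are illustrative:

```python
import numpy as np

def load_kitti_calib(path):
    """Parse KITTI odometry calib.txt into {key: 3x4 matrix}."""
    mats = {}
    with open(path) as f:
        for line in f:
            key, _, vals = line.partition(":")
            if vals.strip():
                mats[key.strip()] = np.array(
                    vals.split(), dtype=np.float64).reshape(3, 4)
    return mats

def project_lidar_to_image(points, calib):
    # points: (N, 3) LiDAR coordinates
    Tr = np.vstack([calib["Tr"], [0, 0, 0, 1]])   # 4x4 velodyne -> camera
    pts_h = np.hstack([points, np.ones((len(points), 1))])
    cam = calib["P2"] @ Tr @ pts_h.T              # (3, N) homogeneous pixels
    z = cam[2]
    uv = cam[:2] / z                              # perspective division
    front = z > 0                                 # keep points ahead of camera
    return uv.T[front], z[front]                  # pixel coords and depths

# usage with illustrative paths:
# calib = load_kitti_calib("sequences/00/calib.txt")
# scan = np.fromfile("sequences/00/velodyne/000000.bin",
#                    dtype=np.float32).reshape(-1, 4)[:, :3]
# uv, depth = project_lidar_to_image(scan, calib)
```

NCLT distributes its own calibration files in a different format, so the parsing step differs there, but the projection math is the same once you have the extrinsic matrix.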