jac99 / MinkLocMultimodal

MinkLoc++: Lidar and Monocular Image Fusion for Place Recognition

RobotCar Seasons #1

Closed qiaozhijian closed 3 years ago

qiaozhijian commented 3 years ago

Hello~ Thanks for your work! How can I reproduce the results of Table 3 in your paper? Can you provide the script that finds LiDAR readings with corresponding timestamps in the original RobotCar dataset for each image in the RobotCar Seasons dataset?

jac99 commented 3 years ago

Hi, the evaluation on RobotCar Seasons took multiple steps using a combination of Python and Matlab scripts. I've just published some of these scripts in the robotcar_seasons_benchmark subfolder of the repo.

First, for each image in the RobotCar Seasons dataset we build a corresponding point cloud by finding matching LiDAR scans in the full Oxford RobotCar dataset based on timestamps. We build point clouds using the same procedure as in the PointNetVlad paper (merging scans from a 20 meter traversal, removing the ground plane and downsampling to 4096 points). You can download this dataset here. Each point cloud is named with a timestamp - the same timestamp as the corresponding RobotCar Seasons image.
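For reference, a minimal sketch of the timestamp matching idea (this is not our actual script; the `image_ts` / `lidar_ts` inputs and the `max_diff_us` tolerance are illustrative assumptions):

```python
import bisect

def nearest_lidar_timestamps(image_ts, lidar_ts, max_diff_us=500_000):
    # RobotCar timestamps are integer UNIX microseconds; max_diff_us is an
    # illustrative tolerance, not the value used in the original pipeline.
    lidar_ts = sorted(lidar_ts)
    matches = {}
    for ts in image_ts:
        i = bisect.bisect_left(lidar_ts, ts)
        # Compare the neighbours on both sides of the insertion point
        candidates = [lidar_ts[j] for j in (i - 1, i) if 0 <= j < len(lidar_ts)]
        best = min(candidates, key=lambda c: abs(c - ts))
        if abs(best - ts) <= max_diff_us:
            matches[ts] = best
    return matches
```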

To reproduce the results follow these steps:

  1. Download the dataset with point clouds corresponding to RobotCar Seasons images from the link above.
  2. Run image_pointcloud_path_mapping.py (in the robotcar_seasons_benchmark folder of this repo), which produces a pickle with a list of triplets: (RobotCar Seasons image timestamp, relative path to the image, relative path to the point cloud). These will be used later when computing global descriptors.
  3. Run code that reads the pickle produced in step 2 and computes a global descriptor for each element: load the point cloud and the corresponding image, push them through the trained model and save the embedding. The results should be pickled as a dictionary embeddings_d[timestamp] = embedding (a 256-dimensional float32 ndarray). Unfortunately I cannot find our code for this step (see the sketch after this list). But at the link given above you can find a pickle with global descriptors computed using our trained multimodal model: season_scan_embeddings_model_MinkLocMultimodal_20210205_1721.pickle. You can check what format it should have.
  4. Finally, run the script estimate_season_poses.py. It estimates poses of the query elements in RobotCar Seasons based on the global descriptors computed in step 3 and exports them in the format expected by the RobotCar Seasons submission website. season_scan_embeddings_model_MinkLocMultimodal_20210205_1721.txt contains the output of this step for our global descriptors.
  5. You can submit the file produced in step 4 to the RobotCar Seasons evaluation website.
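Since our original code for step 3 is lost, here is a minimal sketch of what it roughly did. The load_image / load_pointcloud helpers and the model(cloud, image) call signature are assumptions standing in for this repo's actual data loading and model interface:

```python
import pickle
import numpy as np
import torch

def compute_embeddings(model, triplets, load_image, load_pointcloud, device="cuda"):
    # triplets: list of (timestamp, image_path, cloud_path) from step 2.
    # load_image / load_pointcloud are hypothetical placeholders for the
    # repo's own preprocessing; the model is assumed to map one batched
    # (cloud, image) pair to a 256-D global descriptor.
    model = model.to(device).eval()
    embeddings_d = {}
    with torch.no_grad():
        for ts, image_path, cloud_path in triplets:
            image = load_image(image_path).to(device)       # e.g. (1, 3, H, W)
            cloud = load_pointcloud(cloud_path).to(device)  # e.g. (1, 4096, 3)
            embedding = model(cloud, image)                 # assumed call signature
            embeddings_d[ts] = embedding.squeeze(0).cpu().numpy().astype(np.float32)
    return embeddings_d

# Save in the same dictionary format as the published pickle:
# embeddings_d = compute_embeddings(model, triplets, load_image, load_pointcloud)
# with open("my_embeddings.pickle", "wb") as f:
#     pickle.dump(embeddings_d, f)
```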

Some of these scripts will print a warning that around 40 point clouds are missing - but that's expected, as we were not able to reconstruct point clouds for about 40 images in the RobotCar Seasons dataset.

qiaozhijian commented 3 years ago

Thank you for the detailed comment, it helps me a lot. But why can't a corresponding point cloud be found for those 40 images? Is the data not provided in the Oxford RobotCar dataset, or could you not find enough point clouds based on the timestamps? On the other hand, I saw that you assign a random pose to these 40 images. Maybe you could use your RGB-only model to estimate one instead, to make the result a little better. I don't know if that is an appropriate thing to do.

jac99 commented 3 years ago

The point clouds used by our model are not raw point clouds from the RobotCar dataset. These point clouds (the same as the point clouds used in the PointNetVlad and LPD-Net methods) are constructed by merging multiple scans from the 2D LiDAR (from the RobotCar dataset) during 20 meter traversals and pre-processing them. For this purpose I modified the Matlab code provided by the authors of the PointNetVlad method. Sometimes this 'merging' process fails, e.g. due to inaccurate GPS information it's not possible to find sufficiently close 2D LiDAR scans to merge into one point cloud. But I haven't investigated the root cause of these failures, so I'm not able to tell what exactly went wrong.
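To make this concrete, a rough sketch of the merging idea (the real pipeline is the modified PointNetVlad Matlab code; the scans / poses inputs, the ground-removal threshold and the random downsampling below are simplifying assumptions):

```python
import numpy as np

def build_submap(scans, poses, start_idx, traversal_len=20.0, n_points=4096):
    # scans: list of (N_i, 3) point arrays in the sensor frame;
    # poses: matching list of 4x4 global SE(3) poses (both hypothetical
    # inputs standing in for the 2D LiDAR data and the INS/GPS poses).
    merged, dist = [], 0.0
    prev_xy = poses[start_idx][:2, 3]
    i = start_idx
    while i < len(scans) and dist < traversal_len:
        R, t = poses[i][:3, :3], poses[i][:3, 3]
        merged.append(scans[i] @ R.T + t)  # transform scan into the global frame
        dist += np.linalg.norm(poses[i][:2, 3] - prev_xy)
        prev_xy = poses[i][:2, 3]
        i += 1
    cloud = np.concatenate(merged, axis=0)
    # Crude ground removal: drop points near the lowest z value
    # (an illustrative stand-in for the plane fitting in the real code)
    cloud = cloud[cloud[:, 2] > cloud[:, 2].min() + 0.3]
    # Downsample to the fixed size expected by the network
    idx = np.random.choice(len(cloud), n_points, replace=len(cloud) < n_points)
    return cloud[idx]
```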

Using the RGB-only model for cases without point clouds is a good idea and could improve the results. We took the most simplistic approach and made a random guess.

qiaozhijian commented 3 years ago

Got it. Thanks!