naver / mast3r

Grounding Image Matching in 3D with MASt3R

Visual Localization using other datasets for evaluation #20


Akola-Mbey-Denis commented 1 month ago

Thanks for your work.

Could you please provide a generic procedure for using dust3r with other visual localization datasets, such as the Extended CMU Seasons and RobotCar Seasons datasets? Specifically, I would like to understand how to restructure the data.

I have reviewed the information provided here (https://github.com/naver/dust3r/blob/main/dust3r_visloc/README.md), but it is not clear how you generated the contents of sub-directories such as 'mapping' in the Cambridge Landmarks dataset. Could you elaborate on that process?

Additionally, if you have any scripts for downloading the Cambridge Landmarks dataset in a more efficient manner, could you please share them with us?

Thanks for your assistance.

yocabon commented 1 month ago

I don't have a script to download Cambridge Landmarks, but I think you can find each of the 6 scenes at https://www.repository.cam.ac.uk/ (search for "Research data supporting PoseNet").

There are two types of datasets. For RGB-D datasets (7-Scenes, InLoc), we used the kapture versions of the datasets directly.

For SfM datasets (CambridgeLandmarks, AachenDayNight), we also used the kapture versions of the datasets: the kapture mapping subset was used with https://github.com/naver/kapture-localization/blob/main/pipeline/kapture_pipeline_mapping.py to get a COLMAP reconstruction (that COLMAP reconstruction is what you see under mapping in the documentation). That being said, you can obtain a COLMAP reconstruction however you like.

For AachenDayNight we used R2D2 40k keypoints (r2d2_WASF_N8_big) and, I think, the top50 fusion pairs (fusion with GHarm, gamma=0.5, of Resnet101-AP-GeM-LM18 https://github.com/naver/deep-image-retrieval, DELG r101 gldv2clean https://github.com/tensorflow/models/tree/master/research/delf/delf/python/delg, and OpenIBL vgg16_netvlad.pth https://github.com/yxgeee/OpenIBL), but any good pairsfile (HOW, FIRe, ...) would work fine; we just used the same map as https://www.visuallocalization.net/details/34883/.

For Cambridge Landmarks, I believe the COLMAP reconstruction was obtained with the same R2D2 model but with 20k keypoints, and AP-GeM-LM18_top50 pairs.

Once you have all the data, you create a class in https://github.com/naver/dust3r/blob/main/dust3r_visloc/datasets/. For SfM datasets, it should be very easy; see https://github.com/naver/dust3r/blob/main/dust3r_visloc/datasets/cambridge_landmarks.py
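Such a dataset class mostly needs to read the intrinsics and poses back from the COLMAP text reconstruction. Here is a rough, illustrative sketch of parsing COLMAP's images.txt (the helper name `parse_colmap_images` is made up here, not part of dust3r; the file layout is COLMAP's documented text format):

```python
from pathlib import Path

def parse_colmap_images(images_txt):
    """Parse COLMAP's images.txt into {image_name: (qvec, tvec, camera_id)}.

    images.txt alternates two lines per image:
      IMAGE_ID QW QX QY QZ TX TY TZ CAMERA_ID NAME
      <2D points line, ignored here>
    Lines starting with '#' are comments.
    """
    poses = {}
    lines = [l.strip() for l in Path(images_txt).read_text().splitlines()
             if l.strip() and not l.strip().startswith('#')]
    for header in lines[0::2]:  # every other remaining line is an image header
        fields = header.split()
        qvec = tuple(map(float, fields[1:5]))  # world-to-camera rotation (w, x, y, z)
        tvec = tuple(map(float, fields[5:8]))  # world-to-camera translation
        camera_id, name = int(fields[8]), fields[9]
        poses[name] = (qvec, tvec, camera_id)
    return poses
```

Note that COLMAP stores world-to-camera quaternions/translations, so you may need to invert them if the dataset class expects camera-to-world poses.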

Akola-Mbey-Denis commented 1 month ago

Thanks, @yocabon. That was really helpful.

I just wanted to further clarify some things regarding the Kapture folder structure. It seems ambiguous at the moment. Can you clarify which one is the correct folder structure? I see both this one and this one.

Also, regarding creating both local and global features: do we run this script and this script on the contents of the mapping, query, and map_plus_query subfolders individually? If not, on which content? If a particular dataset would provide better context, consider the Cambridge Landmarks dataset.

yocabon commented 1 month ago

In kapture, we made extensive use of symlinks, so while these examples look different, they are in fact the same. Let me explain:

https://github.com/naver/kapture?tab=readme-ov-file#3-example-file-structure shows the structure of a single kapture directory (there are multiple kapture directories for a single dataset: the mapping subset, the query subset, and a merge of the two for the full dataset; we use symlinks to avoid duplicating the data).
Usually, sensors/records_data, reconstruction/keypoints, reconstruction/descriptors, reconstruction/global_features, and reconstruction/matches are just symlinks.
reconstruction/ can be empty.

https://github.com/naver/kapture-localization/blob/main/doc/tutorial.adoc#recommended-dataset-structure is the recommended file structure for organizing the dataset. We would then make "proxy kaptures", which are temporary versions with symlinks to the local features, global features, matches, etc., and run R2D2, AP-GeM, and mapping on those.
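To make the "proxy kapture" idea concrete, here is an illustrative sketch (the function name and exact paths are mine, simplified from the recommended layout; the real pipeline scripts handle this for you):

```python
import os
from pathlib import Path

def make_proxy_kapture(dataset_root, proxy_root):
    """Create a temporary kapture directory that symlinks shared data.

    dataset_root holds the real data (images, precomputed features);
    the proxy owns only its own metadata, everything heavy is linked.
    Paths are a simplified sketch of the recommended kapture layout.
    """
    root, proxy = Path(dataset_root), Path(proxy_root)
    (proxy / 'sensors').mkdir(parents=True, exist_ok=True)
    (proxy / 'reconstruction').mkdir(exist_ok=True)
    # image files are shared between subsets -> symlink, don't copy
    os.symlink(root / 'sensors' / 'records_data',
               proxy / 'sensors' / 'records_data')
    # precomputed features are shared too, when they exist
    for part in ('keypoints', 'descriptors', 'global_features', 'matches'):
        src = root / 'reconstruction' / part
        if src.exists():
            os.symlink(src, proxy / 'reconstruction' / part)
    return proxy
```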

I realized that we have an example for Cambridge here: https://github.com/naver/kapture-localization/blob/main/pipeline/examples/run_cambridge.sh (it uses RETRIEVAL_TOPK=20 instead of 50).

Once done, you can make a Cambridge_Dust3r directory with symlinks to get this folder structure: https://github.com/naver/dust3r/blob/main/dust3r_visloc/README.md#cambridgelandmarks. The pairsfile / colmap parts should be saved somewhere in the outputs of these scripts.

Akola-Mbey-Denis commented 1 month ago

Thank you for your reply.

I have been trying to reproduce your results in Tab. 1 for the Cambridge Landmarks dataset. My results do not match those in the main paper.

Could you please clarify the following:

Cambridge_Landmarks
├─ mapping  (is this created from the colmap run on the query subset, the mapping subset, or mapping_plus_query?)
│   ├─ GreatCourt
│   │  └─ colmap/reconstruction
│   │     ├─ cameras.txt
│   │     ├─ images.txt
│   │     └─ points3D.txt
├─ kapture
│   ├─ GreatCourt
│   │  └─ query  # https://github.com/naver/kapture/blob/main/doc/datasets.adoc#cambridge-landmarks  (Is this the query images or kapture folder for the query subset?)
│   ... 
├─ GreatCourt 
│   ├─ pairsfile/query  (is this created from the colmap run on the query subset or the mapping subset?)
│   │     └─ AP-GeM-LM18_top50.txt  # https://github.com/naver/deep-image-retrieval/blob/master/dirtorch/extract_kapture.py followed by https://github.com/naver/kapture-localization/blob/main/tools/kapture_compute_image_pairs.py
│   ├─ seq1
│   ...
...

Also, regarding the folder structure above (which I copied from the visloc README): can you clarify whether you used top 20 (as written in the main paper) or top 50 (as in the README reference: https://github.com/naver/dust3r/tree/9869e71f9165aa53c53ec0979cea1122a569ade4/dust3r_visloc)?

I appreciate your clarification.

yocabon commented 1 month ago

mapping/GreatCourt/colmap/reconstruction is the reconstruction obtained from the mapping subset only (with pairs computed between the mapping subset and itself).

> can you clarify if you used top 20 (as written in the main paper) or top 50 (as in the README reference)?

First, I made a mistake when I wrote the visloc command for CambridgeLandmarks (I probably copied 7-Scenes, so it was doing top1). The file that contains the pairs was written for top50, but we only use the top20 subset (pairsfile='APGeM-LM18_top50', topk=20).
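To illustrate the top50-file / top20-subset distinction: kapture pairsfiles are CSV lines of `query_image, map_image, score`, and assuming pairs are listed best-first within each query (as the retrieval tools write them), taking a top-k subset is just a per-query truncation. A small sketch (`truncate_pairsfile` is an illustrative helper, not an actual kapture function):

```python
import csv
from collections import defaultdict

def truncate_pairsfile(lines, topk):
    """Keep only the first `topk` pairs per query image.

    `lines` are kapture-style pairsfile rows: 'query_image, map_image, score',
    assumed sorted best-first within each query; '#' lines are comments.
    """
    kept, counts = [], defaultdict(int)
    for row in csv.reader(lines, skipinitialspace=True):
        if not row or row[0].startswith('#'):
            continue  # skip blanks and comment lines
        query = row[0]
        if counts[query] < topk:
            kept.append(row)
            counts[query] += 1
    return kept
```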

Akola-Mbey-Denis commented 1 month ago

Thank you for the clarification. You skipped this question.

├─ GreatCourt 
│   ├─ pairsfile/query 
│   │     └─ AP-GeM-LM18_top50.txt  # https://github.com/naver/deep-image-retrieval/blob/master/dirtorch/extract_kapture.py followed by https://github.com/naver/kapture-localization/blob/main/tools/kapture_compute_image_pairs.py
│   ├─ seq1
│   ...
...

is the pairsfile/query created from the colmap run on the query subset or the mapping subset also?

yocabon commented 1 month ago

pairsfile/query is obtained by doing retrieval between the query and the mapping subsets. In the example above, that would be:

https://github.com/naver/kapture-localization/blob/main/pipeline/examples/run_cambridge.sh#L171
https://github.com/naver/kapture-localization/blob/main/pipeline/kapture_pipeline_localize.py#L132