gmberton / CosPlace

Official code for CVPR 2022 paper "Rethinking Visual Geo-localization for Large-Scale Applications"

About the training dataset #1

Closed · Barry-Liang closed this issue 2 years ago

Barry-Liang commented 2 years ago

Hi, authors. Thanks for sharing the dataset. I am wondering whether the current version on SharePoint is the whole dataset. I have downloaded all the images, but there are only 5.6M training images, not the 41.2M stated in your paper. In addition, the group IDs do not cover all the options from (0, 0, 0) to (N, N, L). Do you save the train folders according to group ID? Otherwise, if the images on SharePoint are randomly picked, I would expect the group IDs to cover all N·N·L options. I am looking forward to your early reply.

gmberton commented 2 years ago

Hi @Barry-Liang,

many thanks for your interest in our work.

Did you have a look at the README.md in the dataset? It explains that if you want all 41.2M images, you should download the "raw" folder, which contains all 3.43M 360° panoramas (resulting in 41.2M 512x512 crops, given that each panorama is split into 12 crops, as explained in the paper). If you think a script to crop the panoramas would be helpful, I can upload it to the drive.
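
For reference, the cropping boils down to something like the sketch below. This is only an illustration, not the actual script: it takes a simple sliding window over an equirectangular panorama resized to height 512, and the input/output paths and the `.jpg` extension are placeholders.

```python
from pathlib import Path
from PIL import Image

def crop_panorama(pano_path: Path, out_dir: Path, num_crops: int = 12, crop_size: int = 512) -> None:
    """Split one 360° panorama into `num_crops` square crops of side `crop_size`."""
    pano = Image.open(pano_path).convert("RGB")
    width, height = pano.size
    if height != crop_size:                      # bring the height to crop_size, keeping the aspect ratio
        width = round(width * crop_size / height)
        pano = pano.resize((width, crop_size))
    # Tile the panorama twice horizontally so crops can wrap around the 0°/360° seam.
    tiled = Image.new("RGB", (width * 2, crop_size))
    tiled.paste(pano, (0, 0))
    tiled.paste(pano, (width, 0))
    out_dir.mkdir(parents=True, exist_ok=True)
    for i in range(num_crops):
        center = round(i * width / num_crops)    # one crop centre every 360/num_crops degrees
        left = (center - crop_size // 2) % width
        crop = tiled.crop((left, 0, left + crop_size, crop_size))
        crop.save(out_dir / f"{pano_path.stem}_{i:02d}.jpg")

if __name__ == "__main__":
    # Placeholder paths: adjust to wherever you keep the panoramas and want the crops.
    for pano_path in sorted(Path("raw/train/panoramas").glob("*.jpg")):
        crop_panorama(pano_path, Path("crops"))
```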

Judging from what you write, you probably downloaded just the smaller subset from the "processed" folder, which contains only the 5.6M images effectively used for training (and those are enough to reproduce our results). We created this smaller subset so that it wouldn't be necessary to download the whole 900GB dataset to reproduce our results. If you're confused as to why 5.6M images are enough, please check the "Implementation details" sections in the main paper and in the supplementary again (in short, it's because we don't use all groups for training).

Again, it's all explained in the README.md of the dataset.
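
For context on the group IDs, the mapping from an image to its (u, v, h) group looks roughly like the sketch below, following the grouping scheme described in the paper. The arithmetic and the default values of M, alpha, N, L shown here are my reading of the implementation details, so double-check them against the paper and the code.

```python
def get_group_id(utm_east: float, utm_north: float, heading: float,
                 M: float = 10, alpha: float = 30, N: int = 5, L: int = 2):
    """Assign an image to a group (u, v, h) in {0..N-1} x {0..N-1} x {0..L-1}.

    The map is divided into M x M metre cells and the heading into bins of
    alpha degrees; each index is then taken modulo N (resp. L), so images of
    the same class within a group end up geographically far apart.
    """
    east_cell = int(utm_east // M)       # index of the cell along the east axis
    north_cell = int(utm_north // M)     # index of the cell along the north axis
    heading_bin = int(heading // alpha)  # index of the heading bin
    return (east_cell % N, north_cell % N, heading_bin % L)
```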

Barry-Liang commented 2 years ago

Thanks for your quick response, and I am sorry that I missed the README. I would appreciate it if you could share the script to crop the panoramas and the pre-trained models for different backbones.

gmberton commented 2 years ago

Hello, I just uploaded the script to crop the panoramas and the README (in the drive with the dataset). I also changed the path of the panoramas from raw/database to raw/train/panoramas, so before using the script make sure to move the files into the right folder. The pre-trained models are coming soon; I will probably upload them early next week.
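
If it helps, moving the panoramas to the new path is just something like the snippet below (assuming JPEG files sitting directly inside raw/database):

```python
import shutil
from pathlib import Path

src = Path("raw/database")           # old location of the panoramas
dst = Path("raw/train/panoramas")    # location expected by the cropping script
dst.mkdir(parents=True, exist_ok=True)
for pano in src.glob("*.jpg"):       # assuming JPEG panoramas
    shutil.move(str(pano), str(dst / pano.name))
```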

gmberton commented 2 years ago

Hi @Barry-Liang, all our trained models are now online!