Nanne / pytorch-NetVlad

Pytorch implementation of NetVlad including training on Pittsburgh.

how to get the pittsburgh250k dataset? #19

Closed waitingjiang closed 4 years ago

waitingjiang commented 4 years ago

I achieved the performance of

Recall Scores:
  top-1    86.1%
  top-5    93.0%
  top-10   95.0%

by training from conv3 of VGG16 with a learning rate of 0.0001 and applying PCA + whitening followed by L2 normalization (as introduced in the original paper) at inference time.

Training and testing were done on the Pitts-30k dataset.

Originally posted by @yxgeee in https://github.com/Nanne/pytorch-NetVlad/issues/8#issuecomment-541579659
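
For anyone reproducing that post-processing step, here is a rough sketch of PCA + whitening followed by L2 normalization using scikit-learn; the 4096-dimensional output and the file names are assumptions, not something taken from this repo:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import normalize

# Raw NetVLAD descriptors, assumed already extracted and saved to disk
# (file names are placeholders).
db_descs = np.load('db_descriptors.npy')     # shape: (num_db_images, D)
q_descs = np.load('query_descriptors.npy')   # shape: (num_queries, D)

# Fit PCA with whitening on the database descriptors and project both sets.
# 4096 output dimensions follows the original NetVLAD paper; adjust as needed.
pca = PCA(n_components=4096, whiten=True)
db_reduced = pca.fit_transform(db_descs)
q_reduced = pca.transform(q_descs)

# L2-normalize the whitened descriptors before nearest-neighbour retrieval.
db_reduced = normalize(db_reduced, norm='l2', axis=1)
q_reduced = normalize(q_reduced, norm='l2', axis=1)
```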

waitingjiang commented 4 years ago

Below is the raw email with data links sent by Relja Arandjelovic (relja@relja.info).

Best Regards

Jiang Nicong


Hi,

Thank you for your interest in our work. The links to the datasets are listed below.

If you are using the NetVLAD code that we provide, then you just need to:

NetVLAD: CNN architecture for weakly supervised place recognition
http://www.di.ens.fr/willow/research/netvlad/

Our trained NetVLAD descriptor correctly recognizes the location (b) of the query photograph (a) despite the large amount of clutter (people, cars), changes in viewpoint and completely different illumination (night vs daytime).

Best Regards,

Relja

The datasets can be downloaded here:

― 24/7 Tokyo ―

  1. Please find the 24/7 Tokyo GSV perspective images at:

https://www.dropbox.com/sh/0l04qchbc73kigr/AAAXPuB1J7aD77VJorB-OhTYa?dl=0

  2. Queries:

http://www.ok.ctrl.titech.ac.jp/~torii/project/247/

Version 2 is what was used in that paper, as well as my NetVLAD paper, so if you want to compare results, you should use this one. Version 3 contains some additional images and/or ground truth changes - these are from the PAMI version of our Tokyo 24/7 paper.

Please note that the query images should be downscaled to 640x480 for fair comparison with our work.
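
(A minimal Pillow sketch of that downscaling step; the directory names and the .jpg extension are assumptions:)

```python
from pathlib import Path
from PIL import Image

# Downscale the Tokyo 24/7 query images to 640x480, as recommended above.
src_dir = Path('tokyo247_queries')           # placeholder input directory
dst_dir = Path('tokyo247_queries_640x480')   # placeholder output directory
dst_dir.mkdir(exist_ok=True)

for img_path in sorted(src_dir.glob('*.jpg')):
    with Image.open(img_path) as img:
        img.resize((640, 480), Image.BILINEAR).save(dst_dir / img_path.name)
```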

― Tokyo Time Machine ―

http://www.di.ens.fr/willow/research/netvlad/data/tokyoTM/tokyoTimeMachine_v100.tar.XX

where the final XX denotes the relevant part, from 00 to 13. Each part is about 954 MB (the last one is 552 MB).

md5sum:

d2baaf42831eec223634f0bc7b5a1f50 tokyoTimeMachine_v100.tar.00

ea67e41022639c0adf7ddc10bd19d686 tokyoTimeMachine_v100.tar.01

36135c11940d92c1cd1f229352e89a91 tokyoTimeMachine_v100.tar.02

7746d848a6959851314af761ddba307d tokyoTimeMachine_v100.tar.03

19c3a3b52781d499e19a17b4883a4152 tokyoTimeMachine_v100.tar.04

e61c01ff4961249cf27f08d6e803126a tokyoTimeMachine_v100.tar.05

dab5dba2a9b6683822770e0b318c5f7a tokyoTimeMachine_v100.tar.06

208857c3bdfc2c172125e2dfa827c32c tokyoTimeMachine_v100.tar.07

2a0217c77d392ee98919458d1e2e668b tokyoTimeMachine_v100.tar.08

99438c14250224a227e0306287a23002 tokyoTimeMachine_v100.tar.09

c4697d5f43693548a33f90a3f7c85d45 tokyoTimeMachine_v100.tar.10

1c3db6757b68747f505d5ac1ea5ff5ce tokyoTimeMachine_v100.tar.11

6228047e29b9506d9b8c25daf1024967 tokyoTimeMachine_v100.tar.12

eaddeb529fc3853ccaadb379f1f82f19 tokyoTimeMachine_v100.tar.13

To uncompress, concatenate the parts and pipe them into tar:

cat tokyoTimeMachine_v100.tar.* | tar -xf -
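
A quick way to check the md5sums listed above before extracting is a small Python script; this assumes the parts sit in the current directory and only shows the first two checksums, so fill in the rest from the list:

```python
import hashlib

# Expected md5sums copied from the list above (only the first two shown here;
# add the remaining parts up to tokyoTimeMachine_v100.tar.13).
expected = {
    'tokyoTimeMachine_v100.tar.00': 'd2baaf42831eec223634f0bc7b5a1f50',
    'tokyoTimeMachine_v100.tar.01': 'ea67e41022639c0adf7ddc10bd19d686',
}

for name, md5 in expected.items():
    h = hashlib.md5()
    with open(name, 'rb') as f:
        for chunk in iter(lambda: f.read(1 << 20), b''):
            h.update(chunk)
    print(name, 'OK' if h.hexdigest() == md5 else 'MISMATCH')
```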

― Pittsburgh 250k ―

  1. Please find the Pittsburgh 250k at:

https://www.dropbox.com/sh/taaoxupfd8ccgg6/AABhnnwmnYEhOD54LKF8XPFna?dl=0

md5sums:

3edec2fac461f4a740a21d509ad6e30e 000.tar

c97e5fffd53a2387f1f57b3ec3cae5cb 001.tar

afab887a76f8c4b5c017bd51ea105f5c 002.tar

f217320114299249ea6ebbd71eeb76de 003.tar

00ae08d9020b953c4e1125da5054577c 004.tar

be3fd968f96ebf2548adec23e94a8eb8 005.tar

89818a0841ec98a7b017ded678372652 006.tar

16cbd960c37715e73166d1c1361e337b 007.tar

589ce95e50038dccd012e4fc429bd3e3 008.tar

3f9de2504fddf140f17c6c8b4315a09e 009.tar

5175571cf6f0061aced10fb65e0c9691 010.tar

  2. The query images are in queries_real.tar. It contains the 24,000 query images that are actually used.

md5sum:

5611c4d601c92f6c7dd702fc2716914a queries_real.tar


From: Liu Mingxuan notifications@github.com Sent: 24 August 2020, 07:36 To: Nanne/pytorch-NetVlad pytorch-NetVlad@noreply.github.com Cc: waitingjiang nicongjiang@outlook.com; State change state_change@noreply.github.com Subject: Re: [Nanne/pytorch-NetVlad] how to get the pittsburgh250k dataset? (#19)

how did you get the 30k dataset?


SharangKaul123 commented 3 years ago

@waitingjiang Hey! Hope you are fine. When I go to the Dropbox link you posted for downloading the Pittsburgh 250k dataset, it throws an error on download. I think the link is quite old by now; do you have an alternative source from which I can download the image datasets?

XiaozhuLove commented 2 years ago

@waitingjiang Hey! Hope you are fine. When I go to the Dropbox link you posted for downloading the Pittsburgh 250k dataset, it throws an error on download. I think the link is quite old by now; do you have an alternative source from which I can download the image datasets?

@SharangKaul123 Hello! Hope you are fine. I have the same problem. Could you tell me how you finally solved it (downloading the image datasets)? Thank you very much!

Rechal0703 commented 2 years ago

@waitingjiang Hey! Hope you are fine. When I go to the Dropbox link you posted for downloading the Pittsburgh 250k dataset, it throws an error on download. I think the link is quite old by now; do you have an alternative source from which I can download the image datasets? @SharangKaul123 Hello! Hope you are fine. I have the same problem. Could you tell me how you finally solved it (downloading the image datasets)? Thank you very much!

Hey! I had the same question as you, so I emailed Relja and here is his reply. It covers Pittsburgh 250k, Tokyo 24/7 and Tokyo Time Machine, but it doesn't seem to include a separate prepackaged archive for Pittsburgh 30k. If you know where to find it, can you let me know the link?


< removed on request >

SharangKaul123 commented 2 years ago

@Rechal0703 Hey! As mentioned by Relja, there is no separate prepackaged archive available for Pittsburgh 30k, but you can download the dataset configurations in the form of .mat files, convert them to .pickle files (if you are using Python), read them, and then see which images are used for the Pittsburgh 30k dataset. As far as I remember, for the NetVLAD model they used 10,000 images each for the training, validation and test sets. Link: https://www.di.ens.fr/willow/research/netvlad/. [All dataset specifications](https://www.di.ens.fr/willow/research/netvlad/data/netvlad_v100_datasets.tar.gz) (2 MB): MATLAB structures that define the datasets, e.g. train/validation/test splits, GPS coordinates of all points, time stamps for Tokyo Time Machine, Pittsburgh 250k and Pittsburgh 30k.
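
As a rough illustration (not taken from this repo), here is a sketch of reading one of those specification .mat files with SciPy; the file name and the dbStruct field order are assumptions, so cross-check them against pittsburgh.py in this repository:

```python
from scipy.io import loadmat

# Load one of the Pittsburgh split definitions (file name is an assumption).
mat = loadmat('datasets/pitts30k_train.mat')
db_struct = mat['dbStruct'].item()

# Field order below is assumed from the MATLAB NetVLAD release; verify it
# against pittsburgh.py before relying on it.
which_set = db_struct[0].item()                  # 'train' / 'val' / 'test'
db_images = [f[0].item() for f in db_struct[1]]  # database image file names
utm_db = db_struct[2].T                          # UTM coordinates of database images
q_images = [f[0].item() for f in db_struct[3]]   # query image file names
utm_q = db_struct[4].T                           # UTM coordinates of queries

print(len(db_images), 'database images,', len(q_images), 'queries')
```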

Nanne commented 2 years ago

Code in this repo was designed to directly load the .mat file with python: https://github.com/Nanne/pytorch-NetVlad/blob/master/pittsburgh.py#L73

Rechal0703 commented 2 years ago

Code in this repo was designed to directly load the .mat file with python: https://github.com/Nanne/pytorch-NetVlad/blob/master/pittsburgh.py#L73

Wow I get it, thank you very much!

Rechal0703 commented 2 years ago

@Rechal0703 Hey! As mentioned by Relja, there is no separate prepackaged archive available for Pittsburgh 30k, but you can download the dataset configurations in the form of .mat files, convert them to .pickle files (if you are using Python), read them, and then see which images are used for the Pittsburgh 30k dataset. As far as I remember, for the NetVLAD model they used 10,000 images each for the training, validation and test sets. Link: https://www.di.ens.fr/willow/research/netvlad/. [All dataset specifications](https://www.di.ens.fr/willow/research/netvlad/data/netvlad_v100_datasets.tar.gz) (2 MB): MATLAB structures that define the datasets, e.g. train/validation/test splits, GPS coordinates of all points, time stamps for Tokyo Time Machine, Pittsburgh 250k and Pittsburgh 30k.

Hey! Thank you for your patience, I think I probably know what to do. : )

Relja commented 3 months ago

Hi @Rechal0703, can you please delete the dataset links from your comment? We are intentionally not posting them publicly but sending them on explicit request.

Nanne commented 3 months ago

Hi @Rechal0703, can you please delete the dataset links from your comment? We are intentionally not posting them publicly but sending them on explicit request.

I hadn't caught this - I will remove any mentions of the dataset link as I come across them!

Relja commented 3 months ago

Thanks @Nanne !