Closed Steven-jiaqi closed 1 year ago
Hello, thank you for your interest.
get_all_datasets_path is an old function that I had set up with paths on my machine. You're right, it doesn't work in the repo; you have to set the path for your own machine. I will shortly push a commit adding comments that explain this.
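Until that commit lands, one possible workaround is to read the dataset root from an environment variable instead of a hard-coded path. This is only a sketch: the function name get_datasets_path and the variable name DATASETS_FOLDER are hypothetical and are not part of the repo.

```python
import os
from pathlib import Path


def get_datasets_path(env_var: str = "DATASETS_FOLDER") -> Path:
    """Return the dataset root read from an environment variable.

    Both the function name and the default variable name are
    hypothetical; adapt them to however your copy of the code
    expects the path to be provided.
    """
    value = os.environ.get(env_var)
    if not value:
        raise RuntimeError(f"Set {env_var} to the folder that contains your datasets")
    path = Path(value).expanduser()
    if not path.is_dir():
        raise FileNotFoundError(f"{env_var} points to {path}, which is not a directory")
    return path
```

Failing early with a clear message tends to be easier to debug than a path that silently resolves to nothing.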
The directory tree, which can be obtained from the original version using the msls/1_reformat_mapillary.py script, is the following:
msls_reformat
├── test
│   ├── database
│   └── queries
├── train
│   ├── database
│   └── queries
└── val
    ├── database
    └── queries
Each database and queries folder then contains one folder per sequence_id, rather than one folder per city as in the original MSLS.
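If you want to check that your reformatted folder matches this layout, something like the following could work. It is a minimal sketch, not code from the repo: count_sequences is a hypothetical helper, and it assumes every sub-folder of database and queries is a sequence folder.

```python
from pathlib import Path

SPLITS = ("train", "val", "test")
SUBFOLDERS = ("database", "queries")


def count_sequences(root):
    """Map 'split/subfolder' to the number of sequence folders found there.

    Raises FileNotFoundError if one of the six expected folders is missing.
    """
    root = Path(root)
    counts = {}
    for split in SPLITS:
        for sub in SUBFOLDERS:
            folder = root / split / sub
            if not folder.is_dir():
                raise FileNotFoundError(f"Missing expected folder: {folder}")
            # Assumption: each direct sub-folder corresponds to one sequence_id.
            counts[f"{split}/{sub}"] = sum(1 for p in folder.iterdir() if p.is_dir())
    return counts
```

Calling count_sequences on your msls_reformat folder should return six entries, one per split and sub-folder.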
We had to reformat the dataset in this way because the original code from the MSLS authors contained some bugs regarding sequence creation, so we had to reformat the code and this tree structure made it easier.
However, the pre-processing to build the sequences is quite inefficient (it takes 2-3 hours). That is why there is a third script, msls/3_cache_dataset.py, which creates a cache object that is computed once and then loaded efficiently during training with the arg --cached_train_dataset cache_path.pth.
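The general idea of computing once and loading afterwards can be sketched as follows. This is not the code of 3_cache_dataset.py: load_or_build is a hypothetical helper, and it uses pickle from the standard library as a stand-in, whereas the .pth extension suggests the actual cache is written with PyTorch.

```python
import pickle
from pathlib import Path


def load_or_build(cache_path, build_fn):
    """Load a cached object if it exists, otherwise build it and cache it.

    build_fn is any zero-argument callable that performs the slow
    pre-processing (for example, building the sequences).
    """
    cache_path = Path(cache_path)
    if cache_path.is_file():
        # Fast path: the expensive step already ran on a previous call.
        with cache_path.open("rb") as f:
            return pickle.load(f)
    obj = build_fn()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with cache_path.open("wb") as f:
        pickle.dump(obj, f)
    return obj
```

Keep in mind that a cache like this goes stale if the dataset or the sequence parameters change, so you would need to delete the file and rebuild it in that case.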
I am also changing that part so that caching happens automatically in the dataset, without having to do it manually, so check out my next commits. I will also update the README to clarify everything.
Thank you for your timely reply! So is the path the original MSLS dataset, or some other path? I would like to reproduce this great work of yours as soon as possible, so I am very sorry to keep bothering you. Please tell me the exact directory. Best wishes to you!
Hello, I made a first commit in which I added command line args and comments to explain the parameters needed.
After downloading the original MSLS dataset, you have to run python main_scripts/msls/1_reformat_mapillary.py original_msls_folder destination_folder. You can use the -h option to read the description of the args. This will reformat the tree structure. By default it will create a copy of the dataset; it can be sped up if you pass --delete_old, which will move rather than copy.
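To illustrate the difference between the two modes, here is a minimal sketch of copying versus moving a single file. It is not taken from 1_reformat_mapillary.py; the transfer helper and its delete_old parameter are only an illustration of the behaviour described above.

```python
import shutil
from pathlib import Path


def transfer(src, dst, delete_old=False):
    """Copy src to dst, or move it when delete_old is True.

    Moving within the same filesystem is usually a cheap rename,
    which is why it can be much faster than copying the image data.
    """
    src, dst = Path(src), Path(dst)
    dst.parent.mkdir(parents=True, exist_ok=True)
    if delete_old:
        shutil.move(str(src), str(dst))  # the source file no longer exists afterwards
    else:
        shutil.copy2(src, dst)  # the source file is kept, so disk usage doubles
    return dst
```

If you move instead of copying, the original MSLS folder is no longer intact afterwards, so you would have to download it again to get it back.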
After that, you can run python main_scripts/msls/2_reformat_testset_msls.py msls_reformat_folder. This script will simply create the val and test splits that we propose in our paper.
As for the third script in the folder, I will remove it soon and replace it with automatic caching in the dataset.
Tell me if you have any more problems.
I have also made the commit that introduces automatic dataset caching. You can find more details in the README.
Happy to help if you have other issues.
Thanks for your prompt reply. I've had some problems with my server recently and it has only just started working again. When I ran the script, I found that my copy of the MSLS dataset is missing the folder msls/train_val/amsterdam/database/images. I downloaded the dataset again to make sure nothing was skipped. Do you have this folder in your dataset? I am sorry to bother you again.
Yes, I have it. All cities have database/images and query/images. However, I don't think I can share it because of the MSLS license, so you should try to download it from the official website, https://www.mapillary.com/dataset/places, after registration.
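To find out quickly which cities are incomplete in your download, a check along these lines may help. It is a sketch under the assumption that every direct sub-folder of train_val is a city; find_incomplete_cities is a hypothetical helper, not part of the repo or of the MSLS tools.

```python
from pathlib import Path


def find_incomplete_cities(train_val_root):
    """Return {city: [missing folders]} for cities lacking image folders.

    Each city is expected to contain database/images and query/images.
    """
    missing = {}
    cities = sorted(p for p in Path(train_val_root).iterdir() if p.is_dir())
    for city in cities:
        absent = [
            f"{side}/images"
            for side in ("database", "query")
            if not (city / side / "images").is_dir()
        ]
        if absent:
            missing[city.name] = absent
    return missing
```

An empty result means every city folder has both image folders; it does not verify that the images themselves are all present.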
Thank you for providing the URL! I re-downloaded it and then it was complete.
I want to know the python main_scripts/main_train.py \
--dataset_path
Hello, happy that you managed to get the dataset. Throughout the paper, all experiments are run on the reformatted dataset, so whenever we refer to the dataset path it always means the reformatted version. NB: there is a typo in the command that you listed; the LR should be 1e-5.
Thank you for the tip!
With your help, I have successfully run the cct+seqvlad experiment. But when I ran the Timesformer experiment, I got an error: AttributeError: 'Namespace' object has no attribute 'features_dim'. It happens in "vg-transformers/tgv/models/tgv_net.py", line 43: self.meta = {'outputdim': args.features_dim}. I cannot find where args.features_dim is defined anywhere in the project. Should I assign a value to this parameter in parser.py?
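For context, this kind of error appears whenever code reads an attribute that the argument parser never defined. The sketch below reproduces the mechanism with argparse alone; it is not the repo's parser.py, the --features_dim flag shown here is hypothetical, and the thread does not say whether the project expects this value to come from the command line or to be derived from the backbone.

```python
import argparse


def build_parser(with_features_dim):
    """Build a toy parser, optionally declaring a features_dim argument."""
    parser = argparse.ArgumentParser()
    if with_features_dim:
        # Hypothetical flag: declaring it makes args.features_dim exist.
        parser.add_argument("--features_dim", type=int, default=None,
                            help="dimension of the output descriptor")
    return parser


def output_meta(args):
    """Mimic the failing line: read features_dim from the parsed args."""
    return {"outputdim": args.features_dim}


if __name__ == "__main__":
    args = build_parser(with_features_dim=False).parse_args([])
    try:
        output_meta(args)
    except AttributeError as err:
        print(err)  # the Namespace has no such attribute
```

Declaring the argument removes the AttributeError, but the correct value depends on the model, so it should be confirmed against the code rather than guessed.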
In main_scripts/reformat_testset.py:
from tvg.utils import get_all_datasets_path
DS_PATH = get_all_datasets_path()
The import fails because I cannot find where get_all_datasets_path is defined, so I cannot get DS_PATH. Do you know how to solve this problem? I would also like to know the MSLS dataset directory tree that you use in this project. I'm sorry to bother you. Best wishes to you!