QVPR / VPRTempo

Code for VPRTempo, our temporally encoded spiking neural network for visual place recognition.
https://vprtempo.github.io
MIT License

Help request to reproduce the Table 2 results #15

Closed berthaSZ closed 1 week ago

berthaSZ commented 1 month ago

Hi @AdamDHines,

I am trying to reproduce the Table 2 results in your paper. Would you share the code that generated the VPRSNN results? I previously asked Somayeh for help here a month ago, but I have not heard back from her yet.

I would like to ask for your help in reproducing VPRTempo's results as well. I started by working on the Nordland data. I downloaded the data and organized the directories accordingly. This is my input to the terminal.

python main.py --train_new_model --max_module 3300 --database_places 3300

It does not throw any errors during the training. However, it looks like it is terminating the training process prematurely.


When I try to test the trained model by using the following command in the terminal

python main.py --max_module 3300 --database_places 3300

I get the error below

Initializing modules: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  2.48it/s]
Model name: springfall_VPRTempo_IN3136_FN6272_DB3300.pth
Running the test network:  70%|█████████████████████████████████████████████████████████████████████████████████████████████████▍                                          | 348/500 [00:01<00:00, 212.59it/s]
Traceback (most recent call last):
  File "main.py", line 269, in <module>
    parse_network(use_quantize=False,
  File "main.py", line 265, in parse_network
    initialize_and_run_model(args,dims)
  File "main.py", line 200, in initialize_and_run_model
    run_inference(models, model_name)
  File "/home/bertha/Documents/VPRTempo/vprtempo/VPRTempo.py", line 320, in run_inference
    model.evaluate(models, test_loader)
  File "/home/bertha/Documents/VPRTempo/vprtempo/VPRTempo.py", line 144, in evaluate
    for spikes, label in test_loader:
  File "/home/bertha/.pyenv/versions/vprtempo_env/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 631, in __next__
    data = self._next_data()
  File "/home/bertha/.pyenv/versions/vprtempo_env/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1326, in _next_data
    return self._process_data(data)
  File "/home/bertha/.pyenv/versions/vprtempo_env/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1372, in _process_data
    data.reraise()
  File "/home/bertha/.pyenv/versions/vprtempo_env/lib/python3.8/site-packages/torch/_utils.py", line 705, in reraise
    raise exception
IndexError: Caught IndexError in DataLoader worker process 5.

Original Traceback (most recent call last):
  File "/home/bertha/.pyenv/versions/vprtempo_env/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
  File "/home/bertha/.pyenv/versions/vprtempo_env/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = self.dataset.__getitems__(possibly_batched_index)
  File "/home/bertha/.pyenv/versions/vprtempo_env/lib/python3.8/site-packages/torch/utils/data/dataset.py", line 419, in __getitems__
    return [self.dataset[self.indices[idx]] for idx in indices]
  File "/home/bertha/.pyenv/versions/vprtempo_env/lib/python3.8/site-packages/torch/utils/data/dataset.py", line 419, in <listcomp>
    return [self.dataset[self.indices[idx]] for idx in indices]
  File "/home/bertha/Documents/VPRTempo/vprtempo/src/dataset.py", line 195, in __getitem__
    img_path = self.img_labels.iloc[idx]['file_path']
  File "/home/bertha/.pyenv/versions/vprtempo_env/lib/python3.8/site-packages/pandas/core/indexing.py", line 1103, in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
  File "/home/bertha/.pyenv/versions/vprtempo_env/lib/python3.8/site-packages/pandas/core/indexing.py", line 1656, in _getitem_axis
    self._validate_integer(key, axis)
  File "/home/bertha/.pyenv/versions/vprtempo_env/lib/python3.8/site-packages/pandas/core/indexing.py", line 1589, in _validate_integer
    raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds
Running the test network:  71%|███████████████████████████                                     | 357/500 [00:01<00:00, 191.34it/s]

I would appreciate it if you could help me.

Thanks in advance.

Best, Bertha

AdamDHines commented 1 month ago

Hi @berthaSZ,

Thanks for your interest in our work. Before debugging, I'll need a little clarification on some things. First, which version of the network are you using? (v1.1.5 is the latest; note that v1.0.0 is the version used in our paper, so the results will differ if you use the newer version.) Second, which Nordland dataset did you download, as there are two? The paper used the full-resolution images, not the downsampled ones.

Training or inference terminating early usually indicates that the number of database places multiplied by the filter (or skip) value exceeds the number of images available in your directory. So with the default filter value of 8 and 3,300 database places, you need at least 26,400 images in your folder. The full-resolution Nordland set contains more images than the downsampled one, which is why I ask. This would also likely explain your out-of-bounds indexing error.
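A quick way to sanity-check this before training (a hypothetical helper, not part of the repo; the `.png` extension and argument names are assumptions):

```python
import os

def check_dataset_size(img_dir, database_places, filter_skip=8):
    """Hypothetical sanity check: training `database_places` places with a
    filter (skip) value of `filter_skip` needs at least
    database_places * filter_skip images in the directory."""
    required = database_places * filter_skip
    available = len([f for f in os.listdir(img_dir) if f.endswith(".png")])
    if available < required:
        raise ValueError(
            f"Need at least {required} images in {img_dir}, found {available}"
        )
    return available
```

With the paper's settings that check would be `check_dataset_size(folder, 3300, 8)`, requiring 26,400 images.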

If you have downloaded the full-resolution images, I highly recommend unzipping them with the vprtempo/src/nordland.py script, which will automatically unzip and organise the four traverses from sections 1 and 2 together into one folder.

Thanks, Adam

berthaSZ commented 1 month ago

Hello @AdamDHines,

I am using v1.1.5. I downloaded the Nordland data via the link Somayeh shared here. It contains 35,768 full-resolution images for each season.

I followed your Nordland data preparation instructions as well. I downloaded the original .zip files (of the full-resolution images) via the link shared in your repository, then used vprtempo/src/nordland.py to unzip and process the images. However, I encountered the error below. I noticed that each folder (the spring, summer, winter, and fall folders organised by nordland.py) contains 24,570 files.

Initializing modules: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00,  1.69s/it]
Model name: springfall_VPRTempo_IN3136_FN6272_DB3300.pth
Training layer: feature_layer
Module 1:   0%|                                                                                                                                 | 0/26400
Traceback (most recent call last):
  File "main.py", line 269, in <module>
    parse_network(use_quantize=False,
  File "main.py", line 265, in parse_network
    initialize_and_run_model(args,dims)
  File "main.py", line 144, in initialize_and_run_model
    train_new_model(models, model_name)
  File "/home/bertha/Documents/VPRTempo/vprtempo/VPRTempoTrain.py", line 292, in train_new_model
    model.train_model(train_loader, layer, model, i, prev_layers=trained_layers)
  File "/home/bertha/Documents/VPRTempo/vprtempo/VPRTempoTrain.py", line 165, in train_model
    for spikes, labels in train_loader:
  File "/home/bertha/.pyenv/versions/3.8.5/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 631, in __next__
    data = self._next_data()
  File "/home/bertha/.pyenv/versions/3.8.5/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1346, in _next_data
    return self._process_data(data)
  File "/home/bertha/.pyenv/versions/3.8.5/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1372, in _process_data
    data.reraise()
  File "/home/bertha/.pyenv/versions/3.8.5/lib/python3.8/site-packages/torch/_utils.py", line 705, in reraise
    raise exception
FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.

Original Traceback (most recent call last):
  File "/home/bertha/.pyenv/versions/3.8.5/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
  File "/home/bertha/.pyenv/versions/3.8.5/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/bertha/.pyenv/versions/3.8.5/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/bertha/Documents/VPRTempo/vprtempo/src/dataset.py", line 197, in __getitem__
    raise FileNotFoundError(f"No file found for index {idx} at {img_path}.")
FileNotFoundError: No file found for index 2087 at ./vprtempo/dataset/spring/images-27264.png.
Module 1:   0%|                                                                                                                                 | 0/26400 [00:00

Are you planning to release a generalized framework or more detailed instructions for importing custom datasets? That would be very useful for the SNN-VPR community.

I noticed minor typos in the code causing other errors. I will not list them here, but you can find them easily by running the code from a fresh clone of your GitHub repository.

I am also particularly curious about how you obtained the R@1 values for Somayeh's VPRSNN. The results I am getting are nowhere near 53% for Nordland and 40% for the Oxford RobotCar data. I would appreciate it if you could share your implementation of VPRSNN or help me find out what I am doing wrong. Feel free to check the issue I posted in Somayeh's repository here.

Best, Bertha

AdamDHines commented 1 month ago

Thanks for the clarification, @berthaSZ.

I believe I found the source of the issue. The ./vprtempo/dataset/nordland.csv file is missing the first ~7,000 image names in the index. This is because we only compared precision/recall for 2,700 places to VPRSNN, since the first 20% of the 3,300 images in that model are used for spike calibration and are discarded from precision metrics. The same applies to Oxford RobotCar. For now, I will keep it as is to stay in line with what the paper presents, but I have added some clarification about this in the README.
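As a rough illustration (not the repo's actual code; the function name and one-column layout are assumptions), dropping those leading calibration rows from an image-index CSV looks like:

```python
import csv

def skip_calibration_rows(csv_in, csv_out, skip):
    """Illustrative only: drop the first `skip` data rows of an image-index
    CSV, e.g. places reserved for VPRSNN's spike calibration, keeping the
    header row intact."""
    with open(csv_in, newline="") as f:
        rows = list(csv.reader(f))
    header, data = rows[0], rows[1:]
    with open(csv_out, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(data[skip:])
```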

The error posted means the image was not found in the directory; if there are only 24,570 files, then images-27264.png won't be in the folder. I'm not sure what's happening with the Nordland data from the link in the README; I'll have to investigate later as I'm a bit time-poor at the moment. For now, please use the dataset downloaded from the link Somayeh shared.

I have updated the repository to include a script, ./vprtempo/src/create_data_csv.py, which lets users create the .csv file required to train new networks and run inference on new datasets. Please take a look at the updated README for usage. You can use this to generate a new nordland.csv if you wish to train and test on the full 3,300 images.
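The core of such a script can be sketched in a few lines (this is an assumed behaviour, not the actual create_data_csv.py; the 'file_path' column name follows the dataset loader shown in the earlier traceback):

```python
import csv
import os

def build_image_csv(img_dir, csv_out):
    """Sketch (assumed, not the repo's actual create_data_csv.py): list all
    .png images in a directory, sorted by name, into a one-column CSV that
    a dataset loader can index by row."""
    names = sorted(f for f in os.listdir(img_dir) if f.endswith(".png"))
    with open(csv_out, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["file_path"])
        for name in names:
            writer.writerow([name])
    return len(names)
```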

What errors are you referring to with "minor typos"? I have downloaded a fresh copy of the repo and cannot find any errors, except for a missing ./vprtempo/output folder, which I have now corrected so the folder is created if it doesn't exist. I have no issues training or inferencing new models or using the pretrained weights.

We used pretrained weights for our comparison to VPRSNN, as we were more interested in training and inference speed than in precision/recall metrics. I won't be much help in training new models for VPRSNN, I'm afraid.

Thanks, Adam

berthaSZ commented 1 week ago

Hello @AdamDHines,

Thanks to your help, I am able to run your code.

In my case, it was much faster to train VPRTempo on the Nordland data. It took approximately 15 minutes to train, and I achieved 64% R@1 on Nordland at 150-200 Hz inference frequency. However, VPRTempo's place recognition performance was much worse on the Oxford data, where I obtained 14% R@1 at the same inference frequency. Additionally, it took 8 hours to train VPRTempo on the Oxford data because I used a larger set (over 60k training images and over 10k test images covering 300 places) of 2014 trajectory images. For both training and inference, I used an NVIDIA RTX 2080 GPU.

Training VPRSNN on the larger Oxford data was not feasible, as it is not as compute-efficient as your method. I therefore tried training on a smaller subset of the Oxford data (approximately 2,000 training and 1,000 test images) covering 200 places. VPRSNN achieved ~12% R@1, and despite using parallel CPU computing, it took 1.5 days to train. VPRTempo, on the other hand, took 9 minutes (on the RTX 2080 GPU) to train on this smaller Oxford data and achieved ~29% R@1 at 150-200 Hz inference frequency. Overall, if I did not make any implementation mistakes, VPRTempo consistently outperforms VPRSNN.

If you can share the original image names of the Nordland and Oxford data you used, it would be helpful for better reproducibility.

Thank you for taking the time to help me understand your work better!

Best, Bertha

AdamDHines commented 1 week ago

Hi @berthaSZ,

Good to hear you've been able to get it working. Just note: the improved performance you're seeing is likely due to the change in image downsampling and model architecture from what is presented in the paper. In v1.1.5 we use a 56x56 input image with a patch size of 15, double what we used in the paper and what VPRSNN uses. We use a larger model because we can afford it given the low training and fast inference times. For the paper, we wanted an apples-to-apples comparison, so just FYI.

The image names we used for ORC can be found in ./vprtempo/dataset/orc.csv; again, this list has a truncated number of places to account for the 20% used for VPRSNN's calibration. I would say your loss of accuracy comes from training on the full 60k training images. Without making the learning rate significantly smaller, you'll be overfitting the model a lot. VPRTempo works best with a single training example repeated for 4-5 epochs rather than a variety of examples. It's more important than anything that the reference and query are as close in viewpoint as possible, so please try ORC again with just one example dataset of the images you wish to train on. Keep the filter at 8 and use the .csv file in the repo, and you should be good.

Thanks, Adam

bobhagen commented 1 week ago

Hello @AdamDHines ,

I can confirm the numbers Bertha provided. I also obtained significantly lower performance results for VPRTempo and particularly for VPRSNN on the Oxford data.

I think @berthaSZ is referring to the Oxford RobotCar's image names in the original dataset naming convention, which follows the format "2015-10-29-12-18-17/stereo/right/UNIXTIMESTAMP.png". I also asked Somayeh about this here, but she ignored my requests for both the Oxford processing script and the original image names. The image names in ./vprtempo/dataset/orc.csv are not useful, as you haven’t shared your version of the aligned Oxford data. This is understandable due to redistribution restrictions. That is why I previously requested either the processing script or the image names in the original naming convention. I believe @berthaSZ is also asking for the original names for the same reason.

Matching the input image resolution does not necessarily ensure an apples-to-apples comparison. In my opinion, it is just another important model parameter. If a model performs worse with decreasing input resolution, it is simply a characteristic of the model.

I disagree with your comment about overfitting. The full training images are of unique instances and should actually help prevent overfitting. You mentioned that "It's more important that the reference and query are as close in viewpoint as possible." Wouldn't this lead to overfitting to such cherry-picked images? Wouldn't one want variety in the training and test data to train models that are more robust to real-world challenges? Of course, such closeness between the reference and query will yield significantly better results.

Best, Robert

AdamDHines commented 1 week ago

@berthaSZ @bobhagen Timestamped image names are now available in the ./vprtempo/dataset/ folder, which contains the corresponding timestamps for each rain, dusk, and sun traversal. Please see the updated README.md for Oxford RobotCar to process the raw images for training and testing. VPRTempo has a new release, v1.1.7; please download the latest code.

Following the protocol from the README, I have been able to replicate the results in the paper (see below):

(screenshot of the replicated results)

You can more easily train and test as in the paper with the new --skip argument, which ignores the first n images in the .csv file. In this case, we train on 450 places and query 360. To train and test your own, use this:

# Train the ORC model
python main.py --train_new_model --database_places 450 --database_dirs sun,rain --skip 0 --max_module 450 --dataset orc --dims 28,28 --patches 7 --filter 7

# Test the ORC model
python main.py --database_places 450 --database_dirs sun,rain --skip 630 --dataset orc --dims 28,28 --patches 7 --filter 7 --query_dir dusk --query_places 360 --sim_mat --max_module 450