Closed Devoe-97 closed 12 months ago
Hello, note that the SF-XL README says:
- small: this is a small curated subset of processed which allows you to quickly get started and is only 4.8 GB heavy. Obviously, results won't be as good as when using the processed version, but should be good enough. There is a train, val and test set. The train set is only from 1 group, obtained with L=12. To train on this dataset, you should do
$ python train.py --groups_num 1
So training with other configurations, like changing the values of `M`, `alpha`, `N`, `L`, and `groups_num`, will produce unpredictable results.
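To make the README's mention of groups, `M`, `alpha`, `N`, and `L` concrete, here is a small illustrative sketch. This is not the repository's code; the residue-based grouping below is only my reading of the CosPlace paper's scheme (position discretized into `M`-meter cells, heading into `alpha`-degree bins, and classes assigned to groups by their indices modulo `N` and `L`):

```python
# Illustrative sketch (NOT the repository's implementation) of a
# CosPlace-style class/group assignment from a photo's UTM position
# and compass heading.
#   M     = class cell side length in meters
#   alpha = orientation bin width in degrees
#   N, L  = translation / orientation shifts separating the groups
def class_and_group(east: float, north: float, heading: float,
                    M: int = 10, alpha: int = 30, N: int = 5, L: int = 2):
    # Discretize position and heading into integer class indices.
    ce, cn, ch = int(east // M), int(north // M), int(heading // alpha)
    class_id = (ce, cn, ch)
    # Classes whose indices share the same residues land in the same group,
    # so nearby (potentially overlapping) classes end up in different groups.
    group_id = (ce % N, cn % N, ch % L)
    return class_id, group_id

print(class_and_group(east=123.0, north=456.0, heading=95.0))
# → ((12, 45, 3), (2, 0, 1))
```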
Thanks, I'll try again.
Would it be correct to use the following parameters when training with the PROCESSED data?
`M=10, alpha=30, N=5, L=2, groups_num=4`
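If the grouping follows the CosPlace scheme of `N`×`N` spatial shifts times `L` orientation shifts (an assumption on my part, not confirmed by this thread), a quick sanity check is that `groups_num` should not exceed the total number of groups:

```python
# Hedged sketch: assumes the classes are partitioned into N*N*L
# non-overlapping groups (N shifts per spatial axis, L orientation
# shifts), with groups_num selecting how many groups training cycles
# through. This formula is an assumption, not the repository's code.
def total_groups(N: int, L: int) -> int:
    return N * N * L

N, L = 5, 2
print(total_groups(N, L))  # → 50
# Both groups_num=4 and the corrected groups_num=8 fit within 50 groups.
assert 8 <= total_groups(N, L)
```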
Correction: groups_num=8
Thank you for your excellent work! I am very impressed with your SF-XL dataset! However, the dataset is very large, and a poor network connection makes it difficult for me to download the full dataset. (This problem may only affect scholars in China.) Therefore, I used the SMALL set for training and validation, and my configuration is as follows:
M: 10 alpha: 30 N: 5 L: 2 groups_num: 4
I encountered some confusion: I trained with two batch sizes (with `iterations_per_epoch` set to 10,000) and found that the downward trend in model performance appeared earlier: at epoch 20 for bs=64 and at epoch 8 for bs=128. Have you explored the effect of different batch sizes on performance? I would greatly appreciate it if you could help me with the above issues.
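One back-of-the-envelope way to reason about why the drop appears at an earlier epoch with bs=128: if `iterations_per_epoch` counts optimizer steps (an assumption, not verified against the training loop), a larger batch consumes more training samples per epoch, so the total number of samples seen before degradation is similar in both runs:

```python
# Sketch: samples consumed per epoch when iterations_per_epoch is fixed.
# Assumes every iteration draws a full batch (an assumption about the
# training loop, not taken from the repository).
iterations_per_epoch = 10_000

for bs, degrade_epoch in [(64, 20), (128, 8)]:
    samples_per_epoch = iterations_per_epoch * bs
    seen_before_drop = samples_per_epoch * degrade_epoch
    print(f"bs={bs}: {samples_per_epoch:,} samples/epoch, "
          f"{seen_before_drop:,} samples seen before the drop")
# bs=64  → 640,000 samples/epoch, 12,800,000 before the drop
# bs=128 → 1,280,000 samples/epoch, 10,240,000 before the drop
```

The two totals are within ~25% of each other, which is consistent with the degradation being driven by total samples seen on the small subset rather than by batch size per se, though only the maintainer's experiments could confirm that.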