Help about parameter setting

Jeff-Zilence / TransGeo2022

Official repository for TransGeo: Transformer Is All You Need for Cross-view Image Geo-localization

MIT License

Help about parameter setting #27

Closed wlfxy closed 10 months ago

wlfxy commented 11 months ago

Hello, I would like to ask about training on CVUSA with a single GPU. My settings are: gpu set to 1, lr set to 0.0001, batch-size set to 32, dist-url set to 'tcp://localhost:10001', world-size set to 1, rank set to 0, epochs set to 100, op set to sam, wd set to 0.03, dataset set to cvusa, cos set to True, dim set to 1000, asam set to True, and rho set to 2.5. However, the result of the first stage is very bad. Did I make a mistake somewhere? I took a screenshot of the specific parameter settings. Thank you. ![Uploading 屏幕截图 2023-11-23 231824.png…]()
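For easier comparison with the provided script, the settings listed above can be written out as a single command line. The following is only a sketch: the flag spellings are assumptions taken from the names in this comment ("did-URL" is read as "dist-url"), `True` values are assumed to be bare on/off flags, and `build_command` is a hypothetical helper, not part of the repository. The "gpu set to 1" entry is left out because it is unclear whether it means a device count or a device index.

```python
import shlex

# Settings as listed in the comment above. Flag spellings are assumptions
# based on that list, not verified against train.py.
settings = {
    "lr": 0.0001,
    "batch-size": 32,
    "dist-url": "tcp://localhost:10001",
    "world-size": 1,
    "rank": 0,
    "epochs": 100,
    "op": "sam",
    "wd": 0.03,
    "dataset": "cvusa",
    "cos": True,
    "dim": 1000,
    "asam": True,
    "rho": 2.5,
}


def build_command(script, settings):
    """Render a settings dict as a shell command string (hypothetical helper)."""
    argv = ["python", script]
    for name, value in settings.items():
        if value is True:
            argv.append(f"--{name}")  # assumed to be a bare on/off flag
        elif value is False:
            continue  # a disabled flag is simply omitted
        else:
            argv.extend([f"--{name}", str(value)])
    return shlex.join(argv)


command = build_command("train.py", settings)
print(command)
```

Diffing the printed command against the one in the provided shell script is a quick way to spot a missing or misspelled option.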

Jeff-Zilence commented 11 months ago

The image was not uploaded correctly, so I cannot see the parameters. Why not run the scripts directly, following the instructions?

wlfxy commented 11 months ago

Since I only have one GPU, and running sh run_CVUSA.sh directly seems to require multiple GPUs, I set the parameters myself. When I do run sh run_CVUSA.sh directly, I get an error. The error is work = _default_pg.barrier(). RuntimeError: NCCL error in: /opt/conda/conda-bld/pytorch_1607370172916/work/torch/lib/c10d/ProcessGroupNCCL.cpp:784, unhandled system error, NCCL version 2.7.8

wlfxy commented 11 months ago

The README says: "You may need to specify the GPUs for training in "train.py". Remove the second line if you want to train the simple stage-1 model. Change the "--dataset" to train on other datasets. The code follows the multiprocessing distributed training style from PyTorch and Moco, but it only uses one GPU by default for training." This paragraph seems to describe single-GPU settings, but the command line in the script looks like it runs in a distributed manner with multiple GPUs.

Jeff-Zilence commented 10 months ago

It does not require multiple GPUs. In train.py, we use os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" and os.environ["CUDA_VISIBLE_DEVICES"] = "0" to assign one GPU.
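As a minimal sketch of the mechanism described above, the two environment variables can be set in one place before CUDA is initialized. The helper name `select_gpu` is hypothetical and used only for illustration; the two variable names and values are the ones quoted in this comment.

```python
import os


def select_gpu(gpu_id="0"):
    """Restrict CUDA to a single physical GPU via environment variables."""
    # Enumerate devices in PCI bus order so the index matches nvidia-smi.
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    # Only the listed device is visible; inside the process it appears as device 0.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)


# These variables only take effect if set before CUDA is first initialized,
# so call this at the top of the script.
select_gpu("0")
print(os.environ["CUDA_VISIBLE_DEVICES"])
```

On a machine with a single GPU the only valid index is "0"; any other value leaves no visible device.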

mksasx commented 10 months ago

Hello? I want to ask some questions!