Open hosea7456 opened 2 years ago
Hello,
Thanks for your interest in our work! I tried to reproduce the problem you posted but could not. I suspect the error is caused by a newer version of PyTorch, so using pytorch==1.7.0 may help.
Hope it helps.
Hi, thanks for your advice. I have tried pytorch==1.7.0; the previous error disappeared, but now another error appears:
Traceback (most recent call last):
File "so_run.py", line 51, in
I have no idea at all
Hello, as far as I can tell, it may be because the number of training steps exceeds the maximum number of steps expected by the optimizer's learning-rate schedule. You can check that.
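For what it's worth, the repo's lr_poly is not shown in this thread, but assuming the usual DeepLab-style poly schedule lr = base_lr * (1 - i_iter / max_iter) ** power, a quick check with the values from the config in this issue shows what goes wrong once the step count passes max_iter: Python turns a negative base raised to a fractional power into a complex number, which SGD then rejects as alpha.

    # values taken from the config posted in this issue
    base_lr, power, max_iter = 7.5e-5, 0.9, 5000
    print(base_lr * (1 - 4000 / max_iter) ** power)  # step 4000: small real lr, fine
    print(base_lr * (1 - 6000 / max_iter) ** power)  # step 6000 > max_iter: complex number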
Same error here, and I've tried increasing num_steps in so_config.yaml, but it didn't work. Could you share the parameters you used to train the source-only model? Thank you!
Hi, I just solved that a few days ago. The error is caused by the fixed maximum number of steps used when adjusting the learning rate. You can check whether that is the case for you. Cheers, zx
I also encountered this problem recently; could you elaborate on how you solved it? Thank you very much.
Hi there, sorry for the late reply. The issue comes from an incorrect maximum step count used when adjusting the learning rate during optimization. Here is my version:
def adjust_learning_rate(optimizer, i_iter, len_loader, args):
    # use epochs * len_loader as the max step of the poly schedule, so the decay
    # never runs past its final step and the learning rate stays a real number
    lr = lr_poly(args.learning_rate, i_iter, args.epochs * len_loader, args.power)
    optimizer.param_groups[0]['lr'] = lr
    if len(optimizer.param_groups) > 1:
        # second param group (e.g. the classifier head) trains at 10x the base lr
        optimizer.param_groups[1]['lr'] = lr * 10
    return lr
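And in case it helps, a small standalone check showing that, together with the adjust_learning_rate above, the learning rate stays a real float for the whole run. The lr_poly below is an assumed DeepLab-style definition, and the model and loader length are dummies, not the repo's trainer:

    import torch

    def lr_poly(base_lr, i_iter, max_iter, power):
        # assumed poly-decay definition; the repo's own lr_poly may differ slightly
        return base_lr * ((1 - float(i_iter) / max_iter) ** power)

    class Args:  # stand-in for the parsed yaml config
        learning_rate, epochs, power = 7.5e-5, 2, 0.9

    args, len_loader = Args(), 100  # pretend the dataloader has 100 batches
    model = torch.nn.Linear(4, 2)
    optimizer = torch.optim.SGD(model.parameters(), lr=args.learning_rate)

    for epoch in range(args.epochs):
        for step in range(len_loader):
            i_iter = epoch * len_loader + step  # global step, always < epochs * len_loader
            lr = adjust_learning_rate(optimizer, i_iter, len_loader, args)
            model(torch.randn(2, 4)).sum().backward()
            optimizer.step()  # no "alpha must not be a complex number" error
            optimizer.zero_grad()
    print(lr)  # a plain float at the last step, never complex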
Hope this helps.
Zx
Hi, thanks for your great work! When I try to train a model, I get the following error:
Traceback (most recent call last):
  File "so_run.py", line 51, in <module>
    main()
  File "so_run.py", line 43, in main
    trainer.train()
  File "/home/CCM/trainer/source_only_trainer.py", line 58, in train
    self.optim.step()
  File "/home/anaconda3/envs/torch1.9/lib/python3.8/site-packages/torch/optim/optimizer.py", line 88, in wrapper
    return func(*args, **kwargs)
  File "/home/anaconda3/envs/torch1.9/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/home/anaconda3/envs/torch1.9/lib/python3.8/site-packages/torch/optim/sgd.py", line 110, in step
    F.sgd(params_with_grad,
  File "/home/anaconda3/envs/torch1.9/lib/python3.8/site-packages/torch/optim/functional.py", line 180, in sgd
    param.add_(d_p, alpha=-lr)
RuntimeError: For non-complex input tensors, argument alpha must not be a complex number.
How should I fix it? Thank you. The config I used for training is:
note: 'train'

# configs of data
model: 'deeplab'
train: True
multigpu: False
fixbn: True
fix_seed: True

# Optimizers
learning_rate: 7.5e-5
num_steps: 5000
epochs: 2
weight_decay: 0.0005
momentum: 0.9
power: 0.9
round: 6

# Logging
print_freq: 1
save_freq: 2000
tensorboard: False
neptune: False
screen: True
val: False
val_freq: 300

# Dataset
source: 'gta5'
target: 'cityscapes'
worker: 0
batch_size: 2

# Transforms
input_src: 720
input_tgt: 720
crop_src: 600
crop_tgt: 600
mirror: True
scale_min: 0.5
scale_max: 1.5
rec: False

# Model hypers
init_weight: './pretrained/DeepLab_resnet_pretrained_init-f81d91e8.pth'
restore_from: None
snapshot: './Data/snapshot/'
result: './miou_result/'
log: './log/'
plabel: './plabel'
gta5: {data_dir: '/home/data/datasets/GTA5/', data_list: './dataset/list/gta5_list.txt', input_size: [1280, 720]}
synthia: {data_dir: '/home/guangrui/data/synthia/', data_list: './dataset/list/synthia_list.txt', input_size: [1280, 760]}
cityscapes: {data_dir: '/home/data/datasets/Cityscapes', data_list: './dataset/list/cityscapes_train.txt', input_size: [1024, 512]}
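A rough sanity check of the step counts implied by this config, assuming the standard GTA5 training split of 24,966 images: whatever fixed cap the original schedule used (num_steps or a hard-coded value), epochs * len_loader easily exceeds it, which matches the diagnosis above.

    gta5_images, batch_size = 24966, 2   # assumption: standard GTA5 split, batch_size from this config
    epochs, num_steps = 2, 5000          # from this config
    len_loader = gta5_images // batch_size   # ~12,483 iterations per epoch
    total_steps = epochs * len_loader        # ~24,966 optimizer steps in the run
    print(total_steps, total_steps > num_steps)  # 24966 True: far past a fixed cap of 5000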