Closed: zeng32 closed this issue 4 years ago.
Hey, it seems that this is just the last-epoch result, and the model could have overfit by then. You can inspect the situation better in the log file.
Thanks for the reply. I find it interesting that if I set --gpu_ids 0,1,2,3, all 4 GPUs are utilized, but the IoU result is bad: 18%. With the default setting (only GPU 0), the result can be as high as 35%. Would you please help me figure out why changing gpu_ids affects the result?
Hi, have you tried a different learning rate or a smaller batch size? SGD-based optimization schemes are sensitive to hyperparameters in multi-GPU training.
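For example, one common heuristic is the linear scaling rule: multiply the base learning rate by the number of GPUs, since the effective batch size typically grows with the number of devices. A rough sketch (not something implemented in this repo, just an illustration in plain PyTorch):

import torch

base_lr = 0.001                               # single-GPU learning rate
num_gpus = max(1, torch.cuda.device_count())  # e.g. 4 on your machine
scaled_lr = base_lr * num_gpus                # 0.004 for 4 GPUs

model = torch.nn.Linear(16, 16)               # placeholder, not the actual segmentation network
optimizer = torch.optim.SGD(model.parameters(), lr=scaled_lr, momentum=0.9)

Whether linear scaling (or the opposite, lowering the rate) works better depends on the model, so treat it as a starting point rather than a rule.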
Right now my options are:
batch_size: 4
lr: 0.001
lr_decay_epochs: 25
lr_decay_iters: 5000000
lr_gamma: 0.9
lr_policy: lambda
I'll try different settings.
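For example, assuming the option names above are exposed as command-line flags with the same names (I still need to check train.py for the exact flags), a 4-GPU run with a linearly scaled learning rate might look like:

python train.py --dataroot datasets/sunrgbd --dataset sunrgbd --name sunrgbd --gpu_ids 0,1,2,3 --lr 0.004 --batch_size 4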
Thanks again!
Hi,
When I run the code successfully (with the required environment), below is the result on the SUNRGBD dataset:
epoch 400 test loss: 3.094, glob acc: 64.37, mean acc: 25.19, IoU: 18.14
It's far from the posted results: 75.4% | 46.48% | 35.69%.
So, would you please enlighten me on how I can improve the result?
BTW, below is the command I use with 4 RTX 2070 GPUs:
python train.py --dataroot datasets/sunrgbd --dataset sunrgbd --name sunrgbd --gpu_ids 0,1,2,3 --display_id 1 | tee ./run.log